Commits · 09f08930f0f6fd4a7350ac02f29124b919727198 · Abuhujair Javed / Postgres FD Implementation

22 Jul, 2019 5 commits

initdb: Change authentication defaults · 09f08930

Peter Eisentraut authored Jul 22, 2019

Change the defaults for the pg_hba.conf generated by initdb to "peer"
for local (if supported, else "md5") and "md5" for host.

(Changing from "md5" to SCRAM is left as a separate exercise.)

"peer" is currently not supported on AIX, HP-UX, and Windows.  Users
on those operating systems will now either have to provide a password
to initdb or choose a different authentication method when running
initdb.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/bec17f0a-ddb1-8b95-5e69-368d9d0a3390%40postgresql.org

09f08930

Use appendBinaryStringInfo in more places where the length is known · 1e6a7598

David Rowley authored Jul 23, 2019

When we already know the length that we're going to append, then it
makes sense to use appendBinaryStringInfo instead of
appendStringInfoString so that the append can be performed with a simple
memcpy() using a known length rather than having to first perform a
strlen() call to obtain the length.

Discussion: https://postgr.es/m/CAKJS1f8+FRAM1s5+mAa3isajeEoAaicJ=4e0WzrH3tAusbbiMQ@mail.gmail.com

1e6a7598

Make identity sequence management more robust · 19781729

Peter Eisentraut authored Jul 22, 2019

Some code could get confused when certain catalog state involving both
identity and serial sequences was present, perhaps during an attempt
to upgrade the latter to the former. Specifically, dropping the
default of a serial column maintains the ownership of the sequence by
the column, and so it would then be possible to afterwards make the
column an identity column that would now own two sequences. This
causes the code that looks up the identity sequence to error out,
making the new identity column inoperable until the ownership of the
previous sequence is released.

To fix this, make the identity sequence lookup only consider sequences
with the appropriate dependency type for an identity sequence, so it
only ever finds one (unless something else is broken). In the above
example, the old serial sequence would then be ignored. Reorganize
the various owned-sequence-lookup functions a bit to make this
clearer.
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://www.postgresql.org/message-id/flat/470c54fc8590be4de0f41b0d295fd6390d5e8a6c.camel@cybertec.at

19781729

Make better use of the new List implementation in a couple of places · efdcca55

David Rowley authored Jul 22, 2019

In nodeAppend.c and nodeMergeAppend.c there were some foreach loops which
looped over the list of subplans and only performed any work if the
subplan index was found in a Bitmapset. With the old linked list
implementation of List, this form made sense as accessing the Nth list
element was O(N). However, thanks to 1cff1b95 we now have array-based
lists, so accessing the Nth element has become O(1).

Here we make the most of the O(1) lookups and just loop over the set
members of the Bitmapset with bms_next_member(). This performs slightly
better when a small number of the list items are in the Bitmapset. Micro
benchmarks show that when the Bitmapset contains all or most of the list
items then the new code is ever so slightly slower. In practice, the cost
is so small that it's drowned out by various other things such as locking
the relations belonging to each subplan, etc.

The primary goal here is to leave better code examples around which benefit
better from the new list implementation.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAKJS1f8ZcsLVgkF4wOfRyMYTcPgLFiUAOedFC+U2vK_aFZk-BA@mail.gmail.com

efdcca55

Fix inconsistencies and typos in the tree · 23bccc82

Michael Paquier authored Jul 22, 2019

This is numbered take 7, and addresses a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com

23bccc82

21 Jul, 2019 4 commits

Adjust overly strict Assert · e1a0f6a9

David Rowley authored Jul 22, 2019

3373c715 changed how we determine EquivalenceClasses for relations and
added an Assert to ensure all relations mentioned in each EC's ec_relids
was a RELOPT_BASEREL. However, the join removal code may remove a LEFT
JOIN and since it does not clean up EC members belonging to the removed
relations it can leave RELOPT_DEADREL rels in ec_relids.

Fix this by adjusting the Assert to allow RELOPT_DEADREL rels too.

Reported-by: sqlsmith via Andreas Seltenreich
Discussion: https://postgr.es/m/87y30r8sls.fsf@ansel.ydns.eu

e1a0f6a9

Remove no-longer-helpful reliance on fixed-size local array. · 330cafdf

Tom Lane authored Jul 21, 2019

Coverity complained about this code, apparently because it uses a local
array of size FUNC_MAX_ARGS without a guard that the input argument list
is no longer than that.  (Not sure why it complained today, since this
code's been the same for a long time; possibly it re-analyzed everything
the List API change touched?)

Rather than add a guard, though, let's just get rid of the local array
altogether.  It was only there to avoid list_nth() calls, and those are
no longer expensive.

330cafdf

Fix compilation warning of pg_basebackup with MinGW · 90317ab7

Michael Paquier authored Jul 21, 2019

Several buildfarm members have been complaining about that with gcc,
like jacana.  Weirdly enough, Visual Studio's compilers do not find this
issue.

Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20190719050830.GK1859@paquier.xyz

90317ab7

Speed up finding EquivalenceClasses for a given set of rels · 3373c715

David Rowley authored Jul 21, 2019

Previously in order to determine which ECs a relation had members in, we
had to loop over all ECs stored in PlannerInfo's eq_classes and check if
ec_relids mentioned the relation. For the most part, this was fine, as
generally, unless queries were fairly complex, the overhead of performing
the lookup would have not been that significant. However, when queries
contained large numbers of joins and ECs, the overhead to find the set of
classes matching a given set of relations could become a significant
portion of the overall planning effort.

Here we allow a much more efficient method to access the ECs which match a
given relation or set of relations. A new Bitmapset field in RelOptInfo
now exists to store the indexes into PlannerInfo's eq_classes list which
each relation is mentioned in. This allows very fast lookups to find all
ECs belonging to a single relation. When we need to lookup ECs belonging
to a given pair of relations, we can simply bitwise-AND the Bitmapsets from
each relation and use the result to perform the lookup.

We also take the opportunity to write a new implementation of
generate_join_implied_equalities which makes use of the new indexes.
generate_join_implied_equalities_for_ecs must remain as is as it can be
given a custom list of ECs, which we can't easily determine the indexes of.

This was originally intended to fix the performance penalty of looking up
foreign keys matching a join condition which was introduced by 100340e2.
However, we're speeding up much more than just that here.

Author: David Rowley, Tom Lane
Reviewed-by: Tom Lane, Tomas Vondra
Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us

3373c715

20 Jul, 2019 3 commits

Don't rely on estimates for amcheck Bloom filters. · 894af78f

Peter Geoghegan authored Jul 20, 2019

Solely relying on a relation's reltuples/relpages estimate to size the
Bloom filters used by amcheck verification makes verification less
effective when the estimates are very stale. In extreme cases,
verification options that use Bloom filters internally could be totally
ineffective, without users receiving any clear indication that certain
types of corruption might easily be missed.

To fix, use RelationGetNumberOfBlocks() instead of relpages to size the
downlink block Bloom filter. Use the same RelationGetNumberOfBlocks()
value to derive a minimum size for the heapallindexed Bloom filter,
rather than completely trusting reltuples. Verification will still be
reasonably effective when the projected/estimated number of Bloom filter
elements is at least 1/5 of the final number of elements, which is
assured by the new sizing logic.

Reported-By: Alexander Korotkov
Discussion: https://postgr.es/m/CAH2-Wzk0ke2J42KrNYBKu0Xovjy-sU5ub7PWjgpbsKdAQcL4OA@mail.gmail.com
Backpatch: 11-, where downlink/heapallindexed verification were added.

894af78f

Use column collation for extended statistics · a63378a0

Tomas Vondra authored Jul 18, 2019

The current extended statistics code was a bit confused which collation
to use.  When building the statistics, the collations defined as default
for the data types were used (since commit 5e092800).  The MCV code was
however using the column collations for MCV serialization, and then
DEFAULT_COLLATION_OID when computing estimates. So overall the code was
using all three possible options, inconsistently.

This uses the column colation everywhere - this makes it consistent with
what 5e092800 did for regular stats.  We however do not track the
collations in a catalog, because we can derive them from column-level
information.  This may need to change in the future, e.g. after allowing
statistics on expressions.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12

a63378a0

Rework examine_opclause_expression to use varonleft · e38a55ba

Tomas Vondra authored Jul 19, 2019

The examine_opclause_expression function needs to return information on
which side of the operator we found the Var, but the variable was called
"isgt" which is rather misleading (it assumes the operator is either
less-than or greater-than, but it may be equality or something else).
Other places in the planner use a variable called "varonleft" for this
purpose, so just adopt the same convention here.

The code also assumed we don't care about this flag for equality, as
(Var = Const) and (Const = Var) should be the same thing. But that does
not work for cross-type operators, in which case we need to pass the
parameters to the procedure in the right order. So just use the same
code for all types of expressions.

This means we don't need to care about the selectivity estimation
function anymore, at least not in this code. We should only get the
supported cases here (thanks to statext_is_compatible_clause).

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12

e38a55ba

19 Jul, 2019 5 commits

pg_stat_statements: add missing check for pgss_enabled(). · 6f40ee4f

Jeff Davis authored Jul 19, 2019

Make pgss_post_parse_analyze() more consistent with the other hooks,
and avoid unnecessary overhead when pg_stat_statements.track=none.

Author: Raymond Martin
Reviewed-by: Fabien COELHO
Discussion: https://postgr.es/m/BN8PR21MB1217B003C4F79DE230AA36B9B1580%40BN8PR21MB1217.namprd21.prod.outlook.com

6f40ee4f

Silence compiler warning, hopefully. · 42146686

Tom Lane authored Jul 19, 2019

Absorb commit e5e04c962a5d12eebbf867ca25905b3ccc34cbe0 from upstream
IANA code, in hopes of silencing warnings from MSVC about negating
a bool value.

Discussion: https://postgr.es/m/20190719035347.GJ1859@paquier.xyz

42146686

Doc: clarify when table rewrites happen with column addition and DEFAULT · 1300fa66

Michael Paquier authored Jul 19, 2019

16828d5c has improved ALTER TABLE so as a column addition does not
require a rewrite for a non-NULL default with constant expressions, but
one spot in the documentation did not get updated consistently.
The documentation also now clarifies the fact that this does not apply
if the expression is volatile, where a table rewrite is still required.

Reported-by: Daniel Westermann
Author: Ian Barwick
Reviewed-by: Michael Paquier, Daniel Westermann
Discussion: https://postgr.es/m/DB6PR0902MB2184C7D5645CF15D75EB7957D2CF0@DB6PR0902MB2184.eurprd09.prod.outlook.com
Backpatch-through: 11

1300fa66

Refactor parallelization processing code in src/bin/scripts/ · 5f384037

Michael Paquier authored Jul 19, 2019

The existing facility of vacuumdb to handle parallel connections into a
given database with an authentication set is moved to a common file in
src/bin/scripts/, named scripts_parallel.c.  This introduces a set of
routines to initialize, wait and terminate a set of connections,
simplifying a bit the code of vacuumdb on the way.  More routines
related to result handling and database connection are moved to
common.c.

The initial plan is to use that for reindexdb, but it could be applied
to other tools like clusterdb.

While on it, clean up a set of variables "progname" which were defined
as routine arguments for error messages.  Since most of the callers have
switched to pg_log_error() and such there is no need for this variable.

Author: Julien Rouhaud
Reviewed-by: Michael Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/CAOBaU_YrnH_Jqo46NhaJ7uRBiWWEcS40VNRQxgFbqYo9kApUsg@mail.gmail.com

5f384037

Fix error in commit . · b538c90b

Jeff Davis authored Jul 18, 2019

I was careless passing a datum directly to DATE_NOT_FINITE without
calling DatumGetDateADT() first.

Backpatch-through: 9.4

b538c90b

18 Jul, 2019 10 commits

Fix typo in mvdistinct.c · 70a33b21
Michael Paquier authored Jul 19, 2019
```
Noticed while browsing the code.
```
70a33b21

Fix daterange canonicalization for +/- infinity. · e6feef57

Jeff Davis authored Jul 18, 2019

The values 'infinity' and '-infinity' are a part of the DATE type
itself, so a bound of the date 'infinity' is not the same as an
unbounded/infinite range. However, it is still wrong to try to
canonicalize such values, because adding or subtracting one has no
effect. Fix by treating 'infinity' and '-infinity' the same as
unbounded ranges for the purposes of canonicalization (but not other
purposes).

Backpatch to all versions because it is inconsistent with the
documented behavior. Note that this could be an incompatibility for
applications relying on the behavior contrary to the documentation.

Author: Laurenz Albe
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/77f24ea19ab802bc9bc60ddbb8977ee2d646aec1.camel%40cybertec.at
Backpatch-through: 9.4

e6feef57

Fix nbtree metapage cache upgrade bug. · d004147e

Peter Geoghegan authored Jul 18, 2019

Commit 857f9c36, which taught nbtree VACUUM to avoid unnecessary
index scans, bumped the nbtree version number from 2 to 3, while adding
the ability for nbtree indexes to be upgraded on-the-fly.  Various
assertions that assumed that an nbtree index was always on version 2 had
to be changed to accept any supported version (version 2 or 3 on
Postgres 11).

However, a few assertions were missed in the initial commit, all of
which were in code paths that cache a local copy of the metapage
metadata, where the index had been expected to be on the current version
(no longer version 2) as a generic sanity check.  Rather than simply
update the assertions, follow-up commit 0a64b451 intentionally made
the metapage caching code update the per-backend cached metadata version
without changing the on-disk version at the same time.  This could even
happen when the planner needed to determine the height of a B-Tree for
costing purposes.  The assertions only fail on Postgres v12 when
upgrading from v10, because they were adjusted to use the authoritative
shared memory metapage by v12's commit dd299df8.

To fix, remove the cache-only upgrade mechanism entirely, and update the
assertions themselves to accept any supported version (go back to using
the cached version in v12).  The fix is almost a full revert of commit
0a64b451 on the v11 branch.

VACUUM only considers the authoritative metapage, and never bothers with
a locally cached version, whereas everywhere else isn't interested in
the metapage fields that were added by commit 857f9c36.  It seems
unlikely that this bug has affected any user on v11.

Reported-By: Christoph Berg
Bug: #15896
Discussion: https://postgr.es/m/15896-5b25e260fdb0b081%40postgresql.org
Backpatch: 11-, where VACUUM was taught to avoid unnecessary index scans.

d004147e

Further adjust SPITupleTable to provide a public row-count field. · bc8393cf

Tom Lane authored Jul 18, 2019

Now that commit fec0778c drew a clear line between public and private
fields in SPITupleTable, it seems pretty silly that the count of valid
tuples isn't on the public side of that line. The reason why not was
that there wasn't such a count. For reasons lost in the mists of time,
spi.c preferred to keep a count of remaining free entries in the array.
But that seems pretty pointless: it's unlike the way we handle similar
code everywhere else, and it involves extra subtractions that surely
outweigh having to do a comparison rather than test-for-zero to check
for array-full.

Hence, rearrange so that this code does the expansible array logic
the same as everywhere else, with a count of valid entries alongside
the allocated array length. And document the count as public.

I looked for core-code callers where it would make sense to start
relying on tuptable->numvals rather than the separate SPI_processed
variable. Right now there don't seem to be places where it'd be
a win to do so without more code restructuring than I care to
undertake today. In principle, though, having SPITupleTables be
fully self-contained should be helpful down the line.

Discussion: https://postgr.es/m/16852.1563395722@sss.pgh.pa.us

bc8393cf

Simplify bitmap updates in multivariate MCV code · 7d24f6a4

Tomas Vondra authored Jul 17, 2019

When evaluating clauses on a multivariate MCV list, we build a bitmap
tracking how the clauses match each item of the MCV list.  When updating
the bitmap we need to consider the current value (tracking how the item
matches preceding clauses), match for the current clause and whether the
clauses are connected by AND or OR.

Until now the logic was copied on every place updating the bitmap, which
was not quite readable.  So just move it to a separate function and call
it where needed.

Backpatch to 12, where the code was introduced. While not a bugfix, this
should make maintenance and future backpatches easier.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu

7d24f6a4

Fix handling of NULLs in MCV items and constants · e4deae73

Tomas Vondra authored Jul 15, 2019

There were two issues in how the extended statistics handled NULL values
in opclauses. Firstly, the code was oblivious to the possibility that
Const may be NULL (constisnull=true) in which case the constvalue is
undefined. We need to treat this as a mismatch, and not call the proc.

Secondly, the MCV item itself may contain NULL values too - the code
already did check that, and updated the match bitmap accordingly, but
failed to ensure we won't call the operator procedure anyway. It did
work for AND-clauses, because in that case false in the bitmap stops
evaluation of further clauses. But for OR-clauses ir was not easy to
get incorrect estimates or even trigger a crash.

This fixes both issues by extending the existing check so that it looks
at constisnull too, and making sure it skips calling the procedure.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu

e4deae73

Fix handling of opclauses in extended statistics · e8b6ae21

Tomas Vondra authored Jul 13, 2019

We expect opclauses to have exactly one Var and one Const, but the code
was checking the Const by calling is_pseudo_constant_clause() which is
incorrect - we need a proper constant.

Fixed by using plain IsA(x,Const) to check type of the node. We need to
do these checks in two places, so move it into a separate function that
can be called in both places.

Reported by Andreas Seltenreich, based on crash reported by sqlsmith.

Backpatch to v12, where this code was introduced.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12

e8b6ae21

Remove unnecessary TYPECACHE_GT_OPR lookup · a4303a07

Tomas Vondra authored Jul 17, 2019

The TYPECACHE_GT_OPR is not needed (it used to be in older version of
the MCV code), but the compiler failed to detect this as the result was
used in a fmgr_info() call, populating a FmgrInfo entry.

Backpatch to v12, where this code was introduced.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12

a4303a07

tableam: comment improvements. · 21039555

Andres Freund authored Jul 17, 2019

Author: Brad DeJong
Discussion: https://postgr.es/m/CAJnrtnxDYOQFsDfWz2iri0T_fFL2ZbbzgCOE=4yaMcszgcsf4A@mail.gmail.com
Backpatch: 12-

21039555

Simplify description of --data-checksums in documentation of initdb · 1c1602b8

Michael Paquier authored Jul 18, 2019

The documentation mentioned that data checksums cannot be changed after
initialization, which is not true as pg_checksums can do that with its
--enable option introduced in v12.  This simply removes the sentence
telling so.

Reported-by: Basil Bourque
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/15909-e9d74271f1647472@postgresql.org
Backpatch-through: 12

1c1602b8

17 Jul, 2019 7 commits

Update time zone data files to tzdata release 2019b. · 93907478
Tom Lane authored Jul 17, 2019
```
Brazil no longer observes DST.
Historical corrections for Palestine, Hong Kong, and Italy.
```
93907478

Sync our copy of the timezone library with IANA release tzcode2019b. · f285322f