Commits · 8344d72cccae699e13884a5705b91dc1c4747c03 · Abuhujair Javed / Postgres FD Implementation

20 Dec, 2020 2 commits

Fixes for pg_dump.c regarding multiranges · 8344d72c
Alexander Korotkov authored Dec 20, 2020
```
This commit fixes two wrong version number checks and one wrong check for null.
```
8344d72c

Alexander Korotkov authored Dec 20, 2020

Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.

Since v14, each range type automatically gets a corresponding multirange
datatype. There are both manual and automatic mechanisms for naming multirange
types. Once can specify multirange type name using multirange_type_name
attribute in CREATE TYPE. Otherwise, a multirange type name is generated
automatically. If the range type name contains "range" then we change that to
"multirange". Otherwise, we add "_multirange" to the end.

Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids. Altogether this format allows fetching a particular range by its index
in O(n).

Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.

Catversion is bumped.

Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov

6df7a969

19 Dec, 2020 2 commits

Remove now-useless ALWAYS_SUBDIRS entry in src/test/Makefile. · 08b01d4d

Tom Lane authored Dec 19, 2020

Commit 257836a7 added the "locale" subdirectory to SUBDIRS,
but neglected to remove it from ALWAYS_SUBDIRS.  This oversight
had no functional effect because the filter-out function would
remove it anyway.  Still, it's confusing to readers to list a
subdirectory in both places, especially because it makes the
associated comment into a partial lie.

08b01d4d

Update comment atop of ReorderBufferQueueMessage(). · 20659fd8

Amit Kapila authored Dec 19, 2020

The comments atop of this function describes behaviour in case of a
transactional WAL message only, but it accepts both transactional and
non-transactional WAL messages. Update the comments to describe
behaviour in case of non-transactional WAL message as well.

Ashutosh Bapat, rephrased by Amit Kapila
Discussion: https://postgr.es/m/CAGEoWWTTzNzHOi8bj0wfAo1siGi-YEh6wqH1oaz4DrkTJ6HbTQ@mail.gmail.com

20659fd8

18 Dec, 2020 3 commits

Add a couple of missed .gitignore entries. · 8afca702

Tom Lane authored Dec 18, 2020

Any subdirectory that's ignoring /output_iso/ should also
ignore /tmp_check_iso/, which could be left behind by a
failed pg_isolation_regress_check run.

I think these have been wrong for awhile, but it doesn't
seem important to fix in back branches.

8afca702

Avoid memcpy() with same source and destination during relmapper init. · 53d4f5fe

Tom Lane authored Dec 18, 2020

A narrow reading of the C standard says that memcpy(x,x,n) is undefined,
although it's hard to envision an implementation that would really
misbehave. However, analysis tools such as valgrind might whine about
this; accordingly, let's band-aid relmapper.c to not do it.

See also 5b630501, d3f4e8a8, ad7b48ea, and other similar fixes.
Apparently, none of those folk tried valgrinding initdb? This has been
like this for long enough that I'm surprised it hasn't been reported
before.

Back-patch, just in case anybody wants to use a back branch on a platform
that complains about this; we back-patched those earlier fixes too.

Discussion: https://postgr.es/m/161790.1608310142@sss.pgh.pa.us

53d4f5fe

pg_stat_statements: Track time at which all statistics were last reset. · 2e0fedf0

Fujii Masao authored Dec 18, 2020

This commit adds "stats_reset" column into the pg_stat_statements_info
view. This column indicates the time at which all statistics in the
pg_stat_statements view were last reset.

Per discussion, this commit also changes pg_stat_statements_info code
so that "dealloc" column is reset at the same time as "stats_reset" is reset,
i.e., whenever all pg_stat_statements entries are removed, for the sake
of consistency. Previously "dealloc" was reset only when
pg_stat_statements_reset(0, 0, 0) is called and was not reset when
pg_stat_statements_reset() with non-zero value argument discards all
entries. This was confusing.

Author: Naoki Nakamichi, Yuki Seino
Reviewed-by: Yuki Seino, Kyotaro Horiguchi, Li Japin, Fujii Masao
Discussion: https://postgr.es/m/c102cf3180d0ee73c1c5a0f7f8558322@oss.nttdata.com

2e0fedf0

17 Dec, 2020 1 commit

Revert "Get rid of the dedicated latch for signaling the startup process". · 00f690a2

Fujii Masao authored Dec 17, 2020

Revert ac22929a, as well as the followup fix 113d3591. Because it broke
the assumption that the startup process waiting for the recovery conflict
on buffer pin should be waken up only by buffer unpin or the timeout enabled
in ResolveRecoveryConflictWithBufferPin(). It caused, for example,
SIGHUP signal handler or walreceiver process to wake that startup process
up unnecessarily frequently.

Additionally, add the comments about why that dedicated latch that
the reverted patch tried to get rid of should not be removed.

Thanks to Kyotaro Horiguchi for the discussion.

Author: Fujii Masao
Discussion: https://postgr.es/m/d8c0c608-021b-3c73-fffd-3240829ee986@oss.nttdata.com

00f690a2

16 Dec, 2020 3 commits

Fix varchar_2.out to match reality in cs_CZ locale. · 88e014c1
Tom Lane authored Dec 15, 2020
```
Seems to be a copy-and-pasteo in c06d6aa4.  Per buildfarm.
```
88e014c1

doc: Fix explanation related to pg_shmem_allocations · bce641a2

Michael Paquier authored Dec 16, 2020

Offsets are shown as NULL only for anonymous allocations.

Author: Benoit Lobréau
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAPE8EZ5Lnoyqoz7aZpvQM0E8sW+hw+k6G2NULe+m4arFRrA1aA@mail.gmail.com
Backpatch-through: 13

bce641a2

doc: clarify COPY TO for partitioning/inheritance · 02c767b0

Bruce Momjian authored Dec 15, 2020

It was not clear how COPY TO behaved with partitioning/inheritance
because the paragraphs were so far apart.  Also reword to simplify.

Discussion: https://postgr.es/m/20201203211723.GR24052@telsasoft.com

Author: Justin Pryzby

Backpatch-through: 10

02c767b0

15 Dec, 2020 6 commits

Remove obsolete btrescan() comment. · 41ddc27f

Peter Geoghegan authored Dec 15, 2020

"Ordering stuff" refered to a _bt_first() call to _bt_orderkeys().
However, the _bt_orderkeys() function was renamed to
_bt_preprocess_keys() by commit fa5c8a05.

_bt_preprocess_keys() is directly referenced just after the removed
comment already, which seems sufficient.

41ddc27f

Remove useless variable stores · a18422a3

Alvaro Herrera authored Dec 15, 2020

Mistakenly introduced in 4cbe3ac3; bug repaired in 148e632c but
the stores were accidentally.

a18422a3

Error out when Gather Merge input is not sorted · 6bc27698

Tomas Vondra authored Dec 15, 2020

To build Gather Merge path, the input needs to be sufficiently sorted.
Ensuring this is the responsibility of the code constructing the paths,
but create_gather_merge_plan tried to handle unsorted paths by adding
an explicit Sort. In light of the recent issues related to Incremental
Sort, this is rather fragile. Some of the expressions may be volatile
or parallel unsafe, in which case we can't add the Sort here.

We could do more checks and add the Sort in at least some cases, but
it seems cleaner to just error out and make it clear this is a bug in
code constructing those paths.

Author: James Coleman
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAAaqYe8cK3g5CfLC4w7bs%3DhC0mSksZC%3DH5M8LSchj5e5OxpTAg%40mail.gmail.com
Discussion: https://postgr.es/m/CAJGNTeNaxpXgBVcRhJX%2B2vSbq%2BF2kJqGBcvompmpvXb7pq%2BoFA%40mail.gmail.com

6bc27698

Clean up ancient test style · c06d6aa4

Peter Eisentraut authored Dec 15, 2020

Many older tests where written in a style like

    SELECT '' AS two, i.* FROM INT2_TBL

where the first column indicated the number of expected result rows.
This has gotten increasingly out of date, as the test data fixtures
have expanded, so a lot of these were wrong and misleading.  Moreover,
this style isn't really necessary, since the psql output already shows
the number of result rows.

To clean this up, remove all those extra columns.

Discussion: https://www.postgresql.org/message-id/flat/1a25312b-2686-380d-3c67-7a69094a999f%40enterprisedb.com

c06d6aa4

Improve hash_create()'s API for some added robustness. · b3817f5f

Tom Lane authored Dec 15, 2020

Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e (i.e., mistakenly omitting HASH_BLOBS).

Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes.  This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)

Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize.  Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.

Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one.  This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not.  We might as well save a few
cycles by standardizing on "not".

Also improve the documentation for hash_create().

In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.

Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us

b3817f5f

Revert "Cannot use WL_SOCKET_WRITEABLE without WL_SOCKET_READABLE." · a58db3aa

Jeff Davis authored Dec 14, 2020

This reverts commit 3a9e64aa.

Commit 4bad60e3 fixed the root of the problem that 3a9e64aa worked
around.

This enables proper pipelining of commands after terminating
replication, eliminating an undocumented limitation.

Discussion: https://postgr.es/m/3d57bc29-4459-578b-79cb-7641baf53c57%40iki.fi
Backpatch-through: 9.5

a58db3aa

14 Dec, 2020 2 commits

Improve some code around cryptohash functions · 9b584953

Michael Paquier authored Dec 14, 2020

This adjusts some code related to recent changes for cryptohash
functions:
- Add a variable in md5.h to track down the size of a computed result,
moved from pgcrypto.  Note that pg_md5_hash() assumed a result of this
size already.
- Call explicit_bzero() on the hashed data when freeing the context for
fallback implementations.  For MD5, particularly, it would be annoying
to leave some non-zeroed data around.
- Clean up some code related to recent changes of uuid-ossp.  .gitignore
still included md5.c and a comment was incorrect.

Discussion: https://postgr.es/m/X9HXKTgrvJvYO7Oh@paquier.xyz

9b584953

Add some checkpoint/restartpoint status to ps display · df9274ad

Michael Paquier authored Dec 14, 2020

This is done for end-of-recovery and shutdown checkpoints/restartpoints
(end-of-recovery restartpoints don't exist) rather than all types of
checkpoints, in cases where it may not be possible to rely on
pg_stat_activity to get a status from the startup or checkpointer
processes.

For example, at the end of a crash recovery, this is useful to know if a
checkpoint is running in the startup process, while previously the ps
display may only show some information about "recovering" something,
that can be confusing while a checkpoint runs.

Author: Justin Pryzby
Reviewed-by: Nathan Bossart, Kirk Jamison, Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/20200818225238.GP17022@telsasoft.com

df9274ad

13 Dec, 2020 2 commits

Use HASH_BLOBS for xidhash. · a1b8aa1e

Noah Misch authored Dec 12, 2020

This caused BufFile errors on buildfarm member sungazer, and SIGSEGV was
possible.  Conditions for reaching those symptoms were more frequent on
big-endian systems.

Discussion: https://postgr.es/m/20201129214441.GA691200@rfd.leadboat.com

a1b8aa1e

Correct behavior descriptions in comments, and correct a test name. · 73aae452
Noah Misch authored Dec 12, 2020

73aae452

12 Dec, 2020 2 commits
- initdb: complete getopt_long alphabetization · d6abfdf8
  Bruce Momjian authored Dec 12, 2020
```
Backpatch-through: 9.5
```
  d6abfdf8
- initdb: properly alphabetize getopt_long options in C string · 39f3a9d2
  Bruce Momjian authored Dec 12, 2020
```
Backpatch-through: 9.5
```
  39f3a9d2
11 Dec, 2020 4 commits

Allow subscripting of hstore values. · 0ec5f7e7

Tom Lane authored Dec 11, 2020

This is basically a finger exercise to prove that it's possible for
an extension module to add subscripting ability. Subscripted fetch
from an hstore is not different from the existing "hstore -> text"
operator. Subscripted update does seem to be a little easier to
use than the traditional update method using hstore concatenation,
but it's not a fundamentally new ability.

However, there may be some value in the code as sample code, since
it shows what's basically the minimum-complexity way to implement
subscripting when one needn't consider nested container objects.

Discussion: https://postgr.es/m/3724341.1607551174@sss.pgh.pa.us

0ec5f7e7

Allow ALTER TYPE to update an existing type's typsubscript value. · 8c15a297

Tom Lane authored Dec 11, 2020

This is essential if we'd like to allow existing extension data types
to support subscripting in future, since dropping and recreating the
type isn't a practical thing for an extension upgrade script, and
direct manipulation of pg_type isn't a great answer either.

There was some discussion about also allowing alteration of typelem,
but it's less clear whether that's a good idea or not, so for now
I forebore.

Discussion: https://postgr.es/m/3724341.1607551174@sss.pgh.pa.us

8c15a297

Provide an error cursor for "can't subscript" error messages. · 653aa603

Tom Lane authored Dec 11, 2020

Commit c7aba7c1 didn't add this, but after more fooling with the
feature I feel that it'd be useful. To make this possible, refactor
getSubscriptingRoutines() so that the caller is responsible for
throwing any error. (In clauses.c, I just chose to make the
most conservative assumption rather than throwing an error. We don't
expect failures there anyway really, so the code space for an error
message would be a poor investment.)

653aa603

pg_dump: Don't use enums for defining bit mask values · d2a2808e

Peter Eisentraut authored Dec 11, 2020

This usage would mean that values of the enum type are potentially not
one of the enum values. Use macros instead, like everywhere else.

Discussion: https://www.postgresql.org/message-id/14dde730-1d34-260e-fa9d-7664df2d6313@enterprisedb.com

d2a2808e

10 Dec, 2020 2 commits

Fix compilation of uuid-ossp · 525e60b7

Michael Paquier authored Dec 10, 2020

This module had a dependency on pgcrypto's md5.c that got removed by
b67b57a9.  Instead of the code from pgcrypto, this code can just use the
new cryptohash routines for MD5 as a drop-in replacement, so let's just
do this switch.  This has also the merit to simplify a bit the
compilation of uuid-ossp.

This requires --with-uuid to be reproduced, and I have used e2fs as a
way to reproduce the failure, then test this commit.

Per reports from buildfarm members longfin, florican and sifaka.

Discussion: https://postgr.es/m/X9GToVd3QmWeNvj8@paquier.xyz

525e60b7

Refactor MD5 implementations according to new cryptohash infrastructure · b67b57a9

Michael Paquier authored Dec 10, 2020

This commit heavily reorganizes the MD5 implementations that exist in
the tree in various aspects.

First, MD5 is added to the list of options available in cryptohash.c and
cryptohash_openssl.c.  This means that if building with OpenSSL, EVP is
used for MD5 instead of the fallback implementation that Postgres had
for ages.  With the recent refactoring work for cryptohash functions,
this change is straight-forward.  If not building with OpenSSL, a
fallback implementation internal to src/common/ is used.

Second, this reduces the number of MD5 implementations present in the
tree from two to one, by moving the KAME implementation from pgcrypto to
src/common/, and by removing the implementation that existed in
src/common/.  KAME was already structured with an init/update/final set
of routines by pgcrypto (see original pgcrypto/md5.h) for compatibility
with OpenSSL, so moving it to src/common/ has proved to be a
straight-forward move, requiring no actual manipulation of the internals
of each routine.  Some benchmarking has not shown any performance gap
between both implementations.

Similarly to the fallback implementation used for SHA2, the fallback
implementation of MD5 is moved to src/common/md5.c with an internal
header called md5_int.h for the init, update and final routines.  This
gets then consumed by cryptohash.c.

The original routines used for MD5-hashed passwords are moved to a
separate file called md5_common.c, also in src/common/, aimed at being
shared between all MD5 implementations as utility routines to keep
compatibility with any code relying on them.

Like the SHA2 changes, this commit had its round of tests on both Linux
and Windows, across all versions of OpenSSL supported on HEAD, with and
even without OpenSSL.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20201106073434.GA4961@paquier.xyz

b67b57a9

09 Dec, 2020 4 commits

Support subscripting of arbitrary types, not only arrays. · c7aba7c1

Tom Lane authored Dec 09, 2020

This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means. Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers. (This patch provides no such new
features, though; it only lays the foundation for them.)

To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler. On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines. (Thus, essentially
no new run-time overhead should be caused by this patch. Indeed,
there is room to remove some overhead by supplying specialized
execution routines. This patch does a little bit in that line,
but more could be done.)

Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.

One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.

This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.

Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule

Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com

c7aba7c1

Change get_constraint_index() to use pg_constraint.conindid · 8b069ef5

Peter Eisentraut authored Dec 09, 2020

It was still using a scan of pg_depend instead of using the conindid
column that has been added since.

Since it is now just a catalog lookup wrapper and not related to
pg_depend, move from pg_depend.c to lsyscache.c.
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com

8b069ef5

Simplify code for getting a unicode codepoint's canonical class. · 16c302f5

Michael Paquier authored Dec 09, 2020

Three places of unicode_norm.c use a similar logic for getting the
combining class from a codepoint.  Commit 2991ac5f has added the function
get_canonical_class() for this purpose, but it was only called by the
backend.  This commit refactors the code to use this function in all
the places where the combining class is retrieved from a given
codepoint.

Author: John Naylor
Discussion: https://postgr.es/m/CAFBsxsHUV7s7YrOm6hFz-Jq8Sc7K_yxTkfNZxsDV-DuM-k-gwg@mail.gmail.com

16c302f5

jit: Reference function pointer types via llvmjit_types.c. · df99ddc7

Andres Freund authored Dec 07, 2020

It is error prone (see 5da871bf) and verbose to manually create function
types. Add a helper that can reference a function pointer type via
llvmjit_types.c and and convert existing instances of manual creation.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20201207212142.wz5tnbk2jsaqzogb@alap3.anarazel.de

df99ddc7

08 Dec, 2020 7 commits

Teach contain_leaked_vars that assignment SubscriptingRefs are leaky. · 62ee7033

Tom Lane authored Dec 08, 2020

array_get_element and array_get_slice qualify as leakproof, since
they will silently return NULL for bogus subscripts. But
array_set_element and array_set_slice throw errors for such cases,
making them clearly not leakproof. contain_leaked_vars was evidently
written with only the former case in mind, as it gave the wrong answer
for assignment SubscriptingRefs (nee ArrayRefs).

This would be a live security bug, were it not that assignment
SubscriptingRefs can only occur in INSERT and UPDATE target lists,
while we only care about leakproofness for qual expressions; so the
wrong answer can't occur in practice. Still, that's a rather shaky
answer for a security-related question; and maybe in future somebody
will want to ask about leakproofness of a tlist. So it seems wise to
fix and even back-patch this correction.

(We would need some change here anyway for the upcoming
generic-subscripting patch, since extensions might make different
tradeoffs about whether to throw errors. Commit 558d77f2 attempted
to lay groundwork for that by asking check_functions_in_node whether a
SubscriptingRef contains leaky functions; but that idea fails now that
the implementation methods of a SubscriptingRef are not SQL-visible
functions that could be marked leakproof or not.)

Back-patch to 9.6. While 9.5 has the same issue, the code's a bit
different. It seems quite unlikely that we'd introduce any actual bug
in the short time 9.5 has left to live, so the work/risk/reward balance
isn't attractive for changing 9.5.

Discussion: https://postgr.es/m/3143742.1607368115@sss.pgh.pa.us

62ee7033

Remove operator_precedence_warning. · a676386b

Tom Lane authored Dec 08, 2020

This GUC was always intended as a temporary solution to help with
finding 9.4-to-9.5 migration issues. Now that all pre-9.5 branches
are out of support, and 9.5 will be too before v14 is released,
it seems like it's okay to drop it. Doing so allows removal of
several hundred lines of poorly-tested code in parse_expr.c,
which have been a fertile source of bugs when people did use this.

Discussion: https://postgr.es/m/2234320.1607117945@sss.pgh.pa.us

a676386b

Improve estimation of ANDs under ORs using extended statistics. · 4f5760d4

Dean Rasheed authored Dec 08, 2020

Formerly, extended statistics only handled clauses that were
RestrictInfos. However, the restrictinfo machinery doesn't create
sub-AND RestrictInfos for AND clauses underneath OR clauses.
Therefore teach extended statistics to handle bare AND clauses,
looking for compatible RestrictInfo clauses underneath them.

Dean Rasheed, reviewed by Tomas Vondra.

Discussion: https://postgr.es/m/CAEZATCW=J65GUFm50RcPv-iASnS2mTXQbr=CfBvWRVhFLJ_fWA@mail.gmail.com

4f5760d4

Improve estimation of OR clauses using multiple extended statistics. · 88b0898f

Dean Rasheed authored Dec 08, 2020

When estimating an OR clause using multiple extended statistics
objects, treat the estimates for each set of clauses for each
statistics object as independent of one another. The overlap estimates
produced for each statistics object do not apply to clauses covered by
other statistics objects.

Dean Rasheed, reviewed by Tomas Vondra.

Discussion: https://postgr.es/m/CAEZATCW=J65GUFm50RcPv-iASnS2mTXQbr=CfBvWRVhFLJ_fWA@mail.gmail.com

88b0898f

Doc: clarify that CREATE TABLE discards redundant unique constraints. · f2a69b35

Tom Lane authored Dec 08, 2020

The SQL standard says that redundant unique constraints are disallowed,
but we long ago decided that throwing an error would be too
user-unfriendly, so we just drop redundant ones.  The docs weren't very
clear about that though, as this behavior was only explained for PRIMARY
KEY vs UNIQUE, not UNIQUE vs UNIQUE.

While here, I couldn't resist doing some copy-editing and markup-fixing
on the adjacent text about INCLUDE options.

Per bug #16767 from Matthias vd Meent.

Discussion: https://postgr.es/m/16767-1714a2056ca516d0@postgresql.org

f2a69b35

Doc: explain that the string types can't store \0 (ASCII NUL). · 9a264191

Tom Lane authored Dec 08, 2020

This restriction was mentioned in connection with string literals,
but it wasn't made clear that it's a general restriction not just
a syntactic limitation in query strings.

Per unsigned documentation comment.

Discussion: https://postgr.es/m/160720552914.710.16625261471128631268@wrigleys.postgresql.org

9a264191

Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum. · e2ac3fed

Fujii Masao authored Dec 08, 2020

After autovacuum collects the relations to vacuum or analyze, it rechecks
whether each relation still needs to be vacuumed or analyzed before actually
doing that. Previously this recheck could be a significant overhead
especially when there were a very large number of relations. This was
because each recheck forced the statistics to be refreshed, and the refresh
of the statistics for a very large number of relations could cause heavy
overhead. There was the report that this issue caused autovacuum workers
to have gotten “stuck” in a tight loop of table_recheck_autovac() that
rechecks whether a relation needs to be vacuumed or analyzed.

This commit speeds up the recheck by making autovacuum worker reuse
the previously-read statistics for the recheck if possible. Then if that
"stale" statistics says that a relation still needs to be vacuumed or analyzed,
autovacuum refreshes the statistics and does the recheck again.

The benchmark shows that the more relations exist and autovacuum workers
are running concurrently, the more this change reduces the autovacuum
execution time. For example, when there are 20,000 tables and 10 autovacuum
workers are running, the benchmark showed that the change improved
the performance of autovacuum more than three times. On the other hand,
even when there are only 1000 tables and only a single autovacuum worker
is running, the benchmark didn't show any big performance regression by
the change.

Firstly POC patch was proposed by Jim Nasby. As the result of discussion,
we used Tatsuhito Kasahara's version of the patch using the approach
suggested by Tom Lane.

Reported-by: Jim Nasby
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/3FC6C2F2-8A47-44C0-B997-28830B5716D0@amazon.com

e2ac3fed