Commits · ffa8444ce4828108e49d961cfa64e31078d978f0 · Abuhujair Javed / Postgres FD Implementation

29 Mar, 2019 8 commits

Andres Freund authored Mar 29, 2019

Author: Haribabu Kommi
Discussion: CAJrrPGeeYOqP3hkZyohDx_8dot4zvPuPMDBmhJ=iC85cTBNeYw@mail.gmail.com

ffa8444c

Reorganize Notes section in documentation of pg_checksums · a7cc5237

Michael Paquier authored Mar 29, 2019

This commit reorders the paragraphs of the Notes section in order of
importance, and clarifies better the safe uses of pg_checksums for
replication setups.

Author: Fabien Coelho
Discussion: https://postgr.es/m/alpine.DEB.2.21.1903231404280.18811@lancre

a7cc5237

doc: Refine README.links further · c0a2ff47
Peter Eisentraut authored Mar 29, 2019
```
suggested by Chapman Flack <chap@anastigmatix.net>
```
c0a2ff47

Allow existing VACUUM options to take a Boolean argument. · 41b54ba7

Robert Haas authored Mar 29, 2019

This makes VACUUM work more like EXPLAIN already does without changing
the meaning of any commands that already work. It is intended to
facilitate the addition of future VACUUM options that may take
non-Boolean parameters or that default to false.

Masahiko Sawada, reviewed by me.

Discussion: http://postgr.es/m/CA+TgmobpYrXr5sUaEe_T0boabV0DSm=utSOZzwCUNqfLEEm8Mw@mail.gmail.com
Discussion: http://postgr.es/m/CAD21AoBaFcKBAeL5_++j+Vzir2vBBcF4juW7qH8b3HsQY=Q6+w@mail.gmail.com

41b54ba7

Warn more strongly about the dangers of exclusive backup mode. · c900c152

Robert Haas authored Mar 29, 2019

Especially, warn about the hazards of mishandling the backup_label
file. Adjust a couple of server messages to be more clear about
the hazards associated with removing backup_label files, too.

David Steele and Robert Haas, reviewed by Laurenz Albe, Martín
Marqués, Peter Eisentraut, and Magnus Hagander.

Discussion: http://postgr.es/m/7d85c387-000e-16f0-e00b-50bf83c22127@pgmasters.net

c900c152

Fix incorrect code in new REINDEX CONCURRENTLY code · bb76134b

Peter Eisentraut authored Mar 29, 2019

The previous code was adding pointers to transient variables to a
list, but by the time the list was read, the variable might be gone,
depending on the compiler.  Fix it by making copies in the proper
memory context.

bb76134b

REINDEX CONCURRENTLY · 5dc92b84

Peter Eisentraut authored Mar 29, 2019

This adds the CONCURRENTLY option to the REINDEX command. A REINDEX
CONCURRENTLY on a specific index creates a new index (like CREATE
INDEX CONCURRENTLY), then renames the old index away and the new index
in place and adjusts the dependencies, and then drops the old
index (like DROP INDEX CONCURRENTLY). The REINDEX command also has
the capability to run its other variants (TABLE, DATABASE) with the
CONCURRENTLY option (but not SYSTEM).

The reindexdb command gets the --concurrently option.

Author: Michael Paquier, Andreas Karlsson, Peter Eisentraut
Reviewed-by: Andres Freund, Fujii Masao, Jim Nasby, Sergei Kornilov
Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee

5dc92b84

tableam: relation creation, VACUUM FULL/CLUSTER, SET TABLESPACE. · d25f5191

Andres Freund authored Mar 28, 2019

This moves the responsibility for:
- creating the storage necessary for a relation, including creating a
  new relfilenode for a relation with existing storage
- non-transactional truncation of a relation
- VACUUM FULL / CLUSTER's rewrite of a table
below tableam.

This is fairly straight forward, with a bit of complexity smattered in
to move the computation of xid / multixid horizons below the AM, as
they don't make sense for every table AM.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de

d25f5191

28 Mar, 2019 7 commits

Fix typo. · 7e69323b
Thomas Munro authored Mar 29, 2019
```
Author: Masahiko Sawada
```
7e69323b
Fix a few comment copy & pastos. · 46bcd2af
Andres Freund authored Mar 28, 2019

46bcd2af

Fix deserialization of pg_mcv_list values · 62bf0fb3

Tomas Vondra authored Mar 28, 2019

There were multiple issues in deserialization of pg_mcv_list values.

Firstly, the data is loaded from syscache, but the deserialization was
performed after ReleaseSysCache(), at which point the data might have
already disappeared. Fixed by moving the calls in statext_mcv_load,
and using the same NULL-handling code as existing stats.

Secondly, the deserialized representation used pointers into the
serialized representation. But that is also unsafe, because the data
may disappear at any time. Fixed by reworking and simplifying the
deserialization code to always copy all the data.

And thirdly, when deserializing values for types passed by value, the
code simply did memcpy(d,s,typlen) which however does not work on
bigendian machines. Fixed by using fetch_att/store_att_byval.

62bf0fb3

doc: Fix typo · f3afbbda
Peter Eisentraut authored Mar 28, 2019

f3afbbda

Use FullTransactionId for the transaction stack. · ad308058

Thomas Munro authored Mar 28, 2019

Provide GetTopFullTransactionId() and GetCurrentFullTransactionId().
The intended users of these interfaces are access methods that use
xids for visibility checks but don't want to have to go back and
"freeze" existing references some time later before the 32 bit xid
counter wraps around.

Use a new struct to serialize the transaction state for parallel
query, because FullTransactionId doesn't fit into the previous
serialization scheme very well.

Author: Thomas Munro
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com

ad308058

Add basic infrastructure for 64 bit transaction IDs. · 2fc7af5e

Thomas Munro authored Mar 28, 2019

Instead of inferring epoch progress from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation.  This fixes an unlikely bug where the epoch is reported
incorrectly if the range of active xids wraps around more than once
between checkpoints.

The only user-visible effect of this commit is to correct the epoch
used by txid_current() and txid_status(), also visible with
pg_controldata, in those rare circumstances.  It also creates some
basic infrastructure so that later patches can use 64 bit
transaction IDs in more places.

The new type is a struct that we pass by value, as a form of strong
typedef.  This prevents the sort of accidental confusion between
TransactionId and FullTransactionId that would be possible if we
were to use a plain old uint64.

Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com

2fc7af5e

tableam: Support for an index build's initial table scan(s). · 2a96909a

Andres Freund authored Mar 27, 2019

To support building indexes over tables of different AMs, the scans to
do so need to be routed through the table AM.  While moving a fair
amount of code, nearly all the changes are just moving code to below a
callback.

Currently the range based interface wouldn't make much sense for non
block based table AMs. But that seems aceptable for now.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de

2a96909a

27 Mar, 2019 14 commits

Fix vpath build · 12bb35fc
Peter Eisentraut authored Mar 27, 2019
```
Skip doc/src/sgml/images/Makefile since the directory is not created.
```
12bb35fc

doc: Add some images · ea55aec0

Peter Eisentraut authored Mar 27, 2019

Add infrastructure for having images in the documentation, in SVG
format.  Add two images to start with.  See the included README file
for instructions.

Author: Jürgen Purtz <juergen@purtz.de>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/aaa54502-05c0-4ea5-9af8-770411a6bf4b@purtz.de

ea55aec0

doc: Move htmlhelp output to subdirectory · 477422c9

Peter Eisentraut authored Mar 27, 2019

This makes it behave more like the html output.  That will make some
subsequent changes across all output formats easier.

477422c9

Use Pandoc also for plain-text documentation output · 2488ea7a

Peter Eisentraut authored Mar 27, 2019

The makefile rule for the (rarely used) plain-text output postgres.txt
was still written to use lynx, but in
96b8b8b6, where the INSTALL file was
switched to pandoc, the rest of the makefile support for lynx was
removed, so this was broken.  Rewrite the rule to also use pandoc for
postgres.txt.

2488ea7a

Minor improvements for the multivariate MCV lists · a63b29a1

Tomas Vondra authored Mar 27, 2019

The MCV build should always call get_mincount_for_mcv_list(), as the
there is no other logic to decide whether the MCV list represents all
the data. So just remove the (ngroups > nitems) condition.

Also, when building MCV lists, the number of items was limited by the
statistics target (i.e. up to 10000). But when deserializing the MCV
list, a different value (8192) was used to check the input, causing
an error. Simply ensure that the same value is used in both places.

This should have been included in 7300a699, but I forgot to include it
in that commit.

a63b29a1

Add support for multivariate MCV lists · 7300a699

Tomas Vondra authored Mar 27, 2019

Introduce a third extended statistic type, supported by the CREATE
STATISTICS command - MCV lists, a generalization of the statistic
already built and used for individual columns.

Compared to the already supported types (n-distinct coefficients and
functional dependencies), MCV lists are more complex, include column
values and allow estimation of much wider range of common clauses
(equality and inequality conditions, IS NULL, IS NOT NULL etc.).
Similarly to the other types, a new pseudo-type (pg_mcv_list) is used.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera
Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com

7300a699

Avoid passing query tlist around separately from root->processed_tlist. · 333ed246

Tom Lane authored Mar 27, 2019

In the dim past, the planner kept the fully-processed version of the query
targetlist (the result of preprocess_targetlist) in grouping_planner's
local variable "tlist", and only grudgingly passed it to individual other
routines as needed. Later we discovered a need to still have it available
after grouping_planner finishes, and invented the root->processed_tlist
field for that purpose, but it wasn't used internally to grouping_planner;
the tlist was still being passed around separately in the same places as
before.

Now comes a proposed patch to allow appendrel expansion to add entries
to the processed tlist, well after preprocess_targetlist has finished
its work. To avoid having to pass around the tlist explicitly, it's
proposed to allow appendrel expansion to modify root->processed_tlist.
That makes aliasing the tlist with assorted parameters and local
variables really scary. It would accidentally work as long as the
tlist is initially nonempty, because then the List header won't move
around, but it's not exactly hard to think of ways for that to break.
Aliased values are poor programming practice anyway.

Hence, get rid of local variables and parameters that can be identified
with root->processed_tlist, in favor of just using that field directly.
And adjust comments to match. (Some of the new comments speak as though
it's already possible for appendrel expansion to modify the tlist; that's
not true yet, but will happen in a later patch.)

Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp

333ed246

pgbench: doExecuteCommand -> executeMetaCommand · 9938d116

Alvaro Herrera authored Mar 27, 2019

The new function is only in charge of meta commands, not SQL commands.
This change makes the code a little clearer: now all the state changes
are effected by advanceConnectionState.  It also removes one indent
level, which makes the diff look bulkier than it really is.

Author: Fabien Coelho
Reviewed-by: Kirk Jamison
Discussion: https://postgr.es/m/alpine.DEB.2.21.1811240904500.12627@lancre

9938d116

Suppress uninitialized-variable warning. · a51cc7e9

Tom Lane authored Mar 27, 2019

Apparently Andres' compiler is smart enough to see that hpage
must be initialized before use ... but mine isn't.

a51cc7e9

Improve error handling of column references in expression transformation · ecfed4a1

Michael Paquier authored Mar 27, 2019

Column references are not allowed in default expressions and partition
bound expressions, and are restricted as such once the transformation of
their expressions is done.  However, trying to use more complex column
references can lead to confusing error messages.  For example, trying to
use a two-field column reference name for default expressions and
partition bounds leads to "missing FROM-clause entry for table", which
makes no sense in their respective context.

In order to make the errors generated more useful, this commit adds more
verbose messages when transforming column references depending on the
context.  This has a little consequence though: for example an
expression using an aggregate with a column reference as argument would
cause an error to be generated for the column reference, while the
aggregate was the problem reported before this commit because column
references get transformed first.

The confusion exists for default expressions for a long time, and the
problem is new as of v12 for partition bounds.  Still per the lack of
complaints on the matter no backpatch is done.

The patch has been written by Amit Langote and me, and Tom Lane has
provided the improvement of the documentation for default expressions on
the CREATE TABLE page.

Author: Amit Langote, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20190326020853.GM2558@paquier.xyz

ecfed4a1

Fix off-by-one error in txid_status(). · d2fd7f74

Thomas Munro authored Mar 27, 2019

The transaction ID returned by GetNextXidAndEpoch() is in the future,
so we can't attempt to access its status or we might try to read a
CLOG page that doesn't exist.  The > vs >= confusion probably stemmed
from the choice of a variable name containing the word "last" instead
of "next", so fix that too.

Back-patch to 10 where the function arrived.

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKG%2Buua_BV5cyfsioKVN2d61Lukg28ECsWTXKvh%3DBtN2DPA%40mail.gmail.com

d2fd7f74

Switch some palloc/memset calls to palloc0 · 1983af8e

Michael Paquier authored Mar 27, 2019

Some code paths have been doing some allocations followed by an
immediate memset() to initialize the allocated area with zeros, this is
a bit overkill as there are already interfaces to do both things in one
call.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se

1983af8e

Switch function current_schema[s]() to be parallel-unsafe · 5bde1651

Michael Paquier authored Mar 27, 2019

When invoked for the first time in a session, current_schema() and
current_schemas() can finish by creating a temporary schema.  Currently
those functions are parallel-safe, however if for a reason or another
they get launched across multiple parallel workers, they would fail when
attempting to create a temporary schema as temporary contexts are not
supported in this case.

The original issue has been spotted by buildfarm members crake and
lapwing, after commit c5660e0a has introduced the first regression tests
based on current_schema() in the tree.  After that, 396676b0 has
introduced a workaround to avoid parallel plans but that was not
completely right either.

Catversion is bumped.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20190118024618.GF1883@paquier.xyz

5bde1651

Track unowned relations in doubly-linked list · 6ca015f9

Tomas Vondra authored Mar 27, 2019

Relations dropped in a single transaction are tracked in a list of
unowned relations.  With large number of dropped relations this resulted
in poor performance at the end of a transaction, when the relations are
removed from the singly linked list one by one.

Commit b4166911 attempted to address this issue (particularly when it
happens during recovery) by removing the relations in a reverse order,
resulting in O(1) lookups in the list of unowned relations.  This did
not work reliably, though, and it was possible to trigger the O(N^2)
behavior in various ways.

Instead of trying to remove the relations in a specific order with
respect to the linked list, which seems rather fragile, switch to a
regular doubly linked.  That allows us to remove relations cheaply no
matter where in the list they are.

As b4166911 was a bugfix, backpatched to all supported versions, do the
same thing here.

Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com
Backpatch-through: 9.4

6ca015f9

26 Mar, 2019 11 commits

Compute XID horizon for page level index vacuum on primary. · 558a9165

Andres Freund authored Mar 26, 2019

Previously the xid horizon was only computed during WAL replay. That
had two major problems:
1) It relied on knowing what the table pointed to looks like. That was
   easy enough before the introducing of tableam (we knew it had to be
   heap, although some trickery around logging the heap relfilenodes
   was required). But to properly handle table AMs we need
   per-database catalog access to look up the AM handler, which
   recovery doesn't allow.
2) Not knowing the xid horizon also makes it hard to support logical
   decoding on standbys. When on a catalog table, we need to be able
   to conflict with slots that have an xid horizon that's too old. But
   computing the horizon by visiting the heap only works once
   consistency is reached, but we always need to be able to detect
   conflicts.

There's also a secondary problem, in that the current method performs
redundant work on every standby. But that's counterbalanced by
potentially computing the value when not necessary (either because
there's no standby, or because there's no connected backends).

Solve 1) and 2) by moving computation of the xid horizon to the
primary and by involving tableam in the computation of the horizon.

To address the potentially increased overhead, increase the efficiency
of the xid horizon computation for heap by sorting the tids, and
eliminating redundant buffer accesses. When prefetching is available,
additionally perform prefetching of buffers.  As this is more of a
maintenance task, rather than something routinely done in every read
only query, we add an arbitrary 10 to the effective concurrency -
thereby using IO concurrency, when not globally enabled.  That's
possibly not the perfect formula, but seems good enough for now.

Bumps WAL format, as latestRemovedXid is now part of the records, and
the heap's relfilenode isn't anymore.

Author: Andres Freund, Amit Khandekar, Robert Haas
Reviewed-By: Robert Haas
Discussion:
    https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
    https://postgr.es/m/20181214014235.dal5ogljs3bmlq44@alap3.anarazel.de
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de

558a9165

Fix partitioned index creation bug with dropped columns · 126d6312

Alvaro Herrera authored Mar 26, 2019

ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the
index is defined contains more dropped columns than its partition, with
this message:
  ERROR:  incorrect attribute map
The cause was that one caller of CompareIndexInfo was passing the number
of attributes of the partition rather than the parent, which confused
the length check.  Repair.

This can cause pg_upgrade to fail when used on such a database.  Leave
some more objects around after regression tests, so that the case is
detected by pg_upgrade test suite.

Remove some spurious empty lines noticed while looking for other cases
of the same problem.

Discussion: https://postgr.es/m/20190326213924.GA2322@alvherre.pgsql

126d6312

Build "other rels" of appendrel baserels in a separate step. · 53bcf5e3

Tom Lane authored Mar 26, 2019

Up to now, otherrel RelOptInfos were built at the same time as baserel
RelOptInfos, thanks to recursion in build_simple_rel().  However,
nothing in query_planner's preprocessing cares at all about otherrels,
only baserels, so we don't really need to build them until just before
we enter make_one_rel.  This has two benefits:

* create_lateral_join_info did a lot of extra work to propagate
lateral-reference information from parents to the correct children.
But if we delay creation of the children till after that, it's
trivial (and much harder to break, too).

* Since we have all the restriction quals correctly assigned to
parent appendrels by this point, it'll be possible to do plan-time
pruning and never make child RelOptInfos at all for partitions that
can be pruned away.  That's not done here, but will be later on.

Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen,
Yoshikazu Imai, and David Rowley

Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp

53bcf5e3

Add ORDER BY to more ICU regression test cases. · 8994cc6f
Tom Lane authored Mar 26, 2019
```
Commit c77e1220 didn't fully fix the problem.  Per buildfarm
and local testing.
```
8994cc6f

Fix oversight in data-type change for autovacuum_vacuum_cost_delay. · 7c366ac9

Tom Lane authored Mar 26, 2019

Commit caf626b2 missed that the relevant reloptions entry needs
to be moved from the intRelOpts[] array to realRelOpts[].
Somewhat surprisingly, it seems to work anyway, perhaps because
the desired default and limit values are all integers.  We ought
to have either a simpler data structure or better cross-checking
here, but that's for another patch.

Nikolay Shaplov

Discussion: https://postgr.es/m/4861742.12LTaSB3sv@x200m

7c366ac9

psql: Schema-qualify typecast in one \d query · 1d21ba8a
Alvaro Herrera authored Mar 26, 2019
```
Bug introduced in my commit bc87f22e
```
1d21ba8a

Get rid of duplicate child RTE for a partitioned table. · e8d5dd6b

Tom Lane authored Mar 26, 2019

We've been creating duplicate RTEs for partitioned tables just
because we do so for regular inheritance parent tables.  But unlike
regular-inheritance parents which are themselves regular tables
and thus need to be scanned, partitioned tables don't need the
extra RTE.

This makes the conditions for building a child RTE the same as those
for building an AppendRelInfo, allowing minor simplification in
expand_single_inheritance_child.  Since the planner's actual processing
is driven off the AppendRelInfo list, nothing much changes beyond that,
we just have one fewer useless RTE entry.

Amit Langote, reviewed and hacked a bit by me

Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp

e8d5dd6b

Improve psql's \d display of foreign key constraints · 1af25ca0

Alvaro Herrera authored Mar 26, 2019

When used on a partition containing foreign keys coming from one of its
ancestors, \d would (rather unhelpfully) print the details about the
pg_constraint row in the partition. This becomes a bit frustrating when
the user tries things like dropping the FK in the partition; instead,
show the details for the foreign key on the table where it is defined.

Also, when a table is referenced by a foreign key on a partitioned
table, we would show multiple "Referenced by" lines, one for each
partition, which gets unwieldy pretty fast. Modify that so that it
shows only one line for the ancestor partitioned table where the FK is
defined.

Discussion: https://postgr.es/m/20181204143834.ym6euxxxi5aeqdpn@alvherre.pgsql
Reviewed-by: Tom Lane, Amit Langote, Peter Eisentraut

1af25ca0

Fix typo · 05295e36
Magnus Hagander authored Mar 26, 2019
```
Author: Daniel Gustafsson <daniel@yesql.se>
```
05295e36

Fix misplaced const · c8c885b7

Peter Eisentraut authored Mar 26, 2019

These instances were apparently trying to carry the const qualifier
from the arguments through the complex casts, but for that the const
qualifier was misplaced.

c8c885b7

Remove heap_hot_search(). · 2ac1b2b1

Andres Freund authored Mar 25, 2019

After 71bdc99d, "tableam: Add helper for indexes to check if a
corresponding table tuples exist." there's no in-core user left. As
there's unlikely to be an external user, and such an external user
could easily be adjusted to use table_index_fetch_tuple_check(),
remove heap_hot_search().

Per complaint from Peter Geoghegan

Author: Andres Freund
Discussion: https://postgr.es/m/CAH2-Wzn0Oq4ftJrTqRAsWy2WGjv0QrJcwoZ+yqWsF_Z5vjUBFw@mail.gmail.com

2ac1b2b1