Commits · 0c06534bd63b0bd23d7744a53f3b490a2adeea8a · Abuhujair Javed / Postgres FD Implementation

05 Jul, 2018 6 commits

doc: Reword old inheritance partitioning documentation · 0c06534b

Peter Eisentraut authored Jul 05, 2018

Prefer to use phrases like "child" instead of "partition" when
describing the legacy inheritance-based partitioning.  The word
"partition" now has a fixed meaning for the built-in partitioning, so
keeping it out of the documentation of the old method makes things
clearer.

Author: Justin Pryzby <pryzby@telsasoft.com>

0c06534b

doc: Fix typos · 17411e0f

Peter Eisentraut authored Jul 05, 2018

Author: Justin Pryzby <pryzby@telsasoft.com>

17411e0f

Reduce cost of test_decoding's new oldest_xmin test · 8d1c1ca7

Alvaro Herrera authored Jul 05, 2018

Change a whole-database VACUUM into doing just pg_attribute, which is
the portion that verifies what we want it to do.  The original
formulation wastes a lot of CPU time, which leads the test to fail when
runtime exceeds isolationtester timeout when it's super-slow, such as
under CLOBBER_CACHE_ALWAYS.  Per buildfarm member friarbird.

It turns out that the previous shape of the test doesn't always detect
the condition it is supposed to detect (on unpatched reorderbuffer
code): the reason is that there is a good chance of encountering an
xl_running_xacts record (logged every 15 seconds) before the checkpoint
-- and because we advance the xmin when we receive that WAL record, and
we *don't* advance the xmin twice consecutively without receiving a
client message in between, that means the xmin is not advanced enough
for the tuple to be pruned from pg_attribute by VACUUM.  So the test
would spuriously pass.

The reason this test deficiency wasn't detected earlier is that HOT
pruning removes the tuple anyway, even if vacuum leaves it in place, so
the test correctly fails (detecting the coding mistake), but for the
wrong reason.

To fix this mess, run the s0_get_changes step twice before vacuum
instead of once: this seems to cause the xmin to be advanced reliably,
wreaking havoc with more certainty.

Author: Arseny Sher
Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad

8d1c1ca7

Fix typo · f61988d1

Peter Eisentraut authored Jul 04, 2018

f61988d1

Prevent references to invalid relation pages after fresh promotion · 3c64dcb1

Michael Paquier authored Jul 05, 2018

If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed.  This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed.  When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed.  Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.

Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me.  The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.

Backpatch down to all supported versions where the bug appears, that is,
back to 9.3, where the end-of-recovery checkpoint is no longer run by the
startup process.  The test is easily supported only down to 10, though it
has been tested on all branches.

Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com

3c64dcb1

Use context with correct lifetime in hypothetical_dense_rank_final. · 249126e7

Andres Freund authored Jul 04, 2018

The query lifetime expression context created in
hypothetical_dense_rank_final() was buggily allocated in the calling
memory context. I (Andres) broke that in bf6c614a.

Reported-By: Rajkumar Raghuwanshi
Author: Amit Langote
Discussion:  https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com
Backpatch: 11-

249126e7

04 Jul, 2018 4 commits

Check for interrupts inside the nbtree page deletion code. · 3a01f68e

Andres Freund authored Jul 04, 2018

When deleting pages the nbtree code has to walk through siblings of a
tree node. When those sibling links are corrupted that can lead to
endless loops - which are currently not interruptible.  This is
especially problematic if autovacuum is repeatedly blocked on such
indexes, as it can be hard to get out of that situation without
resorting to single user mode.

Thus add interrupt checks to appropriate places in such
loops. Unfortunately in one of the cases it's not easy to do so.

Between 9.3 and 9.4 the page deletion (and page split) code changed
significantly. Before it was significantly less robust against
interruptions. Therefore don't backpatch to 9.3.

Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-

3a01f68e
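
The idea of an interruptible sibling walk can be sketched as follows. This is only a minimal illustration, not the nbtree code: `ToyPage`, `toy_interrupt_pending`, and `toy_find_right` are hypothetical names, and the plain flag stands in for PostgreSQL's `CHECK_FOR_INTERRUPTS()` mechanism. The sketch assumes a corrupted right-link can form a cycle, in which case only the per-iteration check lets the loop end.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical page with a right-sibling link that may be corrupted. */
typedef struct ToyPage {
    int id;
    struct ToyPage *right;
} ToyPage;

/* Set asynchronously (for example by a signal handler) to request a cancel. */
volatile sig_atomic_t toy_interrupt_pending = 0;

/*
 * Walk right from 'start' looking for the page whose id is 'target'.
 * Returns the page, or NULL if the chain ends or a cancel was requested.
 * '*interrupted' is set to 1 only when the walk stopped due to a cancel.
 */
ToyPage *toy_find_right(ToyPage *start, int target, int *interrupted)
{
    *interrupted = 0;
    for (ToyPage *p = start; p != NULL; p = p->right) {
        /* Check on every step so a cyclic chain cannot spin forever. */
        if (toy_interrupt_pending) {
            *interrupted = 1;
            return NULL;
        }
        if (p->id == target)
            return p;
    }
    return NULL;
}
```

Without the check inside the loop, a chain whose last page points back to an earlier one would never terminate when the target is absent.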

Improve the performance of relation deletes during recovery. · b4166911

Fujii Masao authored Jul 05, 2018

When multiple relations are deleted in the same transaction,
the files of those relations are deleted by one call to smgrdounlinkall(),
which scans the whole of shared_buffers only once. OTOH,
previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
called for each file to delete, which scanned shared_buffers
multiple times. Obviously this could greatly increase the WAL replay
time, especially when shared_buffers was huge.

To alleviate this situation, this commit changes the recovery so that
it also calls smgrdounlinkall() only one time to delete multiple
relation files.

This is just a fix for an oversight in commit 279628a0, not a new feature.
So, per discussion on pgsql-hackers, we decided to backpatch this
to all supported versions.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com

b4166911
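
The cost difference between per-relation and batched deletion can be illustrated with a toy model. This is a rough sketch, not the storage manager's code: the buffer pool is modelled as a plain array of relation ids, and `toy_drop_one`, `toy_drop_all`, and `TOY_INVALID_REL` are hypothetical names. Each function returns how many buffer slots it visited, so the two strategies can be compared.

```c
#include <stddef.h>

#define TOY_INVALID_REL (-1)

/*
 * Invalidate the buffers of a single relation.
 * Scans the whole pool and returns the number of slots visited.
 */
size_t toy_drop_one(int *pool, size_t npool, int rel)
{
    size_t visited = 0;
    for (size_t i = 0; i < npool; i++) {
        visited++;
        if (pool[i] == rel)
            pool[i] = TOY_INVALID_REL;
    }
    return visited;
}

/*
 * Invalidate the buffers of several relations in one pass.
 * The pool is scanned once; each slot is compared against the list.
 */
size_t toy_drop_all(int *pool, size_t npool, const int *rels, size_t nrels)
{
    size_t visited = 0;
    for (size_t i = 0; i < npool; i++) {
        visited++;
        for (size_t j = 0; j < nrels; j++) {
            if (pool[i] == rels[j]) {
                pool[i] = TOY_INVALID_REL;
                break;
            }
        }
    }
    return visited;
}
```

Calling `toy_drop_one` once per relation visits `nrels * npool` slots, while a single `toy_drop_all` call visits `npool` slots and leaves the pool in the same state. The batched version still compares each slot against the list, but it walks the (potentially huge) pool only once.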

doc: Reorganize CREATE TABLE / LIKE option documentation · b46727e0

Peter Eisentraut authored Jul 04, 2018

This section once started out small but has now grown quite a bit and
needs a bit of structure.

Rewrite as list, add documentation of EXCLUDING, and improve the
documentation of INCLUDING ALL instead of just listing all the options
again.

Per report from Yugo Nagata that EXCLUDING was not documented; that part
was reviewed by Daniel Gustafsson, and most of the rewrite was by me.

b46727e0

Remove dead code for temporary relations in partition planning · fc057b2b

Michael Paquier authored Jul 04, 2018

Since recent commit 1c7c317c, temporary relations cannot be mixed with
permanent relations within the same partition tree, and the same goes
for temporary relations created by other sessions, which the planner
simply discarded. Instead be paranoid and issue an error, as those
should be blocked at definition time, at least for now.

At the same time, a test case is added to stress what has been moved
when expand_partitioned_rtentry gets called recursively but bumps into a
partitioned relation with no partitions, which should be handled the same
way as the non-inheritance case. This code may be reworked in the near
future, and covering this code path will limit surprises.

Reported-by: David Rowley
Author: David Rowley
Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com

fc057b2b

03 Jul, 2018 2 commits

Add $Test::Builder::Level to pgbench test functions · 2c059c86

Peter Eisentraut authored Jul 03, 2018

same as c4309f4a

2c059c86

Correct comment · 68370786

Peter Eisentraut authored Jul 03, 2018

68370786

02 Jul, 2018 2 commits

Add wait event for fsync of WAL segments · c55de5e5

Michael Paquier authored Jul 02, 2018

This was evidently a forgotten spot in the first implementation of
wait events for I/O added by 249cf070: what was missing is the
fsync call for WAL segments, which is a wrapper reacting to the value of
the GUC wal_sync_method.

Reported-by: Konstantin Knizhnik
Author: Konstantin Knizhnik
Reviewed-by: Craig Ringer, Michael Paquier
Discussion: https://postgr.es/m/4a243897-0ad8-f471-aa40-242591f2476e@postgrespro.ru

c55de5e5

Correct function name in comment of logical decoding code · c072e803

Michael Paquier authored Jul 02, 2018

Reported-by: Dave Cramer
Author: Euler Taveira
Discussion: https://postgr.es/m/CADK3HHKnPGJDLhjOFBY6+70Wd14iEH8c2GKw7UrOuUHp_GNFrA@mail.gmail.com

c072e803

01 Jul, 2018 6 commits

pg_standby: Remove code for .backup files · a33969ee

Peter Eisentraut authored Jul 01, 2018

These files are no longer requested on recovery (since 06f82b29), so the
code for handling them here is useless.

Author: Yugo Nagata <nagata@sraoss.co.jp>

a33969ee

Fix libpq example programs · 7bdea626

Peter Eisentraut authored Jul 01, 2018

When these programs call pg_catalog.set_config, they need to check for
PGRES_TUPLES_OK instead of PGRES_COMMAND_OK.  Fix for 5770172c.

Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>

7bdea626

Use more modern instructions for creating a new dev cycle · 56b4da8c

Andrew Dunstan authored Jul 01, 2018

56b4da8c

Add tests for inheritance trees mixing permanent and temporary relations · 9994013f

Michael Paquier authored Jul 01, 2018

While working on 1c7c317c and related things, which has clarified the
use of partitions with temporary tables, I have noticed that there could
be better coverage for inheritance trees mixing temporary and permanent
relations.  A lot of cross-checks happen in MergeAttributes() which is
not designed for this purpose, so the tests added in this commit will
make sure that any kind of future refactoring will limit the amount of
compatibility breakage.

Author: Michael Paquier
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/20180619022131.GE3314@paquier.xyz

9994013f

Use $Test::Builder::Level in TAP test functions · c4309f4a

Peter Eisentraut authored May 22, 2018

In TAP test functions, that is, those that produce test results, locally
increment $Test::Builder::Level.  This has the effect that test failures
are reported at the caller's location rather than somewhere in the test
support libraries.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>

c4309f4a

Use optimized bitmap set function for membership test in postgres_fdw · 65782346

Michael Paquier authored Jul 01, 2018

Deparsing logic in postgres_fdw for locking, FROM clause (alias) and Var
(column qualification) does not need to know the exact number of members
involved, which can be calculated with bms_num_members(), but just if
there is more than one relation involved, which is what bms_membership()
does.  The latter is more performant than the former so this shaves a
couple of cycles.

Author: Daniel Gustafsson
Reviewed-by: Ashutosh Bapat, Nathan Bossart
Discussion: https://postgr.es/m/C73594E0-2B67-4E10-BB35-CDE0E41CC384@yesql.se

65782346
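
The difference between counting members and merely classifying a set can be sketched as follows. This is a simplified stand-in, not PostgreSQL's Bitmapset implementation: `ToyBitset`, `toy_num_members`, and `toy_membership` are hypothetical names, and the set is assumed to be a fixed array of 64-bit words. The counting function must visit every word, while the classifying function can return as soon as it has seen a second member.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit set: 'nwords' 64-bit words, one bit per member. */
typedef struct {
    size_t nwords;
    const uint64_t *words;
} ToyBitset;

typedef enum { TOY_EMPTY, TOY_SINGLETON, TOY_MULTIPLE } ToyMembership;

/* Exact member count: always visits every word and every set bit. */
int toy_num_members(const ToyBitset *s)
{
    int count = 0;
    for (size_t i = 0; i < s->nwords; i++) {
        uint64_t w = s->words[i];
        while (w != 0) {
            w &= w - 1;         /* clear the lowest set bit */
            count++;
        }
    }
    return count;
}

/* Classify the set, stopping as soon as a second member is found. */
ToyMembership toy_membership(const ToyBitset *s)
{
    ToyMembership result = TOY_EMPTY;
    for (size_t i = 0; i < s->nwords; i++) {
        uint64_t w = s->words[i];
        if (w == 0)
            continue;
        /* A member was seen earlier, or this word holds more than one bit. */
        if (result != TOY_EMPTY || (w & (w - 1)) != 0)
            return TOY_MULTIPLE;
        result = TOY_SINGLETON;
    }
    return result;
}
```

A caller that only needs to know whether more than one relation is involved can test `toy_membership(&set) == TOY_MULTIPLE` instead of `toy_num_members(&set) > 1`.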

30 Jun, 2018 4 commits

Stamp HEAD as 12devel · feced138

Andrew Dunstan authored Jun 30, 2018

Let the hacking begin ...

feced138

perltidy run prior to branching · d8421390

Andrew Dunstan authored Jun 30, 2018

d8421390

pgindent run prior to branching · 1e9c8580

Andrew Dunstan authored Jun 30, 2018

1e9c8580

Update typedefs list · 2c64d200

Andrew Dunstan authored Jun 30, 2018

2c64d200

29 Jun, 2018 6 commits

Documentation spell checking and markup improvements · f7481d2c

Peter Eisentraut authored Jun 29, 2018

f7481d2c

doc: Replace non-ASCII lines in psql example output · 539f32bd

Peter Eisentraut authored Jun 29, 2018

539f32bd

psql: show cloned triggers in partitions · bc87f22e

Alvaro Herrera authored Jun 29, 2018

In a partition, row triggers that had been cloned from their parent
partitioned table would not be listed at all in psql's \d, which could
surprise users, per insistent complaint from Ashutosh Bapat (though his
aim was elsewhere). The simplest possible fix, suggested by Peter
Eisentraut, seems to be to list triggers marked as internal if they have
a row in pg_depend that points to some other trigger.

Author: Álvaro Herrera
Discussion: https://postgr.es/m/20180618165910.p26vhk7dpq65ix54@alvherre.pgsql

bc87f22e

Fix crash when ALTER TABLE recreates indexes on partitions · 41372071

Alvaro Herrera authored Jun 29, 2018

The skip_build flag was not being passed correctly when recursing to
indexes on partitions, leading to attempts to rebuild indexes when they
were not yet ready to be rebuilt.

Reported-by: Rajkumar Raghuwanshi
Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com

41372071

Replace search.cpan.org with metacpan.org · dad335b8

Michael Paquier authored Jun 29, 2018

search.cpan.org has been EOL'd, with metacpan.org being the official
replacement to which URLs now redirect.  Update links to match the new
URL. Also update links to CPAN to use https as it will redirect from
http.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3@yesql.se

dad335b8

Make capitalization of term "OpenSSL" more consistent · dad5f8a3

Michael Paquier authored Jun 29, 2018

This includes code comments and documentation. No backpatch as this is
cosmetic even if there are documentation changes which are user-facing.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/BB89928E-2BC7-489E-A5E4-6D204B3954CF@yesql.se

dad5f8a3

27 Jun, 2018 8 commits

Fix typo in comment · f5545287

Alvaro Herrera authored Jun 27, 2018

Author: Amit Langote
Discussion: https://postgr.es/m/b23dc88b-df41-ef07-22c5-12f77cf73b57@lab.ntt.co.jp

f5545287

Fix thinko in comments. · 2e61c507

Amit Kapila authored Jun 27, 2018

A slot cannot be stored in a tuple; it is the other way around.

Reported-by: Ashutosh Bapat
Author: Ashutosh Bapat
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com

2e61c507

Change pqformat.h's integer handling functions to take unsigned integers. · 42121790

Andres Freund authored Jun 26, 2018

As added in 1de09ad8, the new functions all accept signed integers as
parameters. That's not great, because it's perfectly reasonable to call
them with unsigned parameters. Unfortunately, unsigned to signed
conversion is not well defined when the value exceeds the range of the
signed type.  That's presently not a
practical issue in postgres (among other reasons because we force
gcc's hand with -fwrapv).  But it's clearly not quite right.

Thus change the signatures to accept unsigned integers instead; signed
to unsigned conversion is always well defined. Also change the
backward compat pq_sendint() - while it's deprecated it seems better
to be consistent.

Per discussion between Andrew Gierth, Michael Paquier, Alvaro Herrera
and Tom Lane.

Reported-By: Andrew Gierth
Author: Andres Freund
Discussion: https://postgr.es/m/87r2m10zm2.fsf@news-spur.riddles.org.uk

42121790
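
The reasoning about conversions can be illustrated with a small sketch. It is not the pqformat.h API: `toy_put_uint32` is a hypothetical helper that writes a 32-bit value in network byte order. Because its parameter is unsigned, a signed argument such as -1 is converted modulo 2^32, which C defines for every input, and large unsigned values pass through unchanged.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store 'v' into 'buf' in network (big-endian)
 * byte order.  Taking uint32_t means both signed and unsigned callers
 * get a well-defined conversion at the call site.
 */
void toy_put_uint32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char) (v >> 24);
    buf[1] = (unsigned char) (v >> 16);
    buf[2] = (unsigned char) (v >> 8);
    buf[3] = (unsigned char) v;
}
```

Passing `(int32_t) -1` yields the bytes `FF FF FF FF`, and passing `4000000000u` yields `EE 6B 28 00`. With a signed parameter, the second call would instead convert an out-of-range value to `int32_t`, which is implementation-defined.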

Remove duplicated return statement from llvmjit code. · 98607087

Andres Freund authored Jun 26, 2018

The duplicated return clearly doesn't make sense / isn't
reachable. Likely introduced by me (Andres), while revising the code.

Author: Rushabh Lathia
Discussion: https://postgr.es/m/CAGPqQf2raxWOcbuTP36M1rEF3=Rfo7oD29K3psdyHMeE5swBRg@mail.gmail.com

98607087

Fix whitespace · 0fcf5e0e

Peter Eisentraut authored Jun 27, 2018

0fcf5e0e

doc: Improve wording and fix whitespace · ae5ed75e

Peter Eisentraut authored Jun 27, 2018

ae5ed75e

doc: Document some nuances of logical replication of TRUNCATE · c9d6a457

Peter Eisentraut authored Jun 27, 2018

c9d6a457

Cosmetic improvements for faster column addition. · 8121ab88

Amit Kapila authored Jun 27, 2018

Changed the names of a few structure members for the sake of clarity and
removed spurious whitespace.

Reported-by: Amit Kapila
Author: Amit Kapila, based on suggestion by Andrew Dunstan
Reviewed-by: Alvaro Herrera
Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com

8121ab88

26 Jun, 2018 2 commits

Fix "base" snapshot handling in logical decoding · f49a80c4

Alvaro Herrera authored Jun 26, 2018

Two closely related bugs are fixed. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them. The problem wasn't noticed earlier because DDL that can
delete a tuple (set xmax) while another not-yet-committed
transaction is looking at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change the schema acquire an ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN. This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment). Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.

The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise. To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.

Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.

Liberally sprinkle code comments, and rewrite a few existing ones. This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.

Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad

f49a80c4

Fix upper limit for vacuum_cleanup_index_scale_factor · 4d54543e

Alexander Korotkov authored Jun 26, 2018

6ca33a88 sets the upper limit for vacuum_cleanup_index_scale_factor to
DBL_MAX. DBL_MAX appears to be platform-dependent. That causes
many buildfarm animals to fail, because we check the boundaries of
vacuum_cleanup_index_scale_factor in regression tests.

This commit changes the upper limit from DBL_MAX to a merely "large
enough" limit, which was arbitrarily selected as 1e10.

Author: Alexander Korotkov
Reported-by: Tom Lane, Darafei Praliaskouski
Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com
Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com

4d54543e