  1. 05 Jul, 2018 6 commits
    • doc: Reword old inheritance partitioning documentation · 0c06534b
      Peter Eisentraut authored
      Prefer to use phrases like "child" instead of "partition" when
      describing the legacy inheritance-based partitioning.  The word
      "partition" now has a fixed meaning for the built-in partitioning, so
      keeping it out of the documentation of the old method makes things
      clearer.
      
      Author: Justin Pryzby <pryzby@telsasoft.com>
      0c06534b
    • doc: Fix typos · 17411e0f
      Peter Eisentraut authored
      Author: Justin Pryzby <pryzby@telsasoft.com>
      17411e0f
    • Reduce cost of test_decoding's new oldest_xmin test · 8d1c1ca7
      Alvaro Herrera authored
      Change a whole-database VACUUM into doing just pg_attribute, which is
      the portion that verifies what we want it to do.  The original
      formulation wastes a lot of CPU time, which leads the test to fail when
      runtime exceeds isolationtester timeout when it's super-slow, such as
      under CLOBBER_CACHE_ALWAYS.  Per buildfarm member friarbird.
      
      It turns out that the previous shape of the test doesn't always detect
      the condition it is supposed to detect (on unpatched reorderbuffer
      code): the reason is that there is a good chance of encountering an
      xl_running_xacts record (logged every 15 seconds) before the checkpoint
      -- and because we advance the xmin when we receive that WAL record, and
      we *don't* advance the xmin twice consecutively without receiving a
      client message in between, that means the xmin is not advanced enough
      for the tuple to be pruned from pg_attribute by VACUUM.  So the test
      would spuriously pass.
      
      The reason this test deficiency wasn't detected earlier is that HOT
      pruning removes the tuple anyway, even if vacuum leaves it in place, so
      the test correctly fails (detecting the coding mistake), but for the
      wrong reason.
      
      To fix this mess, run the s0_get_changes step twice before vacuum
      instead of once: this seems to cause the xmin to be advanced reliably,
      wreaking havoc with more certainty.
      
      Author: Arseny Sher
      Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad
      8d1c1ca7
    • Fix typo · f61988d1
      Peter Eisentraut authored
      f61988d1
    • Prevent references to invalid relation pages after fresh promotion · 3c64dcb1
      Michael Paquier authored
      If a standby crashes after promotion before having completed its first
      post-recovery checkpoint, then the minimal recovery point which marks
      the LSN position where the cluster is able to reach consistency may be
      set to a position older than the first end-of-recovery checkpoint while
      all the WAL available should be replayed.  This leads to the instance
      thinking that it contains inconsistent pages, causing a PANIC and a hard
      instance crash even if all the WAL available has not been replayed for
      certain sets of records replayed.  When in crash recovery,
      minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
      which forces the recovery to replay all the WAL available, so this
      commit makes sure that the local copy of minRecoveryPoint from the
      control file is initialized properly and stays as it is while crash
      recovery is performed.  Once switching to archive recovery or if crash
      recovery finishes, then the local copy minRecoveryPoint can be safely
      updated.
      
      Pavan Deolasee has reported and diagnosed the failure in the first
      place, and the base fix idea to rely on the local copy of
      minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
      into a full-fledged patch by me.  The test included in this commit has
      been written by Álvaro Herrera and Pavan Deolasee, which I have modified
      to make it faster and more reliable with sleep phases.
      
      Backpatch down to all supported versions where the bug appears, that
      is, back to 9.3, where the end-of-recovery checkpoint stopped being run
      by the startup process.  The test is easily supported down to 10 only;
      still, it has been tested on all branches.
      
      Reported-by: Pavan Deolasee
      Diagnosed-by: Pavan Deolasee
      Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
      Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
      Herrera
      Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
      3c64dcb1
    • Use context with correct lifetime in hypothetical_dense_rank_final. · 249126e7
      Andres Freund authored
      The query lifetime expression context created in
      hypothetical_dense_rank_final() was buggily allocated in the calling
      memory context. I (Andres) broke that in bf6c614a.
      
      Reported-By: Rajkumar Raghuwanshi
      Author: Amit Langote
      Discussion:  https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com
      Backpatch: 11-
      249126e7
  2. 04 Jul, 2018 4 commits
    • Check for interrupts inside the nbtree page deletion code. · 3a01f68e
      Andres Freund authored
      When deleting pages the nbtree code has to walk through siblings of a
      tree node. When those sibling links are corrupted that can lead to
      endless loops - which are currently not interruptible.  This is
      especially problematic if autovacuum is repeatedly blocked on such
      indexes, as it can be hard to get out of that situation without
      resorting to single user mode.
      
      Thus add interrupt checks to appropriate places in such
      loops. Unfortunately in one of the cases it's not easy to do so.
      
      Between 9.3 and 9.4 the page deletion (and page split) code changed
      significantly. Before it was significantly less robust against
      interruptions. Therefore don't backpatch to 9.3.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
      Backpatch: 9.4-
      3a01f68e
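The idea of making a sibling walk interruptible can be sketched with a toy model. This is a minimal Python illustration, not the nbtree code: `Page`, `walk_right`, `check_for_interrupts`, and the `state` dictionary are all hypothetical names introduced here. The point it shows is that a loop following possibly corrupted sibling links can only be escaped if it checks for a pending cancel request on every iteration.

```python
class Interrupted(Exception):
    """Raised when a pending cancel request is noticed."""


class Page:
    def __init__(self, name):
        self.name = name
        self.right = None  # right-sibling link; None marks the rightmost page


def check_for_interrupts(state):
    # Hypothetical stand-in for an interrupt check: raise if a cancel
    # request has been flagged.
    if state["cancel_requested"]:
        raise Interrupted("canceling sibling walk due to user request")


def walk_right(start, target, state, on_step=None):
    """Follow right-sibling links from start until target is found.

    Returns the number of links followed, or None if the rightmost page
    is reached without finding target.
    """
    page = start
    steps = 0
    while page is not None:
        # Without this check, a cycle in the links would loop forever.
        check_for_interrupts(state)
        if page is target:
            return steps
        if on_step is not None:
            on_step(steps, state)  # test hook, e.g. to simulate a cancel
        page = page.right
        steps += 1
    return None
```

With intact links the walk simply terminates; with a cycle (a corrupted link pointing back to an earlier page) it only ends once a cancel request is flagged and the check raises.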
    • Improve the performance of relation deletes during recovery. · b4166911
      Fujii Masao authored
      When multiple relations are deleted in the same transaction,
      the files of those relations are deleted by one call to smgrdounlinkall(),
      which scans the whole of shared_buffers only once. OTOH,
      previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
      called for each file to delete, which scanned shared_buffers
      multiple times. This could increase the WAL replay time
      considerably, especially when shared_buffers was huge.
      
      To alleviate this situation, this commit changes recovery so that
      it also calls smgrdounlinkall() only once to delete multiple
      relation files.
      
      This is just a fix for an oversight in commit 279628a0, not a new
      feature.  So, per discussion on pgsql-hackers, we decided to backpatch
      this to all supported versions.
      
      Author: Fujii Masao
      Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
      Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
      b4166911
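The cost difference described in this commit message can be illustrated with a toy model. This is a hedged Python sketch, not PostgreSQL code: the buffer pool is modeled as a plain list in which each entry is the (hypothetical) id of the relation that owns that buffer, and the two function names are invented for the illustration. It only shows why dropping relations one at a time scans the pool once per relation, while a batched drop scans it once.

```python
def drop_buffers_one_by_one(shared_buffers, relations):
    """Model of the per-relation path: one full buffer scan per relation."""
    scans = 0
    for rel in relations:
        scans += 1  # each relation triggers its own pass over all buffers
        shared_buffers = [owner for owner in shared_buffers if owner != rel]
    return shared_buffers, scans


def drop_buffers_batched(shared_buffers, relations):
    """Model of the batched path: one buffer scan for all relations."""
    doomed = set(relations)
    remaining = [owner for owner in shared_buffers if owner not in doomed]
    return remaining, 1  # a single pass handles every dropped relation
```

Both functions leave the same buffers behind; only the number of passes over the pool differs, which is what matters when the pool is very large.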
    • doc: Reorganize CREATE TABLE / LIKE option documentation · b46727e0
      Peter Eisentraut authored
      This section once started out small but has now grown quite a bit and
      needs a bit of structure.
      
      Rewrite as list, add documentation of EXCLUDING, and improve the
      documentation of INCLUDING ALL instead of just listing all the options
      again.
      
      Per report from Yugo Nagata that EXCLUDING was not documented; that
      part was reviewed by Daniel Gustafsson.  Most of the rewrite was by me.
      b46727e0
    • Remove dead code for temporary relations in partition planning · fc057b2b
      Michael Paquier authored
      Since recent commit 1c7c317c, temporary relations cannot be mixed with
      permanent relations within the same partition tree, and the same goes
      for temporary relations created by other sessions, which the planner
      simply discarded.  Instead be paranoid and issue an error, as those
      should be blocked at definition time, at least for now.
      
      At the same time, a test case is added to stress what has been moved
      when expand_partitioned_rtentry gets called recursively but bumps into a
      partitioned relation with no partitions, which should be handled the same
      way as the non-inheritance case.  This code may be reworked in the near
      future, and covering this code path will limit surprises.
      
      Reported-by: David Rowley
      Author: David Rowley
      Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
      Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
      fc057b2b
  3. 03 Jul, 2018 2 commits
  4. 02 Jul, 2018 2 commits
  5. 01 Jul, 2018 6 commits
  6. 30 Jun, 2018 4 commits
  7. 29 Jun, 2018 6 commits
  8. 27 Jun, 2018 8 commits
  9. 26 Jun, 2018 2 commits
    • Fix "base" snapshot handling in logical decoding · f49a80c4
      Alvaro Herrera authored
      Two closely related bugs are fixed.  First, xmin of logical slots was
      advanced too early.  During xl_running_xacts processing, xmin of the
      slot was set to the oldest running xid in the record, but that's wrong:
      actually, snapshots which will be used for not-yet-replayed transactions
      might consider older txns as running too, so we need to keep xmin back
      for them.  The problem wasn't noticed earlier because DDL which allows
      a tuple to be deleted (xmax set) while another not-yet-committed
      transaction looks at it is pretty rare, if not unique: e.g. all forms of
      ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
      conflicting with any inserts. The included test case (test_decoding's
      oldest_xmin) uses ALTER of a composite type, which doesn't have such
      interlocking.
      
      To deal with this, we must be able to quickly retrieve oldest xmin
      (oldest running xid among all assigned snapshots) from ReorderBuffer. To
      fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
      transactions are sorted by base-snapshot-LSN.  This is slightly
      different from the existing (sorted by first-LSN) list, because a
      transaction can have an earlier LSN but a later Xmin, if its first
      record does not obtain an xmin (eg. xl_xact_assignment).  Note this new
      list doesn't fully replace the existing txn list: we still need that one
      to prevent WAL recycling.
      
      The second issue concerns SnapBuilder snapshots and subtransactions.
      SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
      transaction that is known to be a subtxn, which is good in the common
      case that the top-level transaction already has one (no point in doing
      so), but a bug otherwise.  To fix, arrange to transfer the snapshot from
      the subtxn to its top-level txn as soon as the kinship gets known.
      test_decoding's snapshot_transfer verifies this.
      
      Also, fix a minor memory leak: refcount of toplevel's old base snapshot
      was not decremented when the snapshot is transferred from child.
      
      Liberally sprinkle code comments, and rewrite a few existing ones.  This
      part is my (Álvaro's) contribution to this commit, as I had to write all
      those comments in order to understand the existing code and Arseny's
      patch.
      Reported-by: Arseny Sher <a.sher@postgrespro.ru>
      Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
      Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
      Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
      Reviewed-by: Antonin Houska <ah@cybertec.at>
      Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
      f49a80c4
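The two orderings mentioned in this commit message can be made concrete with a small toy model. This is a hedged Python sketch, not the reorderbuffer implementation: `Txn`, `ToyReorderBuffer`, and their fields are hypothetical names, LSNs and xids are plain integers, and the model assumes that a snapshot taken at a later LSN never has an older xmin than one taken earlier. Under that assumption, the head of the list sorted by base-snapshot LSN carries the oldest xmin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Txn:
    xid: int
    first_lsn: int
    base_snapshot_lsn: Optional[int] = None
    base_snapshot_xmin: Optional[int] = None


class ToyReorderBuffer:
    def __init__(self):
        self.by_first_lsn = []          # models the existing list
        self.by_base_snapshot_lsn = []  # models the list added by the fix

    def add_txn(self, txn):
        self.by_first_lsn.append(txn)
        self.by_first_lsn.sort(key=lambda t: t.first_lsn)

    def set_base_snapshot(self, txn, lsn, xmin):
        # A transaction enters the second list only once it has a snapshot.
        txn.base_snapshot_lsn = lsn
        txn.base_snapshot_xmin = xmin
        self.by_base_snapshot_lsn.append(txn)
        self.by_base_snapshot_lsn.sort(key=lambda t: t.base_snapshot_lsn)

    def oldest_xmin(self):
        # Assumption: xmin is non-decreasing in snapshot LSN, so the head
        # of the snapshot-ordered list holds the oldest xmin.
        if not self.by_base_snapshot_lsn:
            return None
        return self.by_base_snapshot_lsn[0].base_snapshot_xmin
```

In this model a transaction whose first record does not obtain a snapshot can sit at the head of the first-LSN list while a different transaction sits at the head of the snapshot-ordered list, which is why one list cannot stand in for the other.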
    • Fix upper limit for vacuum_cleanup_index_scale_factor · 4d54543e
      Alexander Korotkov authored
      Commit 6ca33a88 set the upper limit for
      vacuum_cleanup_index_scale_factor to DBL_MAX.  DBL_MAX appears to be
      platform-dependent. That causes many buildfarm animals to fail, because
      we check the boundaries of vacuum_cleanup_index_scale_factor in
      regression tests.
      
      This commit changes the upper limit from DBL_MAX to a "large enough"
      limit, which was arbitrarily selected as 1e10.
      
      Author: Alexander Korotkov
      Reported-by: Tom Lane, Darafei Praliaskouski
      Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com
      Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
      4d54543e
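Why a fixed bound helps tests can be sketched as follows. This is a minimal Python illustration, not the server's range-checking code: `check_scale_factor` and the constant names are hypothetical, and the lower bound of 0 is an assumption made for the sketch. With a literal upper bound, the text of an out-of-range error is the same everywhere, so a test that compares expected output does not depend on how the platform prints its largest double.

```python
# Fixed upper bound, the value chosen in the commit message.
MAX_SCALE_FACTOR = 1e10
# Assumption for this sketch: the lower bound is taken to be 0.
MIN_SCALE_FACTOR = 0.0


def check_scale_factor(value):
    """Return value if it lies within the fixed bounds, else raise."""
    if not (MIN_SCALE_FACTOR <= value <= MAX_SCALE_FACTOR):
        raise ValueError(
            f"{value:g} is outside the valid range "
            f"({MIN_SCALE_FACTOR:g} .. {MAX_SCALE_FACTOR:g})"
        )
    return value
```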