1. 17 Nov, 2014 7 commits
    • Add pg_dump --snapshot option · be1cc8f4
      Simon Riggs authored
      Allows pg_dump to use a snapshot previously defined by a concurrent
      session that has either used pg_export_snapshot() or obtained a
      snapshot when creating a logical slot. When this option is used with
      parallel pg_dump, the snapshot defined by this option is used and no
      new snapshot is taken.
      
      Simon Riggs and Michael Paquier
    • Tom Lane's avatar
      83205404
    • Add --synchronous option to pg_receivexlog, for more reliable WAL writing. · c4f99d20
      Fujii Masao authored
      Previously pg_receivexlog flushed WAL data only when the WAL file was switched.
      Then 3dad73e7 added the -F option to pg_receivexlog so that users could control
      how frequently sync commands were issued to WAL files. It also allowed users
      to make pg_receivexlog flush WAL data immediately after writing by
      specifying 0 for the -F option. However, feedback messages were not sent back
      immediately even after the flush location was updated. So even if WAL data
      was flushed in real time, the server could not see that for a while.
      
      This commit removes the -F option from pg_receivexlog and adds --synchronous.
      If --synchronous is specified, pg_receivexlog, like the standby's wal receiver,
      flushes WAL data as soon as there is WAL data which has not been flushed yet.
      Then it sends back the feedback message identifying the latest flush location
      to the server. This option is useful, for example, to make pg_receivexlog
      behave as a synchronous standby by using a replication slot.
      
      Original patch by Furuya Osamu, heavily rewritten by me.
      Reviewed by Heikki Linnakangas, Alvaro Herrera and Sawada Masahiko.
    • Update time zone data files to tzdata release 2014j. · bc241488
      Tom Lane authored
      DST law changes in the Turks & Caicos Islands (America/Grand_Turk) and
      in Fiji.  New zone Pacific/Bougainville for portions of Papua New Guinea.
      Historical changes for Korea and Vietnam.
    • Fix WAL-logging of B-tree "unlink halfdead page" operation. · c73669c0
      Heikki Linnakangas authored
      There was some confusion on how to record the case that the operation
      unlinks the last non-leaf page in the branch being deleted.
      _bt_unlink_halfdead_page set the "topdead" field in the WAL record to
      the leaf page, but the redo routine assumed that it would be an invalid
      block number in that case. This commit fixes _bt_unlink_halfdead_page to
      do what the redo routine expected.
      
      This code is new in 9.4, so backpatch there.
    • Fix relpersistence setting in reindex_index · 0f9692b4
      Alvaro Herrera authored
      Buildfarm members with CLOBBER_CACHE_ALWAYS advised us that commit
      85b506bb was mistaken in setting the relpersistence value of the
      index directly in the relcache entry, within reindex_index.  The reason
      for the failure is that an invalidation message that comes after mucking
      with the relcache entry directly, but before writing it to the catalogs,
      would cause the entry to become rebuilt in place from catalogs with the
      old contents, losing the update.
      
      Fix by passing the correct persistence value to
      RelationSetNewRelfilenode instead; this routine also writes the updated
      tuple to pg_class, avoiding the problem.  Suggested by Tom Lane.
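
The failure mode described in this commit message can be sketched with a toy model. This is not PostgreSQL code: the class, the function names and the string values are all hypothetical, and the dictionary merely stands in for the catalog. It only shows why a change made to a cached entry alone is lost when the entry is rebuilt, while a change written to the backing store first survives.

```python
# Toy model of a cache entry that is rebuilt from a backing catalog.
# All names here are hypothetical; this is not PostgreSQL's relcache.

class ToyCache:
    def __init__(self, catalog):
        self.catalog = catalog          # authoritative store (stands in for pg_class)
        self.entries = dict(catalog)    # cached copies of the catalog rows

    def invalidate(self, key):
        # An invalidation rebuilds the entry from the catalog's current contents.
        self.entries[key] = self.catalog[key]


def set_directly_then_invalidate():
    cache = ToyCache({"idx": "unlogged"})
    cache.entries["idx"] = "permanent"  # change only the cached copy
    cache.invalidate("idx")             # invalidation arrives before the catalog write
    return cache.entries["idx"]


def write_catalog_then_invalidate():
    cache = ToyCache({"idx": "unlogged"})
    cache.catalog["idx"] = "permanent"  # persist the new value first
    cache.invalidate("idx")             # the rebuild now picks up the new value
    return cache.entries["idx"]
```

In the first function the rebuild restores the old value; in the second the new value is already in the backing store when the rebuild happens.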
    • Translation updates · 7466a1b7
      Peter Eisentraut authored
  2. 16 Nov, 2014 2 commits
  3. 15 Nov, 2014 7 commits
    • Emit msg re skipping ANALYZE for absent inh tree · 0f66d212
      Simon Riggs authored
      When checking a table that has an inheritance tree marked,
      if no child tables remain, we skip ANALYZE. This patch emits
      a message to show that the action has been skipped.
      
      Author: Etsuro Fujita
      Reviewer: Furuya Osamu
    • Get rid of SET LOGGED indexes persistence kludge · 85b506bb
      Alvaro Herrera authored
      This removes ATChangeIndexesPersistence() introduced by f41872d0
      which was too ugly to live for long.  Instead, the correct persistence
      marking is passed all the way down to reindex_index, so that the
      transient relation built to contain the index relfilenode can
      get marked correctly right from the start.
      
      Author: Fabrízio de Royes Mello
      Review and editorialization by Michael Paquier
                                           and Álvaro Herrera
    • Remove unused InhPaths · e4d1e264
      Alvaro Herrera authored
      Allegedly, the last remaining usages of that struct were removed by
      0e99be1c.
      
      Author: Peter Geoghegan
    • Alvaro Herrera's avatar
    • Fix initdb --sync-only to also sync tablespaces. · 522c85a6
      Andres Freund authored
      630cd144 added initdb --sync-only, for use by pg_upgrade, by just
      exposing the existing fsync code. That's wrong, because initdb so far
      had absolutely no reason to deal with tablespaces.
      
      Fix --sync-only by additionally explicitly syncing each of the
      tablespaces.
      
      Backpatch to 9.3 where --sync-only was introduced.
      
      Abhijit Menon-Sen and Andres Freund
    • Sync unlogged relations to disk after they have been reset. · 98ec7fd9
      Andres Freund authored
      Unlogged relations are only reset when performing an unclean
      restart. That means they have to be synced to disk during clean
      shutdowns. During normal processing that's achieved by registering a
      buffer's file to be fsynced at the next checkpoint when flushed. But
      ResetUnloggedRelations() doesn't go through the buffer manager, so
      nothing will force reset relations to disk before the next shutdown
      checkpoint.
      
      So just make ResetUnloggedRelations() fsync the newly created main
      forks to disk.
      
      Discussion: 20140912112246.GA4984@alap3.anarazel.de
      
      Backpatch to 9.1 where unlogged tables were introduced.
      
      Abhijit Menon-Sen and Andres Freund
    • Ensure unlogged tables are reset even if crash recovery errors out. · d3586fc8
      Andres Freund authored
      Unlogged relations are reset at the end of crash recovery as they're
      only synced to disk during a proper shutdown. Unfortunately that and
      later steps can fail, e.g. due to running out of space. This reset
      was, up to now, performed after marking the database as having finished
      crash recovery successfully. As out of space errors trigger a crash
      restart, that could lead to the situation that not all unlogged
      relations are reset.
      
      Once that happened, usage of unlogged relations could yield errors like
      "could not open file "...": No such file or directory". Luckily,
      clusters that show the problem can be fixed by performing an immediate
      shutdown and starting the database again.
      
      To fix, just call ResetUnloggedRelations(UNLOGGED_RELATION_INIT)
      earlier, before marking the database as having successfully recovered.
      
      Discussion: 20140912112246.GA4984@alap3.anarazel.de
      
      Backpatch to 9.1 where unlogged tables were introduced.
      
      Abhijit Menon-Sen and Andres Freund
  4. 14 Nov, 2014 9 commits
    • Document evaluation-order considerations for aggregate functions. · 0ce627d4
      Tom Lane authored
      The SELECT reference page didn't really address the question of when
      aggregate function evaluation occurs, nor did the "expression evaluation
      rules" documentation mention that CASE can't be used to control whether
      an aggregate gets evaluated or not.  Improve that.
      
      Per discussion of bug #11661.  Original text by Marti Raudsepp and Michael
      Paquier, rewritten significantly by me.
    • Clean up includes from RLS patch · 80eacaa3
      Stephen Frost authored
      The initial patch for RLS mistakenly included headers associated with
      the executor and planner bits in rewrite/rowsecurity.h.  Per policy and
      general good sense, executor headers should not be included in planner
      headers or vice versa.
      
      The include of execnodes.h was a mistaken holdover from previous
      versions, while the include of relation.h was used for Relation's
      definition, which should have been coming from utils/relcache.h.  This
      patch cleans these issues up, adds comments to the RowSecurityPolicy
      struct and the RowSecurityConfigType enum, and changes Relation->rsdesc
      to Relation->rd_rsdesc to follow Relation field naming convention.
      
      Additionally, utils/rel.h was including rewrite/rowsecurity.h, which
      wasn't a great idea since that was pulling in things not really needed
      in utils/rel.h (which gets included in quite a few places).  Instead,
      use 'struct RowSecurityDesc' for the rd_rsdesc field and add comments
      explaining why.
      
      Lastly, add an include into access/nbtree/nbtsort.c for
      utils/sortsupport.h, which was evidently missed due to the above mess.
      
      Pointed out by Tom in 16970.1415838651@sss.pgh.pa.us; note that the
      concerns regarding a similar situation in the custom-path commit still
      need to be addressed.
    • Document BRIN's pages_per_range in CREATE INDEX · 79172a58
      Alvaro Herrera authored
      Author: Michael Paquier
    • Revert change to ALTER TABLESPACE summary. · 155c0f24
      Stephen Frost authored
      When ALTER TABLESPACE MOVE ALL was changed to be ALTER TABLE ALL IN
      TABLESPACE, the ALTER TABLESPACE summary should have been adjusted back
      to its original definition.
      
      Patch by Thom Brown (thanks!).
    • Reduce disk footprint of brin regression test · 86cf9a56
      Alvaro Herrera authored
      Per complaint from Tom.
      
      While at it, throw in some extra tests for nulls as well, and make sure
      that the set of data we insert on the second round is not identical to
      the first one.  Both measures are intended to improve coverage of the
      test.
      
      Also uncomment the ON COMMIT DROP clause on the CREATE TEMP TABLE
      commands.  This doesn't have any effect for someone examining the
      regression database after the tests are done, but it reduces clutter for
      those that execute the script directly.
    • Allow interrupting GetMultiXactIdMembers · 51f9ea25
      Alvaro Herrera authored
      This function has a loop which can lead to uninterruptible process
      "stalls" (actually infinite loops) when some bugs are triggered.  Avoid
      that unpleasant situation by adding a check for interrupts in a place
      that shouldn't degrade performance in the normal case.
      
      Backpatch to 9.3.  Older branches have an identical loop here, but the
      aforementioned bugs are only a problem starting in 9.3 so there doesn't
      seem to be any point in backpatching any further.
    • Move BufferGetBlockNumber() out of heap_page_is_all_visible()'s inner loop. · 0c5af0a5
      Andres Freund authored
      In some workloads BufferGetBlockNumber() shows up in profiles due to
      the sheer number of calls to it (and because it causes cache
      misses). The compiler can't move it out of the loop because it's a
      full extern function call...
    • Add valgrind suppression for pg_atomic_init_u64. · 6c878edc
      Andres Freund authored
      pg_atomic_init_u64 (indirectly) uses compare/exchange to guarantee
      atomic writes on platforms where compare/exchange is available, but
      64bit writes aren't atomic (yes, those exist). That leads to a
      harmless read of the initial value of variable.
    • Improve logical decoding log messages · a15d387c
      Peter Eisentraut authored
      suggestions from Robert Haas
  5. 13 Nov, 2014 10 commits
    • Adapt valgrind.supp to the XLogInsert() split. · 473f162c
      Andres Freund authored
      The CRC computation now happens in XLogInsertRecord(), not
      XLogInsert() itself anymore.
    • Fix pg_dumpall to restore its ability to dump from ancient servers. · be09ceb2
      Tom Lane authored
      Fix breakage induced by commits d8d3d2a4
      and 463f2625: pg_dumpall has crashed when
      attempting to dump from pre-8.1 servers since then, due to faulty
      construction of the query used for dumping roles from older servers.
      The query was erroneous as of the earlier commit, but it wasn't exposed
      unless you tried to use --binary-upgrade, which you presumably wouldn't
      with a pre-8.1 server.  However commit 463f2625 made it fail always.
      
      In HEAD, also fix additional breakage induced in the same query by
      commit 491c029d, which evidently wasn't
      tested against pre-8.1 servers either.
      
      The bug is only latent in 9.1 because 463f2625 hadn't landed yet, but
      it seems best to back-patch all branches containing the faulty query.
      
      Gilles Darold
    • Fix and improve cache invalidation logic for logical decoding. · 89fd41b3
      Andres Freund authored
      There are basically three situations in which logical decoding needs
      to perform cache invalidation: during or after replaying a transaction
      with catalog changes, when skipping an uninteresting transaction that
      performed catalog changes, and when erroring out while replaying a
      transaction. Unfortunately these three cases were all done slightly
      differently - partially because 8de3e410, which greatly simplifies
      matters, got committed in the midst of the development of logical
      decoding.
      
      The actually problematic case was when logical decoding skipped
      transaction commits (and thus processed invalidations). When used via
      the SQL interface cache invalidation could access the catalog - bad,
      because we didn't set up enough state to allow that correctly. It'd
      not be hard to set up sufficient state, but the simpler solution is to
      always perform cache invalidation outside a valid transaction.
      
      Also make the different cache invalidation cases look as similar as
      possible, to ease code review.
      
      This fixes the assertion failure reported by Antonin Houska in
      53EE02D9.7040702@gmail.com. The presented testcase has been expanded
      into a regression test.
      
      Backpatch to 9.4, where logical decoding was introduced.
    • Fix xmin/xmax horizon computation during logical decoding initialization. · 5a2c1840
      Andres Freund authored
      When building the initial historic catalog snapshot there were
      scenarios where snapbuild.c would use incorrect xmin/xmax values when
      starting from an xl_running_xacts record. The values used were always a
      bit suspect, but happened to be correct in the easy to test
      cases. Notably the values used when the initial snapshot was
      computed while no other transactions were running were correct.
      
      This is likely to be the cause of the occasional buildfarm failures on
      animals markhor and tick; but it's quite possible to reproduce
      problems without CLOBBER_CACHE_ALWAYS.
      
      Backpatch to 9.4, where logical decoding was introduced.
    • Fix race condition between hot standby and restoring a full-page image. · 81c45081
      Heikki Linnakangas authored
      There was a window in RestoreBackupBlock where a page would be zeroed out,
      but not yet locked. If a backend pinned and locked the page in that window,
      it saw the zeroed page instead of the old page or new page contents, which
      could lead to missing rows in a result set, or errors.
      
      To fix, replace RBM_ZERO with RBM_ZERO_AND_LOCK, which atomically pins,
      zeroes, and locks the page, if it's not in the buffer cache already.
      
      In stable branches, the old RBM_ZERO constant is renamed to RBM_DO_NOT_USE,
      to avoid breaking any 3rd party extensions that might use RBM_ZERO. More
      importantly, this avoids renumbering the other enum values, which would
      cause even bigger confusion in extensions that use ReadBufferExtended, but
      haven't been recompiled.
      
      Backpatch to all supported versions; this has been racy since hot standby
      was introduced.
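
The renumbering concern can be illustrated with a small sketch. It is written in Python rather than C, and the layouts are assumptions made for illustration: only RBM_ZERO, RBM_DO_NOT_USE and RBM_ZERO_AND_LOCK come from the commit message, MODE_A and MODE_B are placeholders, and placing the new member at the end is a guess rather than a statement about the real header. The point is that a binary compiled against the old layout carries raw numbers, so the meaning of those numbers must not shift.

```python
from enum import IntEnum

# Hypothetical "before" layout; only RBM_ZERO is taken from the commit message.
class ModeBefore(IntEnum):
    MODE_A = 0
    RBM_ZERO = 1
    MODE_B = 2

# Removing the member outright shifts every later value down by one.
class ModeRemoved(IntEnum):
    MODE_A = 0
    MODE_B = 1
    RBM_ZERO_AND_LOCK = 2

# Keeping a placeholder in the old slot leaves the later values untouched.
class ModeRenamed(IntEnum):
    MODE_A = 0
    RBM_DO_NOT_USE = 1
    MODE_B = 2
    RBM_ZERO_AND_LOCK = 3

def decode(enum_cls, raw_value):
    """Interpret a value baked into an old binary under a newer enum layout."""
    return enum_cls(raw_value).name
```

An old caller that passes the raw value of `ModeBefore.MODE_B` is misread under the "removed" layout but keeps its meaning under the "renamed" one.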
    • Tweak row-level locking documentation · 35fed516
      Alvaro Herrera authored
      Move the meat of locking levels to mvcc.sgml, leaving only a link to it
      in the SELECT reference page.
      
      Michael Paquier, with some tweaks by Álvaro
    • Move the guts of our Levenshtein implementation into core. · c0828b78
      Robert Haas authored
      The hope is that we can use this to produce better diagnostics in
      some cases.
      
      Peter Geoghegan, reviewed by Michael Paquier, with some further
      changes by me.
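
For readers unfamiliar with the algorithm, here is a minimal textbook sketch of Levenshtein distance in Python. It is not the PostgreSQL implementation, and the `closest_name` helper is a hypothetical illustration of how an edit distance might feed a diagnostic (for example, suggesting a similarly spelled name); the commit message itself does not say which diagnostics are intended.

```python
def levenshtein(source, target):
    """Return the edit distance (insert, delete, substitute; each costs 1)."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest_name(misspelled, candidates):
    """Pick the candidate with the smallest distance (hypothetical helper)."""
    return min(candidates, key=lambda name: levenshtein(misspelled, name))
```

The two-row formulation keeps memory proportional to the length of the target string rather than to the full distance matrix.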
    • Peter Eisentraut's avatar
      1d69ae41
    • Heikki Linnakangas's avatar
    • Rename pending_list_cleanup_size to gin_pending_list_limit. · c291503b
      Fujii Masao authored
      Since this parameter applies only to GIN indexes, it's better to
      add "gin" to the parameter name for easier understanding.
  6. 12 Nov, 2014 5 commits
    • Explicitly support the case that a plancache's raw_parse_tree is NULL. · 67770803
      Tom Lane authored
      This only happens if a client issues a Parse message with an empty query
      string, which is a bit odd; but since it is explicitly called out as legal
      by our FE/BE protocol spec, we'd probably better continue to allow it.
      
      Fix by adding tests everywhere that the raw_parse_tree field is passed to
      functions that don't or shouldn't accept NULL.  Also make it clear in the
      relevant comments that NULL is an expected case.
      
      This reverts commits a73c9dba and
      2e9650cb, which fixed specific crash
      symptoms by hacking things at what now seems to be the wrong end, ie the
      callee functions.  Making the callees allow NULL is superficially more
      robust, but it's not always true that there is a defensible thing for the
      callee to do in such cases.  The caller has more context and is better
      able to decide what the empty-query case ought to do.
      
      Per followup discussion of bug #11335.  Back-patch to 9.2.  The code
      before that is sufficiently different that it would require development
      of a separate patch, which doesn't seem worthwhile for what is believed
      to be an essentially cosmetic change.
    • Fix several weaknesses in slot and logical replication on-disk serialization. · ec5896ae
      Andres Freund authored
      Heikki noticed in 544E23C0.8090605@vmware.com that slot.c and
      snapbuild.c were missing the FIN_CRC32 call when computing/checking
      checksums of on disk files. That doesn't lower the error detection
      capabilities of the checksum, but is inconsistent with other usages.
      
      In a followup mail Heikki also noticed that, contrary to a comment,
      the 'version' and 'length' struct fields of replication slot's on disk
      data were not covered by the checksum. That's not likely to lead to
      actually missed corruption as those fields are cross checked with the
      expected version and the actual file length. But it's wrong
      nonetheless.
      
      As fixing these issues makes existing on disk files unreadable, bump
      the expected versions of on disk files for both slots and logical
      decoding historic catalog snapshots.  This means that loading old
      files will fail with
      ERROR: "replication slot file ... has unsupported version 1"
      and
      ERROR: "snapbuild state file ... has unsupported version 1 instead of
      2" respectively. Given the low likelihood of anybody already using
      these new features in a production setup that seems acceptable.
      
      Fixing these issues made me notice that there's no regression test
      covering the loading of historic snapshot from disk - so add one.
      
      Backpatch to 9.4 where these features were introduced.
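
The two checksum points above can be sketched in Python, using `zlib.crc32` as a stand-in. This is an illustration under assumptions: it does not claim to match PostgreSQL's CRC macros bit for bit, and the header layout and function names are hypothetical. In standard CRC-32 the finalize step is an XOR with all ones, so skipping it changes the stored value without changing which corruptions are detectable; fields left outside the checksummed range, by contrast, can change without the checksum noticing.

```python
import zlib

MASK = 0xFFFFFFFF

def crc_finalized(data):
    """Standard CRC-32, including the final XOR step."""
    return zlib.crc32(data) & MASK

def crc_unfinalized(data):
    """Same CRC with the final XOR undone, i.e. the finalize step skipped."""
    return crc_finalized(data) ^ MASK

def checksum_body_only(version, length, body):
    """Hypothetical layout: the header fields are NOT covered by the checksum."""
    return crc_finalized(body)

def checksum_with_header(version, length, body):
    """Hypothetical layout: the header fields are covered as well."""
    header = version.to_bytes(4, "little") + length.to_bytes(4, "little")
    return crc_finalized(header + body)
```

Because the finalized and unfinalized values always differ, a file written without the finalize step no longer verifies once the step is added, which is consistent with the version bump described in the message.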
    • Add interrupt checks to contrib/pg_prewarm. · bd4ae0f3
      Andres Freund authored
      Until now, the extension's pg_prewarm() function didn't check
      interrupts once it started "warming" data. Since individual calls can
      take a long while it's important for them to be interruptible.
      
      Backpatch to 9.4 where pg_prewarm was introduced.
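
The general pattern, checking for a pending cancel request once per unit of work inside a long loop, can be sketched as follows. This is a Python illustration with hypothetical names (`QueryCancelled`, `check_for_interrupts`, `prewarm_blocks`); it is not the extension's code and says nothing about where the real check was placed.

```python
import threading

class QueryCancelled(Exception):
    """Raised when a pending cancel request is noticed (hypothetical)."""

def check_for_interrupts(cancel_requested):
    # Stand-in for an interrupt check: raise if a cancel was requested.
    if cancel_requested.is_set():
        raise QueryCancelled()

def prewarm_blocks(block_count, cancel_requested, load_block):
    """Load blocks one by one, checking for a cancel request before each."""
    loaded = 0
    for block_number in range(block_count):
        check_for_interrupts(cancel_requested)
        load_block(block_number)
        loaded += 1
    return loaded
```

With the check inside the loop, a cancel request takes effect before the next block is loaded instead of only after the whole call finishes.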
    • Use just one database connection in the "tablespace" test. · 28245b84
      Noah Misch authored
      On Windows, DROP TABLESPACE has a race condition when run concurrently
      with other processes having opened files in the tablespace.  This led to
      a rare failure on buildfarm member frogmouth.  Back-patch to 9.4, where
      the reconnection was introduced.
    • Message improvements · 8339f33d
      Peter Eisentraut authored