1. 08 Oct, 2014 1 commit
    • Extend shm_mq API with new functions shm_mq_sendv, shm_mq_set_handle. · 7bb0e974
      Robert Haas authored
      shm_mq_sendv sends a message to the queue assembled from multiple
      locations.  This is expected to be used by forthcoming patches to
      allow frontend/backend protocol messages to be sent via shm_mq, but
      might be useful for other purposes as well.
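
      As a rough illustration of the gather-style send described above (not
      code from the patch), the sketch below assembles one message from several
      buffers.  The Chunk type and gather_message() are hypothetical names
      invented for this example; the real shm_mq_sendv writes into a
      shared-memory queue rather than a local buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one piece of a message: pointer plus length. */
typedef struct
{
    const char *data;
    size_t      len;
} Chunk;

/*
 * Copy iovcnt chunks back to back into dst so that they form one message.
 * Returns the total message length, or -1 if dst is too small (in which
 * case nothing is written).
 */
static long
gather_message(const Chunk *iov, int iovcnt, char *dst, size_t dstcap)
{
    size_t total = 0;
    size_t pos = 0;
    int    i;

    /* First pass: work out the size of the assembled message. */
    for (i = 0; i < iovcnt; i++)
        total += iov[i].len;
    if (total > dstcap)
        return -1;

    /* Second pass: copy each piece to its place in the message. */
    for (i = 0; i < iovcnt; i++)
    {
        memcpy(dst + pos, iov[i].data, iov[i].len);
        pos += iov[i].len;
    }
    return (long) total;
}
```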
      
      shm_mq_set_handle associates a BackgroundWorkerHandle with an
      already-existing shm_mq_handle.  This solves a timing problem when
      creating a shm_mq to communicate with a newly-launched background
      worker: if you attach to the queue first, and the background worker
      fails to start, you might block forever trying to do I/O on the queue;
      but if you start the background worker first, but then die before
      attaching to the queue, the background worker might block forever
      trying to do I/O on the queue.  This lets you attach before starting
      the worker (so that the worker is protected) and then associate the
      BackgroundWorkerHandle later (so that you are also protected).
      
      Patch by me, reviewed by Stephen Frost.
  2. 07 Oct, 2014 3 commits
    • Implement SKIP LOCKED for row-level locks · df630b0d
      Alvaro Herrera authored
      This clause changes the behavior of SELECT locking clauses in the
      presence of locked rows: instead of causing a process to block waiting
      for the locks held by other processes (or raise an error, with NOWAIT),
      SKIP LOCKED makes the new reader skip over such rows.  While this is not
      appropriate behavior for general purposes, there are some cases in which
      it is useful, such as queue-like tables.
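
      The difference between NOWAIT and SKIP LOCKED can be pictured with a
      small single-process simulation.  This is only an analogy for the SQL
      clause (SELECT ... FOR UPDATE SKIP LOCKED), not server code; Row,
      LockPolicy and claim_first() are hypothetical names, and the default
      blocking behavior is left out because it cannot be shown without a
      second process.

```c
#include <stddef.h>

/* Hypothetical policies mirroring the two non-blocking behaviors. */
typedef enum { POLICY_NOWAIT, POLICY_SKIP_LOCKED } LockPolicy;

typedef struct
{
    int id;
    int locked;     /* nonzero if some session already holds the row lock */
} Row;

#define CLAIM_NONE  (-1)    /* no unlocked row was found */
#define CLAIM_ERROR (-2)    /* NOWAIT met a locked row */

/*
 * Scan the "queue table" in order, lock the first available row and
 * return its id.  A locked row raises an error under NOWAIT and is
 * silently passed over under SKIP LOCKED.
 */
static int
claim_first(Row *rows, size_t nrows, LockPolicy policy)
{
    size_t i;

    for (i = 0; i < nrows; i++)
    {
        if (rows[i].locked)
        {
            if (policy == POLICY_NOWAIT)
                return CLAIM_ERROR;
            continue;           /* SKIP LOCKED: move on to the next row */
        }
        rows[i].locked = 1;     /* take the row lock ourselves */
        return rows[i].id;
    }
    return CLAIM_NONE;
}
```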
      
      Catalog version bumped because this patch changes the representation of
      stored rules.
      
      Reviewed by Craig Ringer (based on a previous attempt at an
      implementation by Simon Riggs, who also provided input on the syntax
      used in the current patch), David Rowley, and Álvaro Herrera.
      
      Author: Thomas Munro
    • Fix typo in elog message. · c421efd2
      Robert Haas authored
    • Fix array overrun in ecpg's version of ParseDateTime(). · 55bfdd1c
      Tom Lane authored
      The code wrote a value into the caller's field[] array before checking
      to see if there was room, which of course is backwards.  Per report from
      Michael Paquier.
      
      I fixed the equivalent bug in the backend's version of this code way back
      in 630684d3, but failed to think about ecpg's copy.  Fortunately
      this doesn't look like it would be exploitable for anything worse than a
      core dump: an external attacker would have no control over the single word
      that gets written.
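
      The ordering problem can be shown in miniature.  The helper below is a
      hypothetical sketch, not the ecpg code: it checks for room before
      storing into the caller's array, which is the order the fix restores.

```c
#include <stddef.h>

/*
 * Hypothetical helper: record one token pointer in field[], refusing the
 * write when the array is already full.  Returns 0 on success, -1 if
 * there is no room.
 */
static int
append_field(char **field, int maxfields, int *nf, char *token)
{
    /* Check for room BEFORE storing; the bug did these steps in reverse. */
    if (*nf >= maxfields)
        return -1;
    field[*nf] = token;
    (*nf)++;
    return 0;
}
```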
  3. 06 Oct, 2014 4 commits
    • Clean up Create/DropReplicationSlot query buffer · 273b29db
      Stephen Frost authored
      CreateReplicationSlot() and DropReplicationSlot() were not cleaning up
      the query buffer in some cases (mostly error conditions) which meant a
      small leak.  Not generally an issue as the error case would result in an
      immediate exit, but not difficult to fix either and reduces the number
      of false positives from code analyzers.
      
      In passing, also add appropriate PQclear() calls to RunIdentifySystem().
      
      Pointed out by Coverity.
    • Add support for managing physical replication slots to pg_receivexlog. · d9f38c7a
      Andres Freund authored
      pg_receivexlog already has the capability to use a replication slot to
      reserve WAL on the upstream node. But the slot it uses currently has to
      be created via SQL.
      
      To allow using slots directly, without involving SQL, add
      --create-slot and --drop-slot actions, analogous to the logical slot
      manipulation support in pg_recvlogical.
      
      Author: Michael Paquier
      Discussion: CABUevEx+zrOHZOQg+dPapNPFRJdsk59b=TSVf30Z71GnFXhQaw@mail.gmail.com
    • Rename pg_recvlogical's --create/--drop to --create-slot/--drop-slot. · c8b6cba8
      Andres Freund authored
      A future patch (9.5 only) adds slot management to pg_receivexlog. The
      verbs create/drop don't seem descriptive enough there. It seems better
      to rename pg_recvlogical's commands now, in beta, than live with the
      inconsistency forever.
      
      The old form (e.g. --drop) will still be accepted, because most
      getopt_long() implementations accept unambiguous abbreviations of long options.
      
      Backpatch to 9.4 where pg_recvlogical was introduced.
      
      Author: Michael Paquier and Andres Freund
      Discussion: CAB7nPqQtt79U6FmhwvgqJmNyWcVCbbV-nS72j_jyPEopERg9rg@mail.gmail.com
    • Translation updates · 1ec4a970
      Peter Eisentraut authored
  4. 05 Oct, 2014 2 commits
    • Update 9.4 release notes for commits through today. · f706f2c1
      Tom Lane authored
      Add entries for recent changes, including noting the JSONB format change
      and the recent timezone data changes.  We should remove those two items
      before 9.4 final: the JSONB change will be of no interest in the long
      run, and it's not normally our habit to mention timezone updates in
      major-release notes.  But it seems important to document them temporarily
      for beta testers.
      
      I failed to resist the temptation to wordsmith a couple of existing
      entries, too.
    • Eliminate one background-worker-related flag variable. · d0410d66
      Robert Haas authored
      Teach sigusr1_handler() to use the same test for whether a worker
      might need to be started as ServerLoop().  Aside from being perhaps
      a bit simpler, this prevents a potentially-unbounded delay when
      starting a background worker.  On some platforms, select() doesn't
      return when interrupted by a signal, but is instead restarted,
      including a reset of the timeout to the originally-requested value.
      If signals arrive often enough, but no connection requests arrive,
      sigusr1_handler() will be executed repeatedly, but the body of
      ServerLoop() won't be reached.  This change ensures that, even in
      that case, background workers will eventually get launched.
      
      This is far from a perfect fix; really, we need select() to return
      control to ServerLoop() after an interrupt, either via the self-pipe
      trick or some other mechanism.  But that's going to require more
      work and discussion, so let's do this for now to at least mitigate
      the damage.
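
      The self-pipe trick mentioned above can be sketched generically as
      follows.  This is not PostgreSQL code and not what this commit does;
      setup_wakeup() and wait_for_wakeup() are hypothetical names.  The
      handler writes one byte to a non-blocking pipe, so a select() that
      watches the pipe's read end becomes ready whenever a signal has arrived,
      even on platforms that restart select() after a signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

static int wake_pipe[2] = {-1, -1};

/* Signal handler: only async-signal-safe work, i.e. one write() to the pipe. */
static void
wake_handler(int signo)
{
    int  saved_errno = errno;
    char byte = 0;

    (void) signo;
    if (write(wake_pipe[1], &byte, 1) < 0)
    {
        /* Pipe full: a wakeup is already pending, nothing more to do. */
    }
    errno = saved_errno;
}

/* Create the non-blocking pipe and install the handler for signo. */
static int
setup_wakeup(int signo)
{
    struct sigaction sa;
    int              i;

    if (pipe(wake_pipe) != 0)
        return -1;
    for (i = 0; i < 2; i++)
    {
        int flags = fcntl(wake_pipe[i], F_GETFL);

        if (flags < 0 || fcntl(wake_pipe[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if a signal woke us, 0 on timeout, -1 on error. */
static int
wait_for_wakeup(long timeout_sec)
{
    for (;;)
    {
        fd_set         rfds;
        struct timeval tv;
        char           buf[64];
        int            rc;

        FD_ZERO(&rfds);
        FD_SET(wake_pipe[0], &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        rc = select(wake_pipe[0] + 1, &rfds, NULL, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted: retry, the pipe now has data */
        if (rc <= 0)
            return rc;
        while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
            ;                   /* drain all pending wakeup bytes */
        return 1;
    }
}
```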
      
      Per investigation of test_shm_mq failures on buildfarm member anole.
  5. 04 Oct, 2014 1 commit
    • Update time zone data files to tzdata release 2014h. · 513d06de
      Tom Lane authored
      Most zones in the Russian Federation are subtracting one or two hours
      as of 2014-10-26.  Update the meanings of the abbreviations IRKT, KRAT,
      MAGT, MSK, NOVT, OMST, SAKT, VLAT, YAKT, YEKT to match.
      
      The IANA timezone database has adopted abbreviations of the form AxST/AxDT
      for all Australian time zones, reflecting what they believe to be current
      majority practice Down Under.  These names do not conflict with usage
      elsewhere (other than ACST for Acre Summer Time, which has been in disuse
      since 1994).  Accordingly, adopt these names into our "Default" timezone
      abbreviation set.  The "Australia" abbreviation set now contains only
      CST,EAST,EST,SAST,SAT,WST, all of which are thought to be mostly historical
      usage.  Note that SAST has also been changed to be South Africa Standard
      Time in the "Default" abbreviation set.
      
      Add zone abbreviations SRET (Asia/Srednekolymsk) and XJT (Asia/Urumqi),
      and use WSST/WSDT for western Samoa.
      
      Also a DST law change in the Turks & Caicos Islands (America/Grand_Turk),
      and numerous corrections for historical time zone data.
  6. 03 Oct, 2014 8 commits
    • Update time zone abbreviations lists. · 4f499eee
      Tom Lane authored
      This updates known_abbrevs.txt to be what it should have been already,
      were my -P patch not broken; and updates some tznames/ entries that
      missed getting any love in previous timezone data updates because zic
      failed to flag the change of abbreviation.
      
      The non-cosmetic updates:
      
      * Remove references to "ADT" as "Arabia Daylight Time", an abbreviation
      that's been out of use since 2007; therefore, claiming there is a conflict
      with "Atlantic Daylight Time" doesn't seem especially helpful.  (We have
      left obsolete entries in the files when they didn't conflict with anything,
      but that seems like a different situation.)
      
      * Fix entirely incorrect GMT offsets for CKT (Cook Islands), FJT, FJST
      (Fiji); we didn't even have them on the proper side of the date line.
      (Seems to have been aboriginal errors in our tznames data; there's no
      evidence anything actually changed recently.)
      
      * FKST (Falkland Islands Summer Time) is now used all year round, so
      don't mark it as a DST abbreviation.
      
      * Update SAKT (Sakhalin) to mean GMT+11 not GMT+10.
      
      In cosmetic changes, I fixed a bunch of wrong (or at least obsolete)
      claims about abbreviations not being present in the zic files, and
      tried to be consistent about how obsolete abbreviations are labeled.
      
      Note the underlying timezone/data files are still at release 2014e;
      this is just trying to get us in sync with what those files actually
      say before we go to the next update.
    • Fix CreatePolicy, pg_dump -v; psql and doc updates · 78d72563
      Stephen Frost authored
      Peter G pointed out that valgrind was, rightfully, complaining about
      CreatePolicy() ending up copying beyond the end of the parsed policy
      name.  Name is a fixed-size type and we need to use namein (through
      DirectFunctionCall1()) to flush out the entire array before we pass
      it down to heap_form_tuple.
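
      Why a fixed-size name must be fully initialized can be sketched like
      this.  The code is a simplified stand-in, not the backend's namein:
      NAME_LEN, FixedName and fixed_name_from_cstring() are hypothetical, and
      truncation here is by bytes only.  The point is that every byte of the
      buffer is defined, so code that later copies the whole fixed-size value
      never reads past the end of the original string.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN 64     /* stand-in for the backend's fixed name length */

typedef struct
{
    char data[NAME_LEN];
} FixedName;

/*
 * Build a fixed-size name from a C string: zero the whole buffer, then
 * copy at most NAME_LEN - 1 bytes so the result stays NUL-terminated.
 */
static void
fixed_name_from_cstring(FixedName *out, const char *s)
{
    size_t len = strlen(s);

    if (len >= NAME_LEN)
        len = NAME_LEN - 1;
    memset(out->data, 0, NAME_LEN);
    memcpy(out->data, s, len);
}
```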
      
      Michael Paquier pointed out that pg_dump --verbose was missing a
      newline and Fabrízio de Royes Mello further pointed out that the
      schema was also missing from the messages, so fix those also.
      
      Also, based on an off-list comment from Kevin, rework the psql \d
      output to facilitate copy/pasting into a new CREATE or ALTER POLICY
      command.
      
      Lastly, improve the pg_policies view and update the documentation for
      it, along with a few other minor doc corrections based on an off-list
      discussion with Adam Brightwell.
    • Fix bogus logic for zic -P option. · 59685704
      Tom Lane authored
      The quick hack I added to zic to dump out currently-in-use timezone
      abbreviations turns out to have a nasty bug: within each zone, it was
      printing the last "struct ttinfo" to be *defined*, not necessarily the
      last one in use.  This was mainly a problem in zones that had changed the
      meaning of their zone abbreviation (to another GMT offset value) and later
      changed it back.
      
      As a result of this error, we'd missed out updating the tznames/ files
      for some jurisdictions that have changed their zone abbreviations since
      the tznames/ files were originally created.  I'll address the missing data
      updates in a separate commit.
    • Don't balance vacuum cost delay when per-table settings are in effect · 1021bd6a
      Alvaro Herrera authored
      When there are cost-delay-related storage options set for a table,
      trying to make that table participate in the autovacuum cost-limit
      balancing algorithm produces undesirable results: instead of using the
      configured values, the global values are always used,
      as illustrated by Mark Kirkwood in
      http://www.postgresql.org/message-id/52FACF15.8020507@catalyst.net.nz
      
      Since the mechanism is already complicated, just disable it for those
      cases rather than trying to make it cope.  There are undesirable
      side-effects from this too, namely that the total I/O impact on the
      system will be higher whenever such tables are vacuumed.  However, this
      is seen as less harmful than slowing down vacuum, because that would
      cause bloat to accumulate.  Anyway, in the new system it is possible to
      tweak options to get the precise behavior one wants, whereas with the
      previous system one was simply hosed.
      
      This has been broken forever, so backpatch to all supported branches.
      This might affect systems where cost_limit and cost_delay have been set
      for individual tables.
    • Fix typos in comments. · 017b2e98
      Robert Haas authored
      Etsuro Fujita
    • Still another typo fix for 0709b7ee. · 9019264f
      Robert Haas authored
      Buildfarm member anole caught this one.
    • Check for GiST index tuples that don't fit on a page. · 7690ddea
      Heikki Linnakangas authored
      The page splitting code would go into infinite recursion if you try to
      insert an index tuple that doesn't fit even on an empty page.
      
      Per analysis and suggested fix by Andrew Gierth. Fixes bug #11555, reported
      by Bryan Seitz (analysis happened over IRC). Backpatch to all supported
      versions.
    • Fix documentation for CREATE SEQUENCE IF NOT EXISTS. · 7a08e21f
      Heikki Linnakangas authored
      The [ IF NOT EXISTS ] was put in wrong place in the syntax.
      
      Pointed out by Marti Raudsepp.
  7. 02 Oct, 2014 4 commits
  8. 01 Oct, 2014 8 commits
    • Fix some more problems with nested append relations. · 5a6c168c
      Tom Lane authored
      As of commit a87c7291 (which later got backpatched as far as 9.1),
      we're explicitly supporting the notion that append relations can be
      nested; this can occur when UNION ALL constructs are nested, or when
      a UNION ALL contains a table with inheritance children.
      
      Bug #11457 from Nelson Page, as well as an earlier report from Elvis
      Pranskevichus, showed that there were still nasty bugs associated with such
      cases: in particular the EquivalenceClass mechanism could try to generate
      "join" clauses connecting an appendrel child to some grandparent appendrel,
      which would result in assertion failures or bogus plans.
      
      Upon investigation I concluded that all current callers of
      find_childrel_appendrelinfo() need to be fixed to explicitly consider
      multiple levels of parent appendrels.  The most complex fix was in
      processing of "broken" EquivalenceClasses, which are ECs for which we have
      been unable to generate all the derived equality clauses we would like to
      because of missing cross-type equality operators in the underlying btree
      operator family.  That code path is more or less entirely untested by
      the regression tests to date, because no standard opfamilies have such
      holes in them.  So I wrote a new regression test script to try to exercise
      it a bit, which turned out to be quite a worthwhile activity as it exposed
      existing bugs in all supported branches.
      
      The present patch is essentially the same as far back as 9.2, which is
      where parameterized paths were introduced.  In 9.0 and 9.1, we only need
      to back-patch a small fragment of commit 5b7b5518, which fixes failure to
      propagate out the original WHERE clauses when a broken EC contains constant
      members.  (The regression test case results show that these older branches
      are noticeably stupider than 9.2+ in terms of the quality of the plans
      generated; but we don't really care about plan quality in such cases,
      only that the plan not be outright wrong.  A more invasive fix in the
      older branches would not be a good idea anyway from a plan-stability
      standpoint.)
    • Refactor replication connection code of various pg_basebackup utilities. · 0c013e08
      Andres Freund authored
      Move some more of the code that manages replication connection commands
      to streamutil.c. A later patch will introduce replication slot management
      via pg_receivexlog, and this avoids duplicating the relevant code between
      pg_receivexlog and pg_recvlogical.
      
      Author: Michael Paquier, with some editing by me.
    • pg_recvlogical.c code review. · fdf81c9a
      Andres Freund authored
      Several comments still referred to 'initiating', 'freeing', 'stopping'
      replication slots. These were terms used during different phases of
      the development of logical decoding, but are no longer accurate.
      
      Also rename StreamLog() to StreamLogicalLog() and add 'void' to the
      prototype.
      
      Author: Michael Paquier, with some editing by me.
      
      Backpatch to 9.4 where pg_recvlogical was introduced.
    • Remove num_xloginsert_locks GUC, replace with a #define · 5fa6c81a
      Heikki Linnakangas authored
      I left the GUC in place for the beta period, so that people could experiment
      with different values. No-one's come up with any data that a different value
      would be better under some circumstances, so rather than try to document to
      users what the GUC does, let's just hard-code the current value, 8.
    • Block signals while computing the sleep time in postmaster's main loop. · a39e78b7
      Andres Freund authored
      DetermineSleepTime() was previously called without blocked
      signals. That's not good, because it allows signal handlers to
      interrupt its workings.
      
      DetermineSleepTime() was added in 9.3 with the addition of background
      workers (da07a1e8), where it only read from
      BackgroundWorkerList.
      
      Since 9.4, where dynamic background workers were added (7f7485a0),
      the list is also manipulated in DetermineSleepTime(). That's bad
      because the list now can be persistently corrupted if modified by both
      a signal handler and DetermineSleepTime().
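
      The general technique, blocking a signal around code that touches data
      the handler also modifies, can be sketched with plain POSIX calls.  This
      is a generic illustration, not the postmaster code; all function names
      below are hypothetical.  The demo critical section raises the signal at
      itself to show that delivery is deferred until the mask is restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs = 0;
static int runs_seen_inside = -1;

/* Handler: stands in for code that would modify shared state. */
static void
on_sigusr1(int signo)
{
    (void) signo;
    handler_runs++;
}

static int
install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Demo critical section: the signal is raised but stays pending here. */
static void
demo_critical_section(void)
{
    raise(SIGUSR1);
    runs_seen_inside = handler_runs;
}

/*
 * Run critical() with SIGUSR1 blocked, then restore the previous mask.
 * A signal that arrived meanwhile is delivered when the mask is restored.
 */
static int
run_with_sigusr1_blocked(void (*critical)(void))
{
    sigset_t block_set;
    sigset_t old_set;

    sigemptyset(&block_set);
    sigaddset(&block_set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block_set, &old_set) != 0)
        return -1;
    critical();
    return sigprocmask(SIG_SETMASK, &old_set, NULL);
}
```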
      
      This was discovered during the investigation of hangs on buildfarm
      member anole. It's unclear whether this bug is the source of these
      hangs or not, but it's worth fixing either way. I have confirmed that
      it can cause crashes.
      
      It luckily looks like this only can cause problems when bgworkers are
      actively used.
      
      Discussion: 20140929193733.GB14400@awork2.anarazel.de
      
      Backpatch to 9.3 where background workers were introduced.
    • Add functions for dealing with PGP armor header lines to pgcrypto. · 32984d8f
      Heikki Linnakangas authored
      This adds a new pgp_armor_headers function to extract armor headers from an
      ASCII-armored blob, and a new overloaded variant of the armor function, for
      constructing an ASCII-armor with extra headers.
      
      Marko Tiikkaja and me.
    • Improve documentation about binary/textual output mode for output plugins. · 0ef3c29a
      Andres Freund authored
      Also improve related error message as it contributed to the confusion.
      
      Discussion: CAB7nPqQrqFzjqCjxu4GZzTrD9kpj6HMn9G5aOOMwt1WZ8NfqeA@mail.gmail.com,
          CAB7nPqQXc_+g95zWnqaa=mVQ4d3BVRs6T41frcEYi2ocUrR3+A@mail.gmail.com
      
      Per discussion between Michael Paquier, Robert Haas and Andres Freund
      
      Backpatch to 9.4 where logical decoding was introduced.
    • Rename CACHE_LINE_SIZE to PG_CACHE_LINE_SIZE. · ef886384
      Andres Freund authored
      As noted in http://bugs.debian.org/763098 there is a conflict between
      postgres' definition of CACHE_LINE_SIZE and the definition by various
      *bsd platforms. It's debatable who has the right to define such a
      name, but postgres' use was only introduced in 375d8526 (9.4), so
      it seems like a good idea to rename it.
      
      Discussion: 20140930195756.GC27407@msg.df7cb.de
      
      Per complaint of Christoph Berg in the above email, although he's not
      the original bug reporter.
      
      Backpatch to 9.4 where the define was introduced.
  9. 30 Sep, 2014 3 commits
  10. 29 Sep, 2014 4 commits
    • doc fix for pg_recvlogical: --create doesn't immediately exit. · 445d2628
      Andres Freund authored
      Author: Michael Paquier
    • Also revert e3ec0728, JSON regression tests · 08da8947
      Stephen Frost authored
      Managed to forget to update the other JSON regression test output,
      again.  Revert the commit which fixed it before.
      
      Per buildfarm.
    • Revert 95d737ff to add 'ignore_nulls' · c8a026e4
      Stephen Frost authored
      Per discussion, revert the commit which added 'ignore_nulls' to
      row_to_json.  This capability would be better added as an independent
      function rather than being bolted on to row_to_json.  Additionally,
      the implementation didn't address complex JSON objects, and so was
      incomplete anyway.
      
      Pointed out by Tom and discussed with Andrew and Robert.
    • Change JSONB's on-disk format for improved performance. · def4c28c
      Tom Lane authored
      The original design used an array of offsets into the variable-length
      portion of a JSONB container.  However, such an array is basically
      uncompressible by simple compression techniques such as TOAST's LZ
      compressor.  That's bad enough, but because the offset array is at the
      front, it tended to trigger the give-up-after-1KB heuristic in the TOAST
      code, so that the entire JSONB object was stored uncompressed; which was
      the root cause of bug #11109 from Larry White.
      
      To fix without losing the ability to extract a random array element in O(1)
      time, change this scheme so that most of the JEntry array elements hold
      lengths rather than offsets.  With data that's compressible at all, there
      tend to be fewer distinct element lengths, so that there is scope for
      compression of the JEntry array.  Every N'th entry is still an offset.
      To determine the length or offset of any specific element, we might have
      to examine up to N preceding JEntrys, but that's still O(1) so far as the
      total container size is concerned.  Testing shows that this cost is
      negligible compared to other costs of accessing a JSONB field, and that
      the method does largely fix the incompressible-data problem.
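
      The mixed lengths-and-offsets scheme can be sketched as follows.  This
      is a simplified model, not the actual JEntry layout: Entry, STRIDE,
      build_entries() and start_offset() are hypothetical, and the stride is
      kept tiny so the behavior is easy to follow.  Every STRIDE'th entry
      stores the end offset of its element; the others store only a length,
      so a lookup sums at most STRIDE preceding entries.

```c
#include <stddef.h>
#include <stdint.h>

#define STRIDE 4        /* hypothetical; every STRIDE'th entry is an offset */

typedef struct
{
    uint32_t value;     /* element length, or end offset if is_offset is set */
    int      is_offset;
} Entry;

/* Build the entry array from the plain list of element lengths. */
static void
build_entries(const uint32_t *lengths, size_t n, Entry *out)
{
    uint32_t end = 0;
    size_t   i;

    for (i = 0; i < n; i++)
    {
        end += lengths[i];
        out[i].is_offset = (i % STRIDE == 0);
        out[i].value = out[i].is_offset ? end : lengths[i];
    }
}

/*
 * Start offset of element 'index': walk backwards, adding lengths, and
 * stop after adding the first stored end offset.  The walk visits at
 * most STRIDE entries regardless of how large the container is.
 */
static uint32_t
start_offset(const Entry *entries, size_t index)
{
    uint32_t offset = 0;
    size_t   i;

    for (i = index; i-- > 0;)
    {
        offset += entries[i].value;
        if (entries[i].is_offset)
            break;
    }
    return offset;
}
```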
      
      While at it, rearrange the order of elements in a JSONB object so that
      it's "all the keys, then all the values" not alternating keys and values.
      This doesn't really make much difference right at the moment, but it will
      allow providing a fast path for extracting individual object fields from
      large JSONB values stored EXTERNAL (ie, uncompressed), analogously to the
      existing optimization for substring extraction from large EXTERNAL text
      values.
      
      Bump catversion to denote the incompatibility in on-disk format.
      We will need to fix pg_upgrade to disallow upgrading jsonb data stored
      with 9.4 betas 1 and 2.
      
      Heikki Linnakangas and Tom Lane
  11. 26 Sep, 2014 2 commits
    • Fix relcache for policies, and doc updates · ff27fcfa
      Stephen Frost authored
      Andres pointed out that there was an extra ';' in equalPolicies, which
      made me realize that my prior testing with CLOBBER_CACHE_ALWAYS was
      insufficient (it didn't always catch the issue, just most of the time).
      Thanks to that, a different issue was discovered, specifically in
      equalRSDescs.  This change corrects equalRSDescs to return 'true' once
      all policies have been confirmed logically identical.  After stepping
      through both functions to ensure correct behavior, I ran this for
      about 12 hours of CLOBBER_CACHE_ALWAYS runs of the regression tests
      with no failures.
      
      In addition, correct a few typos in the documentation which were pointed
      out by Thom Brown (thanks!) and improve the policy documentation further
      by adding a fleshed-out usage example based on a unix passwd file.
      
      Lastly, clean up a few comments in the regression tests and pg_dump.h.
    • Fix identify_locking_dependencies for schema-only dumps. · 07d46a89
      Robert Haas authored
      Without this fix, parallel restore of a schema-only dump can deadlock,
      because when the dump is schema-only, the dependency will still be
      pointing at the TABLE item rather than the TABLE DATA item.
      
      Robert Haas and Tom Lane