1. 09 Jun, 2014 2 commits
    • Fix infinite loop when splitting inner tuples in SPGiST text indexes. · c170655c
      Tom Lane authored
      Previously, the code used a node label of zero both for strings that
      contain no bytes beyond the inner tuple's prefix, and for cases where an
      "allTheSame" inner tuple has to be split to allow a string with a different
      next byte to be inserted into it.  Failing to distinguish these cases meant
      that if a string ending with the current prefix needed to be inserted into
      an allTheSame tuple, we got into an infinite loop, because after splitting
      the tuple we'd descend into the child allTheSame tuple and then find we
      need to split again.
      
      To fix, instead use -1 and -2 as the node labels for these two cases.
      This requires widening the node label type from "char" to int2, but
      fortunately SPGiST stores all pass-by-value node label types in their
      Datum representation, which means that this change is transparently upward
      compatible so far as the on-disk representation goes.  We continue to
      recognize zero as a dummy node label for reading purposes, but will not
      attempt to push new index entries down into such a label, so that the loop
      won't occur even when dealing with an existing index.
      
      Per report from Teodor Sigaev.  Back-patch to 9.2 where the faulty
      code was introduced.
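
The labelling change described above can be sketched roughly as follows. This is a small Python illustration, not the SP-GiST C code: the constant and function names are hypothetical, and only the values -1 and -2 come from the commit message.

```python
# Hypothetical names; only the values -1 and -2 come from the commit message.
LABEL_END_OF_STRING = -1   # string has no bytes beyond the inner tuple's prefix
LABEL_SPLIT_DUMMY = -2     # placeholder used when splitting an allTheSame tuple


def node_label(value: bytes, prefix_len: int) -> int:
    """Return the label for the byte following the prefix, or -1 if none."""
    if prefix_len < len(value):
        return value[prefix_len]      # a real byte, always in 0..255
    return LABEL_END_OF_STRING


# Negative labels cannot collide with a real byte value, and the two
# special cases no longer share a label with each other.
print(node_label(b"abc", 2))   # label for the byte b"c"
print(node_label(b"ab", 2))    # end-of-string label
```

Because real bytes map to non-negative labels, a wider signed type leaves room for both special values without ambiguity.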
    • Wrap multixact/members correctly during extension, take 2 · b0b263ba
      Alvaro Herrera authored
      In a50d9762 I already changed this, but got it wrong for the case
      where the number of members is larger than the number of entries that
      fit in the last page of the last segment.
      
      As reported by Serge Negodyuck in a followup to bug #8673.
  2. 05 Jun, 2014 7 commits
    • Fix off-by-one in decoding causing one-record events to be skipped. · fe7337f2
      Andres Freund authored
      A ReorderBufferTransaction's end_lsn, the sentPtr advertised by
      walsender keepalive messages, and the end location remembered by the
      decoding get_*changes* SQL functions all use the location of the last
      read record + 1. I.e. the LSN points to the beginning of the next
      record. That cannot realistically be changed without changing the
      replication protocol because that's how keepalive messages have worked
      since 9.0.
      The bug is that the logic inside the snapshot builder, which decides
      whether a transaction's contents should be decoded, assumed the start
      location would point towards the last byte of the last record. The
      reason this didn't actually cause visible problems is that currently
      that decision is only made for commit records. Since interesting
      transactions always have at least one additional record - containing
      actual data - we'd never skip a transaction.
      But if there ever were transactions, or other events, with just one
      record containing important information, we'd skip them after stopping
      and restarting logical decoding.
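
A minimal sketch of the comparison at issue, assuming a toy model in which each record is identified by its start position and the restart point is the position just past the last record already consumed. The function names are hypothetical and this is not the snapshot builder's actual code.

```python
def needs_skip_buggy(record_start: int, restart_point: int) -> bool:
    # Treats restart_point as if it were the last byte of the previous record,
    # so a record beginning exactly at restart_point is wrongly skipped.
    return record_start <= restart_point


def needs_skip_fixed(record_start: int, restart_point: int) -> bool:
    # restart_point is the start of the next record to decode, so only
    # records that begin strictly before it have already been handled.
    return record_start < restart_point


# Toy WAL: three records starting at these byte positions.
record_starts = [100, 140, 180]
restart_point = 140   # "last read record + 1": the start of the second record

decoded_buggy = [s for s in record_starts if not needs_skip_buggy(s, restart_point)]
decoded_fixed = [s for s in record_starts if not needs_skip_fixed(s, restart_point)]
print(decoded_buggy)
print(decoded_fixed)
```

In this model the record at position 140 is lost by the first variant, which only matters when that single record carries the whole event.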
    • Add defenses against running with a wrong selection of LOBLKSIZE. · 5f93c378
      Tom Lane authored
      It's critical that the backend's idea of LOBLKSIZE match the way data has
      actually been divided up in pg_largeobject.  While we don't provide any
      direct way to adjust that value, doing so is a one-line source code change
      and various people have expressed interest recently in changing it.  So,
      just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
      pg_control and cross-check that the backend's compiled-in setting matches
      the on-disk data.
      
      Also tweak the code in inv_api.c so that fetches from pg_largeobject
      explicitly verify that the length of the data field is not more than
      LOBLKSIZE.  Formerly we just had Asserts() for that, which is no protection
      at all in production builds.  In some of the call sites an overlength data
      value would translate directly to a security-relevant stack clobber, so it
      seems worth one extra runtime comparison to be sure.
      
      In the back branches, we can't change the contents of pg_control; but we
      can still make the extra checks in inv_api.c, which will offer some amount
      of protection against running with the wrong value of LOBLKSIZE.
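
The difference between an assertion and a runtime check can be illustrated in Python, where `assert` is likewise dropped in optimized runs. This is only an analogy for the inv_api.c change: the function names are hypothetical, and the constant assumes the default 8 kB block size.

```python
LOBLKSIZE = 2048  # assumed for illustration: a quarter of a default 8 kB block


def copy_chunk_unsafe(buf: bytearray, data: bytes) -> None:
    # An assert documents the expectation but is stripped under `python -O`,
    # much as C Asserts vanish from production builds.
    assert len(data) <= LOBLKSIZE
    buf[: len(data)] = data


def copy_chunk_checked(buf: bytearray, data: bytes) -> None:
    # An explicit comparison survives in every build.
    if len(data) > LOBLKSIZE:
        raise ValueError(f"chunk has invalid size {len(data)}")
    buf[: len(data)] = data
```

The checked variant costs one comparison per fetch and turns a potential overrun into a reportable error.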
    • Consistently spell a replication slot's name as slot_name. · f0c10856
      Andres Freund authored
      Previously there's been a mix between 'slotname' and 'slot_name'. It's
      not nice to be unnecessarily inconsistent in a new feature. As a
      post-beta1 initdb is now required in the wake of eeca4cd3, fix the
      inconsistencies.
      Most of the changes won't affect usage of replication slots because the
      majority of changes are around function parameter names. The prominent
      exception to that is that the recovery.conf parameter
      'primary_slotname' is now named 'primary_slot_name'.
    • Move regression test listing of builtin leakproof functions to opr_sanity.sql. · e0cb4aa8
      Andres Freund authored
      The original location in create_function_3.sql didn't invite the close
      scrutiny warranted for adding new leakproof functions. Add comments
      to the test explaining that functions should only be added after
      careful consideration and understanding what a leakproof function is.
      
      Per complaint from Tom Lane after 5eebb8d9.
    • Adjust SP-GiST WAL record formats to reduce alignment padding. · 8776faa8
      Heikki Linnakangas authored
      The way the code was written, the padding was copied from uninitialized
      memory areas. Because the structs are local variables in the code where
      the WAL records are constructed, making them larger and zeroing the padding
      bytes would not make the code very pretty, so rather than fixing this
      directly by zeroing out the padding bytes, it seems more clear to not try to
      align the tuples in the WAL records. The redo functions are taught to copy
      the tuple header to a local variable to avoid unaligned access.
      
      Stable branches have the same problem, but we can't change the WAL format
      there, so fix in master only. Reading a few random extra bytes at the stack
      is harmless in practice, so it's not worth crafting a different
      back-patchable fix.
      
      Per reports from Kevin Grittner and Andres Freund, using clang static
      analyzer and Valgrind, respectively.
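
Alignment padding, and the alternative of storing fields unaligned and copying them out on read, can be seen with Python's `struct` module. The two-field layout below is a hypothetical record header, not an actual SP-GiST WAL struct.

```python
import struct

# One 1-byte field followed by one 4-byte field (a hypothetical record header).
aligned_size = struct.calcsize("@BI")   # native alignment: padding may be inserted
packed_size = struct.calcsize("=BI")    # standard sizes, no alignment padding

print(aligned_size, packed_size)

# Reading a packed header back: copy the bytes out into local values
# instead of addressing the fields in place at unaligned offsets.
raw = struct.pack("=BI", 7, 123456)
flag, length = struct.unpack("=BI", raw)
print(flag, length)
```

With no padding bytes in the serialized form, there is nothing uninitialized to leak into the record.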
    • Tweak new regression test case for better portability. · d4d48a5e
      Tom Lane authored
      Buildfarm says we get different plans on 32-bit and 64-bit platforms,
      probably because of MAXALIGN-related differences in memory-consumption
      calculations.  Add some dummy WHERE clauses so that the planner estimates
      different sizes for the three generate_series() relations; that should
      stabilize the choice of join order.
    • Add btree and hash opclasses for pg_lsn. · 4c8ab1b9
      Tom Lane authored
      This is needed to allow ORDER BY, DISTINCT, etc to work as expected for
      pg_lsn values.
      
      We had previously decided to put this off for 9.5, but in view of commit
      eeca4cd3 there's no reason to avoid a
      catversion bump for 9.4beta2, and this does make a pretty significant
      usability difference for pg_lsn.
      
      Michael Paquier, with fixes from Andres Freund and Tom Lane
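
To illustrate the ordering that users expect from ORDER BY on LSN values, here is a small Python sketch that compares positions numerically rather than as text. It models only the ordering semantics, not the opclass implementation; `parse_lsn` is a hypothetical helper.

```python
def parse_lsn(text: str) -> int:
    """Convert the 'X/Y' hexadecimal form into a single 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


lsns = ["10/0", "A/0", "0/16B3748"]

by_text = sorted(lsns)                     # plain string order
by_position = sorted(lsns, key=parse_lsn)  # order by WAL position
print(by_text)
print(by_position)
```

The two orders differ for "10/0" and "A/0", which is why sorting the textual form is not a substitute for a proper comparison on the type.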
  3. 04 Jun, 2014 5 commits
    • Bump PG_CONTROL_VERSION for previous 9.4 changes. · eeca4cd3
      Tom Lane authored
      This should have been done in 6bc8ef0b
      and/or 50e54709, but better late than
      never.  If we don't change this then we risk 9.3 pg_controldata or
      pg_resetxlog being inappropriately used against a 9.4 pg_control file,
      or vice versa.
    • Fix longstanding bug in HeapTupleSatisfiesVacuum(). · 621a99a6
      Andres Freund authored
      HeapTupleSatisfiesVacuum() didn't properly discern between
      DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
      inserted in the current transaction and deleted in an aborted
      subtransaction of the current backend. At the very least that caused
      problems for CLUSTER and CREATE INDEX in transactions that had
      aborting subtransactions producing rows, leading to warnings like:
      WARNING:  concurrent delete in progress within table "..."
      possibly in an endless, uninterruptible, loop.
      
      Instead of treating *InProgress xmins the same as *IsCurrent ones,
      treat them as being distinct like the other visibility routines. As
      implemented, this separation can cause a behaviour change for rows
      that have been inserted and deleted in another, still running,
      transaction. HTSV will now return INSERT_IN_PROGRESS instead of
      DELETE_IN_PROGRESS for those. That's both more in line with the other
      visibility routines and arguably more correct, the latter because an
      INSERT_IN_PROGRESS will make callers look at/wait for xmin instead of
      xmax.
      The only current caller where that's possibly worse than the old
      behaviour is heap_prune_chain() which now won't mark the page as
      prunable if a row has concurrently been inserted and deleted. That's
      harmless enough.
      
      As a cautionary measure, also insert an interrupt check before the gotos
      in IndexBuildHeapScan() that lead to the uninterruptible loop. There
      are other possible causes of repeated loops, like a row that several
      sessions try to update and all fail, and the cost of the check in
      the retry case is low.
      
      As this bug goes back all the way to the introduction of
      subtransactions in 573a71a5, backpatch to all supported releases.
      
      Reported-By: Sandro Santilli
    • Add description of pg_stat directory into doc. · c8c9c1f5
      Fujii Masao authored
      Back-patch to 9.3 where pg_stat directory was introduced.
    • Save pg_stat_statements statistics file into $PGDATA/pg_stat directory at shutdown. · 654e8e44
      Fujii Masao authored
      187492b6 changed pgstat.c so that
      the stats files were saved into $PGDATA/pg_stat directory when the server
      was shut down, but it neglected to change the location of the
      pg_stat_statements permanent stats file. This commit fixes pg_stat_statements
      so that its stats file is also saved into $PGDATA/pg_stat at shutdown.
      
      Since this fix changes the file layout, we don't back-patch it to 9.3
      where this oversight was introduced.
    • Silence Bison deprecation warnings · 55fb759a
      Peter Eisentraut authored
      Bison >=3.0 issues warnings about
      
          %name-prefix="base_yy"
      
      instead of the now preferred
      
          %name-prefix "base_yy"
      
      but the latter doesn't work with Bison 2.3 or less.  So for now we
      silence the deprecation warnings.
  4. 03 Jun, 2014 6 commits
    • Use EncodeDateTime instead of to_char to render JSON timestamps. · ab14a73a
      Andrew Dunstan authored
      Per gripe from Peter Eisentraut and Tom Lane.
      
      The output is slightly different, but still ISO 8601 compliant: to_char
      doesn't output the minutes when time zone offset is an integer number of
      hours, while EncodeDateTime outputs ":00".
      
      The code is slightly adapted from code in xml.c
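
The two offset styles mentioned above can be shown with Python's `datetime`, whose `isoformat()` also writes the minutes part of the offset. The strings below illustrate the formats only; they are not PostgreSQL output.

```python
from datetime import datetime, timedelta, timezone

ts = datetime(2014, 6, 3, 12, 0, 0, tzinfo=timezone(timedelta(hours=2)))

with_minutes = ts.isoformat()     # offset written with its minutes part
hours_only = with_minutes[:-3]    # same instant, offset written as hours only

print(with_minutes)
print(hours_only)
```

Both spellings denote the same instant; consumers that parse ISO 8601 strictly tend to be happier with the longer form.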
    • Do not escape a unicode sequence when escaping JSON text. · 0ad1a816
      Andrew Dunstan authored
      Previously, any backslash in text being escaped for JSON was doubled so
      that the result was still valid JSON. However, this led to some perverse
      results in the case of Unicode sequences. These are now detected and the
      initial backslash is no longer escaped. All other backslashes are
      still escaped. No validity check is performed; all that is looked for is
      \uXXXX, where each X is a hexadecimal digit.
      
      This is a change from the 9.2 and 9.3 behaviour as noted in the Release
      notes.
      
      Per complaint from Teodor Sigaev.
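
The escaping rule described above can be approximated with a regular expression. This Python sketch handles backslashes only (a full JSON escaper must also deal with quotes and control characters), and `escape_backslashes` is a hypothetical name rather than the server's function.

```python
import re

# A backslash that does NOT start \uXXXX (u followed by four hex digits).
_LONE_BACKSLASH = re.compile(r"\\(?!u[0-9A-Fa-f]{4})")


def escape_backslashes(text: str) -> str:
    """Double every backslash except one that begins a \\uXXXX sequence."""
    return _LONE_BACKSLASH.sub(lambda m: "\\\\", text)


print(escape_backslashes(r"C:\temp"))     # ordinary backslash: doubled
print(escape_backslashes(r"caf\u00e9"))   # Unicode sequence: left alone
print(escape_backslashes(r"\u12"))        # too few hex digits: doubled
```

As in the commit message, no validity check is attempted beyond matching the four hexadecimal digits.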
    • Output timestamps in ISO 8601 format when rendering JSON. · f30015b6
      Andrew Dunstan authored
      Many JSON processors require timestamp strings in ISO 8601 format in
      order to convert the strings. When converting a timestamp, with or
      without timezone, to a JSON datum we therefore now use such a format
      rather than the type's default text output, in functions such as
      to_json().
      
      This is a change in behaviour from 9.2 and 9.3, as noted in the release
      notes.
    • Make plpython_unicode regression test work in more database encodings. · 2dfa15de
      Tom Lane authored
      This test previously used a data value containing U+0080, and would
      therefore fail if the database encoding didn't have an equivalent to
      that; which only about half of our supported server encodings do.
      We could fall back to using some plain-ASCII character, but that seems
      like it's losing most of the point of the test.  Instead switch to using
      U+00A0 (no-break space), which translates into all our supported encodings
      except the four in the EUC_xx family.
      
      Per buildfarm testing.  Back-patch to 9.1, which is as far back as this
      test is expected to succeed everywhere.  (9.0 has the test, but without
      back-patching some 9.1 code changes we could not expect to get consistent
      results across platforms anyway.)
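
Whether a character has an equivalent in a given encoding can be probed from Python. Note the assumption: this uses Python's own codecs, which are not PostgreSQL's conversion tables, so it only illustrates why U+00A0 is a more portable test value than U+0080.

```python
def encodable(ch: str, encoding: str) -> bool:
    """Report whether the character can be represented in the encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for enc in ("latin-1", "cp1252", "koi8_r"):
    print(enc, encodable("\u0080", enc), encodable("\u00a0", enc))
```

In these three codecs the no-break space is always representable, while U+0080 is representable only in the first.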
    • Set the process latch when processing recovery conflict interrupts. · 44445b28
      Andres Freund authored
      Because RecoveryConflictInterrupt() didn't set the process latch,
      anything using the latter to wait for events didn't get notified about
      recovery conflicts. Most latch users are never the target of recovery
      conflicts, which explains the lack of reports about this until
      now.
      Since 9.3, though, two possibly affected users exist: the SQL-callable
      pg_sleep() now uses latches to wait, and background workers are
      expected to use latches in their main loop. Both would currently wait
      until the end of WaitLatch's timeout.
      
      Fix by adding a SetLatch() to RecoveryConflictInterrupt(). It'd also
      be possible to fix the issue by having each latch user set
      set_latch_on_sigusr1. That seems failure-prone though, as most of
      these callsites won't often receive recovery conflicts and thus will
      likely only be tested against normal query cancels et al. It'd also be
      unnecessarily verbose.
      
      Backpatch to 9.1 where latches were introduced. Arguably 9.3 would be
      sufficient, because that's where pg_sleep() was converted to waiting
      on the latch and background workers got introduced; but there could be
      user level code making use of the latch pre 9.3.
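
The effect of the missing SetLatch() can be modelled with a `threading.Event` standing in for the process latch. This is an analogy under that assumption, with hypothetical function names; it is not how the backend's signal handling is written.

```python
import threading

latch = threading.Event()          # stands in for the process latch
conflict_pending = False


def conflict_interrupt_buggy() -> None:
    global conflict_pending
    conflict_pending = True        # flag recorded, but the sleeper is not woken


def conflict_interrupt_fixed() -> None:
    global conflict_pending
    conflict_pending = True
    latch.set()                    # wake anything waiting on the latch


conflict_interrupt_buggy()
woken_without_set = latch.wait(timeout=0.05)   # waits out the whole timeout
print(woken_without_set)

conflict_interrupt_fixed()
woken_with_set = latch.wait(timeout=5)         # returns immediately
print(woken_with_set)
```

Setting the latch in the one interrupt routine covers every waiter, which is the argument the commit makes against fixing each call site separately.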
    • Use unaligned output in another regression test query to reduce diff noise. · 5eebb8d9
      Andres Freund authored
      Use the unaligned/no rowcount output mode in a regression test that
      shows all built-in leakproof functions. Currently a new leakproof
      function will often change the alignment of all existing functions,
      making it hard to see the actual difference and creating unnecessary
      patch conflicts.
      
      Noticed while looking over a patch introducing new leakproof functions.
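
The diff-noise argument can be reproduced in a few lines of Python. The row formatting below is a simplified imitation of aligned and unaligned table output, and the function names in the sample data are made up.

```python
import difflib


def aligned(rows):
    # Pad the first column to the widest name, as aligned output does.
    width = max(len(name) for name, _ in rows)
    return [f"{name.ljust(width)} | {rtype}" for name, rtype in rows]


def unaligned(rows):
    return [f"{name}|{rtype}" for name, rtype in rows]


def changed_lines(before, after):
    diff = difflib.unified_diff(before, after, lineterm="", n=0)
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


old = [("abs", "int4"), ("int4eq", "bool")]
new = old + [("a_much_longer_function_name", "bool")]

aligned_changes = len(changed_lines(aligned(old), aligned(new)))
unaligned_changes = len(changed_lines(unaligned(old), unaligned(new)))
print(aligned_changes, unaligned_changes)
```

Adding one wide row rewrites every aligned line, whereas the unaligned form changes only where the new row appears.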
  5. 02 Jun, 2014 1 commit
  6. 01 Jun, 2014 1 commit
    • Improve the efficiency of certain jsonb get operations. · 1a4174a4
      Andrew Dunstan authored
      Instead of iterating over jsonb structures, use the inbuilt functions
      findJsonbValueFromContainerLen() and getIthJsonbValueFromContainer() to
      extract values directly. These functions use algorithms that are O(log
      n) and O(1) respectively, whereas iterating is O(n), so we should see
      considerable speedup here.
      
      Teodor Sigaev.
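
For comparison, here is a rough Python sketch of the three access patterns. It assumes keys are kept in sorted order so that binary search applies; whether the jsonb functions work exactly this way is an assumption, and all names are hypothetical.

```python
from bisect import bisect_left


def find_by_iteration(pairs, key):
    # O(n): walk every entry until the key turns up.
    for k, v in pairs:
        if k == key:
            return v
    return None


def find_by_binary_search(sorted_keys, values, key):
    # O(log n): relies on the keys being stored in sorted order.
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return values[i]
    return None


def get_ith(elements, i):
    # O(1): direct indexing into an array container.
    return elements[i] if 0 <= i < len(elements) else None
```

The gain from the keyed lookup grows with container size, while the indexed lookup is constant regardless of size.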
  7. 31 May, 2014 1 commit
    • Improvements to the replication protocol documentation. · a5750982
      Andres Freund authored
      Document the CREATE_REPLICATION_SLOT's output_plugin parameter; that
      START_REPLICATION ... LOGICAL takes parameters; that START_REPLICATION
      ... LOGICAL uses the same messages as ... PHYSICAL; and be more
      consistent with the usage of <literal/>.
      
      Michael Paquier, with some additional changes by me.
  8. 30 May, 2014 3 commits
    • On OS X, link libpython normally, ignoring the "framework" framework. · 20561acf
      Tom Lane authored
      As of Xcode 5.0, Apple isn't including the Python framework as part of the
      SDK-level files, which means that linking to it might fail depending on
      whether Xcode thinks you've selected a specific SDK version.  According to
      their Tech Note 2328, they've basically deprecated the framework method of
      linking to libpython and are telling people to link to the shared library
      normally.  (I'm pretty sure this is in direct contradiction to the advice
      they were giving a few years ago, but whatever.)  Testing says that this
      approach works fine at least as far back as OS X 10.4.11, so let's just
      rip out the framework special case entirely.  We do still need a special
      case to decide that OS X provides a shared library at all, unfortunately
      (I wonder why the distutils check doesn't work ...).  But this is still
      less of a special case than before, so it's fine.
      
      Back-patch to all supported branches, since we'll doubtless be hearing
      about this more as more people update to recent Xcode.
    • Fix typos in MSVC solution file. · 512f3b03
      Heikki Linnakangas authored
      Michael Paquier
    • In release notes, mention the need to initialize bgw_notify_pid. · 42be7d69
      Robert Haas authored
      Michael Paquier
  9. 29 May, 2014 2 commits
    • When using the OSSP UUID library, cache its uuid_t state object. · c941aed9
      Tom Lane authored
      The original coding in contrib/uuid-ossp created and destroyed a uuid_t
      object (or, in some cases, even two of them) each time it was called.
      This is not the intended usage: you're supposed to keep the uuid_t object
      around so that the library can cache its state across uses.  (Other UUID
      libraries seem to keep equivalent state behind-the-scenes in static
      variables, but OSSP chose differently.)  Aside from being quite inefficient,
      creating a new uuid_t loses knowledge of the previously generated UUID,
      which in theory could result in duplicate V1-style UUIDs being created
      on sufficiently fast machines.
      
      On at least some platforms, creating a new uuid_t also draws some entropy
      from /dev/urandom, leaving less for the rest of the system.  This seems
      sufficiently unpleasant to justify back-patching this change.
    • Fix uuid-ossp regression tests based on buildfarm feedback. · 25dd07e0
      Tom Lane authored
      The previous version of these tests expected uuid_generate_v1() to always
      emit MAC addresses with the local-admin and multicast address bits zero.
      However, several of the buildfarm critters are reporting values with the
      local-admin bit set.  (Perhaps they're running inside VMs or jails.)
      And a couple are reporting values with the multicast bit set, probably
      meaning that the UUID library couldn't read the system MAC address.
      
      Also, it emerges that if OSSP UUID can't read the system MAC address, it
      falls back to V1MC behavior wherein the whole node field gets randomized
      each time, breaking the test that expected the node field to remain stable
      in V1 output.  (It looks like e2fs doesn't behave that way, though.)
      
      It's not entirely clear why we can't get a system MAC address, since the
      buildfarm scripts would not work without internet access.  Nonetheless,
      the regression tests had better cope with the case, so adjust the tests
      to expect these behaviors.
  10. 28 May, 2014 12 commits
    • Revert "Fix bogus %name-prefix option syntax in all our Bison files." · 71ed8b3c
      Tom Lane authored
      This reverts commit 45b7abe5.
      
      It turns out that the %name-prefix syntax without "=" does not work
      at all in pre-2.4 Bison.  We are not prepared to make such a large
      jump in minimum required Bison version just to suppress a warning
      message in a version hardly any developers are using yet.
      When 3.0 gets more popular, we'll figure out a way to deal with this.
      In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
      anyone using 3.0 who doesn't want to see the warning.
    • Don't pay heed to wal_sender_timeout while creating a decoding slot. · 21d48d66
      Andres Freund authored
      Sometimes CREATE_REPLICATION_SLOT ... LOGICAL ... needs to wait for
      further WAL using WalSndWaitForWal(). That used to always respect
      wal_sender_timeout and kill the session when waiting long enough
      because no feedback/ping messages can be sent while the slot is still
      being created.
      Introduce the notion that last_reply_timestamp = 0 means that the
      walsender currently doesn't need timeout processing to avoid that
      problem. Use that notion for CREATE_REPLICATION_SLOT ... LOGICAL.
      
      Bugreport and initial patch by Steve Singer, revised by me.
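
The sentinel convention can be sketched as a small Python function. Only the meaning of `last_reply_timestamp = 0` comes from the commit message; the function name, the use of plain numbers for times, and the treatment of a non-positive timeout as "disabled" are assumptions made for the example.

```python
from typing import Optional


def timeout_deadline(last_reply_timestamp: float,
                     wal_sender_timeout: float) -> Optional[float]:
    """Return the time at which to give up, or None if no timeout applies."""
    if wal_sender_timeout <= 0:
        return None      # timeout disabled by configuration (assumed convention)
    if last_reply_timestamp == 0:
        return None      # sentinel: no timeout processing needed right now
    return last_reply_timestamp + wal_sender_timeout


print(timeout_deadline(0, 60))      # slot still being created: no deadline
print(timeout_deadline(1000, 60))   # normal streaming: deadline applies
```

Using a sentinel in the existing timestamp avoids a separate flag that every timeout check would otherwise need to consult.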
    • Minor refactoring of jsonb_util.c · d1d50bff
      Heikki Linnakangas authored
      The only caller of compareJsonbScalarValue that needed locale-sensitive
      comparison of strings was also the only caller that didn't just check for
      equality. Separate the two cases for clarity: compareJsonbScalarValue now
      does locale-sensitive comparison, and a new function,
      equalsJsonbScalarValue, just checks for equality.
    • Jsonb comparison bug fixes. · b3e5cfd5
      Heikki Linnakangas authored
      Fix an over-zealous assertion, which didn't take into account that sometimes
      a scalar element can be compared against an array/object element.
      
      Avoid comparing possibly-uninitialized local variables when end-of-array or
      end-of-object is reached. Also fix and enhance comments a bit.
      
      Peter Geoghegan, per reports by Pavel Stehule and me.
    • Fix bogus %name-prefix option syntax in all our Bison files. · 45b7abe5
      Tom Lane authored
      %name-prefix doesn't use an "=" sign according to the Bison docs, but it
      silently accepted one anyway, until Bison 3.0.  This was originally a
      typo of mine in commit 012abeba, and we
      seem to have slavishly copied the error into all the other grammar files.
      
      Per report from Vik Fearing; analysis by Peter Eisentraut.
      
      Back-patch to all active branches, since somebody might try to build
      a back branch with up-to-date tools.
    • Improve regression tests for uuid-ossp. · c0f27628
      Tom Lane authored
      On reflection, the timestamp-advances test might fail if we're unlucky
      enough for the time_mid field to change between two calls, since uuid_cmp
      is just bytewise comparison and the field ordering has more significant
      fields later.  Build some field extraction functions so we can do a more
      honest test of that.  Also check that the version and reserved fields
      contain what they should.
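
The ordering hazard is easy to reproduce with Python's `uuid` module, which exposes the individual V1 fields. The two UUIDs below are constructed by hand for illustration (the node value is arbitrary); this is not the regression test's SQL.

```python
import uuid


def v1_uuid(time_low: int, time_mid: int) -> uuid.UUID:
    # time_hi_version = 0x1000 marks version 1 with a zero time_hi part;
    # clock_seq_hi_variant = 0x80 selects the RFC 4122 variant.
    return uuid.UUID(fields=(time_low, time_mid, 0x1000, 0x80, 0x00, 0x0123456789AB))


earlier = v1_uuid(0xFFFFFFFF, 0x0000)
later = v1_uuid(0x00000000, 0x0001)    # time_mid has just ticked over

bytewise_in_order = earlier.bytes < later.bytes   # plain byte comparison
timestamp_in_order = earlier.time < later.time    # reassembled timestamp
print(bytewise_in_order, timestamp_in_order, later.version)
```

Because time_low is stored first, a carry into time_mid makes the later UUID compare lower bytewise, so an honest test has to extract and reassemble the fields.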
    • Fix stack clobber in new uuid-ossp code. · 2103218d
      Tom Lane authored
      The V5 (SHA1 hashing) code wrote 20 bytes into a 16-byte local variable.
      This had accidentally failed to fail in my testing and Matteo's, but
      buildfarm results exposed the problem.
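
The size mismatch behind this bug is visible in any V5 implementation: SHA-1 produces 20 bytes, while a UUID holds 16. A short Python sketch, using a hypothetical `uuid_v5` helper that mirrors what the standard library's `uuid.uuid5` computes:

```python
import hashlib
import uuid


def uuid_v5(namespace: uuid.UUID, name: str) -> uuid.UUID:
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    assert len(digest) == 20           # SHA-1 always yields 20 bytes
    # A UUID holds only 16 bytes, so the digest must be truncated before
    # it is copied into UUID-sized storage.
    return uuid.UUID(bytes=digest[:16], version=5)


print(uuid_v5(uuid.NAMESPACE_DNS, "example.org"))
```

In C the same truncation has to happen through a buffer large enough for the full digest, which is where a 16-byte local variable falls short.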
    • Ensure cleanup in case of early errors in streaming base backups · 8232d6df
      Magnus Hagander authored
      Move the code that sends the initial status information as well as the
      calculation of paths inside the ENSURE_ERROR_CLEANUP block. If this code
      failed, we would "leak" a counter of number of concurrent backups, thereby
      making the system always believe it was in backup mode. This could happen
      if the sending failed (which it probably never did given that the small
      amount of data to send would never cause a flush) or if the psprintf calls
      ran out of memory. Both are very low risk, but all operations after
      do_pg_start_backup should be protected.
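
The principle, that every step after the counter is incremented must sit inside the protected region, looks like this in a Python sketch. The counter and function names are hypothetical stand-ins, and the exception handler plays the role of the ENSURE_ERROR_CLEANUP block.

```python
concurrent_backups = 0   # hypothetical stand-in for the shared backup counter


def start_backup() -> None:
    global concurrent_backups
    concurrent_backups += 1


def release_backup() -> None:
    global concurrent_backups
    concurrent_backups -= 1


def stream_backup(send_status) -> None:
    start_backup()
    try:
        # The early status message is inside the protected block too,
        # so a failure here can no longer leak the counter.
        send_status()
        # ... compute paths, stream files ...
    except BaseException:
        release_backup()
        raise
    release_backup()     # normal completion also releases the counter


def failing_send() -> None:
    raise MemoryError("out of memory while formatting status")
```

Calling `stream_backup(failing_send)` propagates the error but leaves the counter at zero, whereas code that sent the status before entering the `try` would leave it stuck at one.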
    • Bruce Momjian's avatar
      c6763156
    • pg_lsn should not be marked typispreferred. · ec3357a3
      Tom Lane authored
      In general it's not a good idea for built-in types in the 'U' category
      to be marked preferred; they could draw behavior away from user-defined
      types with similarly-named operators.  pg_lsn is probably at low risk
      of that right now given the lack of casts between it and other types,
      but that doesn't make this marking OK.
      
      Ordinarily we'd bump catversion when changing any predefined catalog
      contents like this, but since we're past beta1, the costs of a forced
      initdb seem to outweigh the benefits of guaranteed behavioral consistency.
      There's not any known behavioral impact today anyway --- this is more
      in the nature of being sure there are no problems in future.
      
      Per an off-list complaint from Thomas Fanghaenel.
    • Fix obsolete config-module-exclusion logic in vcregress.pl. · 86000311
      Tom Lane authored
      The recent addition of regression tests to uuid-ossp exposed the fact
      that the MSVC build system wasn't being consistent about whether it was
      building/testing that contrib module, ie, it would try to test the module
      even when it hadn't built it.  The same hazard was latent for sslinfo.
      
      For the moment I just copied the more up-to-date logic from point A to
      point B, but this is screaming for refactoring.
      
      Per buildfarm results.
    • Propagate system identifier generation improvement into pg_resetxlog. · 4bcb3946
      Tom Lane authored
      Commit 5035701e improved xlog.c's method
      for creating a database system identifier, but I neglected to fix the
      copy of that code appearing in pg_resetxlog.c.  Spotted by Andres Freund.