1. 21 Mar, 2014 2 commits
    • Properly check for readdir/closedir() failures · 6f03927f
      Bruce Momjian authored
      Clear errno before calling readdir() and handle old MinGW errno bug
      while adding full test coverage for readdir/closedir failures.
      
      Backpatch through 8.4.
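The pattern described above can be sketched in a few lines. This is a minimal illustration, not the committed code: `count_dir_entries` is a hypothetical helper, and the MinGW workaround mentioned in the message is not reproduced here.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the entries in a directory.
 * Returns the count, or -1 if opendir(), readdir() or closedir() fails
 * (errno is left describing the failure).
 */
long
count_dir_entries(const char *path)
{
    DIR        *dir = opendir(path);
    long        count = 0;

    if (dir == NULL)
        return -1;

    /*
     * readdir() returns NULL both at end-of-directory and on error, so
     * errno is cleared before every call; only a nonzero errno afterwards
     * means the call actually failed.
     */
    while (errno = 0, readdir(dir) != NULL)
        count++;

    if (errno != 0)
    {
        int         save_errno = errno;

        closedir(dir);          /* best effort; report the readdir() error */
        errno = save_errno;
        return -1;
    }

    /* closedir() can fail as well, so its result is checked too */
    if (closedir(dir) != 0)
        return -1;

    return count;
}
```

Without the `errno = 0` reset, a stale errno left over from some earlier call could be mistaken for a readdir() failure at the end of the loop.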
    • Replace the XLogInsert slots with regular LWLocks. · 68a2e52b
      Heikki Linnakangas authored
      The special feature the XLogInsert slots had over regular LWLocks is the
      insertingAt value that was updated atomically with releasing backends
      waiting on it. Add new functions to the LWLock API to do that, and replace
      the slots with LWLocks. This reduces the amount of duplicated code.
      (There's still some duplication, but at least it's all in lwlock.c now.)
      
      Reviewed by Andres Freund.
  2. 20 Mar, 2014 3 commits
    • Again fix initialization of auto-tuned effective_cache_size. · af930e60
      Tom Lane authored
      The previous method was overly complex and underly correct; in particular,
      by assigning the default value with PGC_S_OVERRIDE, it prevented later
      attempts to change the setting in postgresql.conf, as noted by Jeff Janes.
      We should just assign the default value with source PGC_S_DYNAMIC_DEFAULT,
      which will have the desired priority relative to the boot_val as well as
      user-set values.
      
      There is still a gap in this method: if there's an explicit assignment of
      effective_cache_size = -1 in the postgresql.conf file, and that assignment
      appears before shared_buffers is assigned, the code will substitute 4 times
      the bootstrap default for shared_buffers, and that value will then persist
      (since it will have source PGC_S_FILE).  I don't see any very nice way
      to avoid that though, and it's not a case to be expected in practice.
      The existing comments in guc-file.l look forward to a redesign of the
      DYNAMIC_DEFAULT mechanism; if that ever happens, we should consider this
      case as one of the things we'd like to improve.
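The priority rule the message relies on can be modelled roughly as follows. This is a simplified sketch, not the GUC code: the enum is a four-value stand-in for the real source list, and `Setting` and `set_option` are hypothetical names.

```c
#include <stdbool.h>

/* Simplified stand-in for a setting's source, lowest priority first. */
typedef enum
{
    SRC_DEFAULT,                /* compiled-in boot value */
    SRC_DYNAMIC_DEFAULT,        /* default computed at startup */
    SRC_FILE,                   /* value from the configuration file */
    SRC_OVERRIDE                /* forced value; outranks the others here */
} SettingSource;

typedef struct
{
    int             value;
    SettingSource   source;
} Setting;

/*
 * Apply a new value only if its source is at least as strong as the source
 * of the value currently in place.  Returns true if the value was applied.
 */
bool
set_option(Setting *s, int value, SettingSource source)
{
    if (source < s->source)
        return false;           /* weaker source: keep the existing value */
    s->value = value;
    s->source = source;
    return true;
}
```

In this model, a default installed with `SRC_OVERRIDE` blocks a later assignment with `SRC_FILE`, which is the symptom described above; installing it with `SRC_DYNAMIC_DEFAULT` lets the file setting win.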
    • libpq: pass a memory allocation failure error up to PQconndefaults() · a4c8f143
      Bruce Momjian authored
      Previously user name memory allocation failures were ignored and the
      default user name set to NULL.
    • test_shm_mq: Improve regression tests. · d1bdab2f
      Robert Haas authored
      Per discussion with Tom Lane.
  3. 19 Mar, 2014 3 commits
    • Set up error context callback for transaction lock waits · f88d4cfc
      Alvaro Herrera authored
      With this in place, a session blocking behind another one because of
      tuple locks will get a context line mentioning the relation name, tuple
      TID, and operation being done on tuple.  For example:
      
      LOG:  process 11367 still waiting for ShareLock on transaction 717 after 1000.108 ms
      DETAIL:  Process holding the lock: 11366. Wait queue: 11367.
      CONTEXT:  while updating tuple (0,2) in relation "foo"
      STATEMENT:  UPDATE foo SET value = 3;
      
      Most usefully, the new line is displayed by log entries due to
      log_lock_waits, although of course it will be printed by any other log
      message as well.
      
      Author: Christian Kruse, some tweaks by Álvaro Herrera
      Reviewed-by: Amit Kapila, Andres Freund, Tom Lane, Robert Haas
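The general shape of a context callback stack can be sketched as below. This is a self-contained toy model, not the backend's error-reporting code: `ContextCallback`, `LockWaitInfo`, `format_context` and `wait_for_lock` are all hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* One entry in a stack of callbacks that each contribute a context line. */
typedef struct ContextCallback
{
    struct ContextCallback *previous;
    void        (*callback) (void *arg, char *buf, size_t buflen);
    void       *arg;
} ContextCallback;

ContextCallback *context_stack = NULL;

/* What the waiter knows about the tuple it is blocked on. */
typedef struct
{
    const char *relname;
    unsigned    block;
    unsigned    offset;
    const char *operation;
} LockWaitInfo;

void
lock_wait_context(void *arg, char *buf, size_t buflen)
{
    const LockWaitInfo *info = arg;

    snprintf(buf, buflen, "while %s tuple (%u,%u) in relation \"%s\"",
             info->operation, info->block, info->offset, info->relname);
}

/* Build the context text by walking the stack, innermost entry first. */
void
format_context(char *buf, size_t buflen)
{
    size_t      used = 0;
    ContextCallback *cb;

    buf[0] = '\0';
    for (cb = context_stack; cb != NULL; cb = cb->previous)
    {
        if (used > 0 && used + 1 < buflen)
        {
            buf[used++] = '\n';
            buf[used] = '\0';
        }
        cb->callback(cb->arg, buf + used, buflen - used);
        used = strlen(buf);
    }
}

/* Push the callback for the duration of the wait, then pop it again. */
void
wait_for_lock(const LockWaitInfo *info, char *logbuf, size_t buflen)
{
    ContextCallback cb;

    cb.callback = lock_wait_context;
    cb.arg = (void *) info;
    cb.previous = context_stack;
    context_stack = &cb;

    /* any message produced while waiting picks up the context line */
    format_context(logbuf, buflen);

    context_stack = cb.previous;
}
```

Because the callback is only on the stack while the wait is in progress, messages logged outside the wait do not carry the extra line.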
    • Fix memory leak during regular expression execution. · ea8c7e90
      Tom Lane authored
      For a regex containing backrefs, pg_regexec() might fail to free all the
      sub-DFAs that were created during execution, resulting in a permanent
      (session lifespan) memory leak.  Problem was introduced by me in commit
      58735947.  Per report from Sandro Santilli; diagnosis by Greg Stark.
    • Some minor improvements to logical decoding documentation. · fb1d92a9
      Fujii Masao authored
      Also improve help message in pg_recvlogical.
  4. 18 Mar, 2014 16 commits
    • Fix compilation of pg_xlogdump, now that rm_safe_restartpoint is no more. · 033dc1c9
      Heikki Linnakangas authored
      Oops. Pointed out by Andres Freund.
    • Remove rm_safe_restartpoint machinery. · 59a5ab3f
      Heikki Linnakangas authored
      It is no longer used; none of the resource managers have multi-record
      actions that would make it unsafe to perform a restartpoint.
      
      Also don't allow rm_cleanup to write WAL records; that is no longer
      required either. Move the call to rm_cleanup routines to make it more
      symmetric with rm_startup.
    • Fix misc typos in comments. · 1d3b258c
      Heikki Linnakangas authored
    • Logical decoding documentation corrections. · 3ee4fcfc
      Robert Haas authored
      Thom Brown
    • Fix uninitialized variable. · a3b30d4c
      Robert Haas authored
      Report from Andres Freund, but not his fix.
    • Make the handling of interrupted B-tree page splits more robust. · 40dae7ec
      Heikki Linnakangas authored
      Splitting a page consists of two separate steps: splitting the child page,
      and inserting the downlink for the new right page to the parent. Previously,
      we handled the case that you crash in between those steps with a cleanup
      routine after the WAL recovery had finished, which finished the incomplete
      split. However, that doesn't help if the page split is interrupted but the
      database doesn't crash, so that you don't perform WAL recovery. That could
      happen for example if you run out of disk space.
      
      Remove the end-of-recovery cleanup step. Instead, when a page is split, the
      left page is marked with a new INCOMPLETE_SPLIT flag, and when the downlink
      is inserted to the parent, the flag is cleared again. If an insertion sees
      a page with the flag set, it knows that the split was interrupted for some
      reason, and inserts the missing downlink before proceeding.
      
      I used the same approach to fix GIN and GiST split algorithms earlier. This
      was the last WAL cleanup routine, so we could get rid of that whole
      machinery now, but I'll leave that for a separate patch.
      
      Reviewed by Peter Geoghegan.
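The flag protocol described above can be illustrated with a toy model. This is only a sketch of the idea, assuming a drastically simplified tree: `Page`, `Parent`, `split_child`, `insert_downlink` and `prepare_for_insert` are hypothetical names, and no locking or WAL logging is modelled.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Page
{
    bool        incomplete_split;   /* set until the parent has the downlink */
    struct Page *right;             /* right sibling */
} Page;

typedef struct
{
    int         ndownlinks;         /* number of children the parent points to */
} Parent;

/* Step 1: split the child page and flag the left half. */
void
split_child(Page *left, Page *new_right)
{
    new_right->incomplete_split = false;
    new_right->right = left->right;
    left->right = new_right;
    left->incomplete_split = true;
}

/* Step 2: insert the downlink for the new right page and clear the flag. */
void
insert_downlink(Parent *parent, Page *left)
{
    parent->ndownlinks++;
    left->incomplete_split = false;
}

/*
 * Called by an insertion before it uses a page.  If an earlier split was
 * interrupted between steps 1 and 2, finish it first.  Returns true if a
 * repair was needed.
 */
bool
prepare_for_insert(Parent *parent, Page *page)
{
    if (!page->incomplete_split)
        return false;
    insert_downlink(parent, page);
    return true;
}
```

The point of the design is that the repair happens lazily during normal operation, so it works whether the interruption was a crash or something milder such as running out of disk space.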
    • Fix some remaining int64 vestiges in contrib/test_shm_mq. · b6ec7c92
      Tom Lane authored
      Andres Freund and Tom Lane
    • test_shm_mq: Use Size rather than uint64. · c676ac0f
      Robert Haas authored
      Commit 3bd261ca updated the API but neglected to make the corresponding
      edits here.
      
      Per Tom Lane and the buildfarm.
    • Documentation for logical decoding. · 49c0864d
      Robert Haas authored
      Craig Ringer, Andres Freund, Christian Kruse, with edits by me.
    • Add pg_recvlogical, a tool to receive logical decoding data. · 8bdd12bb
      Robert Haas authored
      This is fairly basic at the moment, but it's at least useful for
      testing and debugging, and possibly more.
      
      Andres Freund
    • Rewrite comment for shm_mq_receive_bytes. · 250f8a7b
      Robert Haas authored
      The comment and the code diverged at some point before the initial
      commit of this feature, and I failed to notice.
      
      Noted by Tom Lane.
    • Fix relcache reference leak in refresh_by_match_merge(). · f7271c44
      Tom Lane authored
      One path through the loop over indexes forgot to do index_close().  Rather
      than adding a fourth call, restructure slightly so that there's only one.
      
      In passing, get rid of an unnecessary syscache lookup: the pg_index struct
      for the index is already available from its relcache entry.
      
      Per report from YAMAMOTO Takashi, though this is a bit different from his
      suggested patch.  This is new code in HEAD, so no need for back-patch.
    • Improve shm_mq portability around MAXIMUM_ALIGNOF and sizeof(Size). · 3bd261ca
      Robert Haas authored
      Revise the original decision to expose a uint64-based interface and
      use Size everywhere possible.  Avoid assuming that MAXIMUM_ALIGNOF is
      8, or making any assumption about the relationship between that value
      and sizeof(Size).  If MAXIMUM_ALIGNOF is bigger, we'll now insert
      padding after the length word; if it's smaller, we are now prepared
      to read and write the length word in chunks.
      
      Per discussion with Tom Lane.
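The two cases in the message (padding after the length word, and a length word that arrives in pieces) can be sketched as below. This is an illustration under assumptions, not the shm_mq code: `ALIGN_UP`, `length_word_space`, `LengthReader` and `feed_length` are hypothetical names, and `size_t` stands in for `Size`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Round len up to the next multiple of align (align must be a power of two). */
#define ALIGN_UP(len, align) \
    (((size_t) (len) + ((size_t) (align) - 1)) & ~((size_t) (align) - 1))

/*
 * Space taken by the length word when data must start on an alignment
 * boundary: if align is larger than sizeof(size_t), the rest is padding.
 */
size_t
length_word_space(size_t align)
{
    return ALIGN_UP(sizeof(size_t), align);
}

/* Accumulates a length word that may be delivered in several chunks. */
typedef struct
{
    size_t          have;                   /* bytes collected so far */
    unsigned char   buf[sizeof(size_t)];
} LengthReader;

/*
 * Feed the next chunk.  Returns true, and stores the length in *out, once
 * all sizeof(size_t) bytes have been collected.
 */
bool
feed_length(LengthReader *r, const void *chunk, size_t chunk_len, size_t *out)
{
    size_t      need = sizeof(size_t) - r->have;
    size_t      take = chunk_len < need ? chunk_len : need;

    memcpy(r->buf + r->have, chunk, take);
    r->have += take;
    if (r->have < sizeof(size_t))
        return false;
    memcpy(out, r->buf, sizeof(size_t));
    return true;
}
```

Copying through a byte buffer avoids assuming that a partially received length word is itself suitably aligned.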
    • Fix pg_dumpall option parsing: -i doesn't take an argument. · 19f2d6cd
      Tom Lane authored
      This used to work properly, but got fat-fingered in commit
      3dee636e.  Per bug #9620 from
      Nicolas Payart.
    • Fix help message and documentation in pg_receivexlog. · e726e59d
      Fujii Masao authored
      Add SLOTNAME placeholder to --slot option in help message and
      documentation.
    • Make it easy to detach completely from shared memory. · 79a4d24f
      Robert Haas authored
      The new function dsm_detach_all() can be used either by postmaster
      children that don't wish to take any risk of accidentally corrupting
      shared memory; or by forked children of regular backends with
      the same need.  This patch also updates the postmaster children that
      already do PGSharedMemoryDetach() to do dsm_detach_all() as well.
      
      Per discussion with Tom Lane.
  5. 17 Mar, 2014 11 commits
  6. 16 Mar, 2014 1 commit
    • Cleanups from the remove-native-krb5 patch · 0294023a
      Magnus Hagander authored
      krb_srvname is actually not available anymore as a parameter server-side, since
      with gssapi we accept all principals in our keytab. It's still used in libpq for
      client side specification.
      
      In passing remove declaration of krb_server_hostname, where all the functionality
      was already removed.
      
      Noted by Stephen Frost, though a different solution than his suggestion
  7. 15 Mar, 2014 2 commits
  8. 14 Mar, 2014 2 commits
    • Fix race condition in B-tree page deletion. · efada2b8
      Heikki Linnakangas authored
      In short, we don't allow a page to be deleted if it's the rightmost child
      of its parent, but that situation can change after we check for it.
      
      Problem
      -------
      
      We check that the page to be deleted is not the rightmost child of its
      parent, and then lock its left sibling, the page itself, its right sibling,
      and the parent, in that order. However, if the parent page is split after
      the check but before acquiring the locks, the target page might become the
      rightmost child, if the split happens at the right place. That leads to an
      error in vacuum (I reproduced this by setting a breakpoint in debugger):
      
      ERROR:  failed to delete rightmost child 41 of block 3 in index "foo_pkey"
      
      We currently re-check that the page is still not the rightmost child, and throw
      the above error if it is. We could easily just give up rather than throw
      an error, but that approach doesn't scale to half-dead pages. To recap,
      although we don't normally allow deleting the rightmost child, if the page
      is the *only* child of its parent, we delete the child page and mark the
      parent page as half-dead in one atomic operation. But before we do that, we
      check that the parent can later be deleted, by checking that it in turn is
      not the rightmost child of the grandparent (potentially recursing all the
      way up to the root). But the same situation can arise there - the
      grandparent can be split while we're not holding the locks. We end up with
      a half-dead page that we cannot delete.
      
      To make things worse, the keyspace of the deleted page has already been
      transferred to its right sibling. As the README points out, the keyspace at
      the grandparent level is "out-of-whack" until the half-dead page is deleted,
      and if enough tuples with keys in the transferred keyspace are inserted, the
      page might get split and a downlink might be inserted into the grandparent
      that is out-of-order. That might not cause any serious problem if it's
      transient (as the README ponders), but is surely bad if it stays that way.
      
      Solution
      --------
      
      This patch changes the page deletion algorithm to avoid that problem. After
      checking that the topmost page in the chain of to-be-deleted pages is not
      the rightmost child of its parent, and then deleting the pages from bottom
      up, unlink the pages from top to bottom. This way, the intermediate stages
      are similar to the intermediate stages in page splitting, and there is no
      transient stage where the keyspace is "out-of-whack". The topmost page in
      the to-be-deleted chain doesn't have a downlink pointing to it, like a page
      split before the downlink has been inserted.
      
      This also allows us to get rid of the cleanup step after WAL recovery, if we
      crash during page deletion. The deletion will be continued at next VACUUM,
      but the tree is consistent for searches and insertions at every step.
      
      This bug is old, all supported versions are affected, but this patch is too
      big to back-patch (and changes the WAL record formats of related records).
      We have not heard any reports of the bug from users, so clearly it's not
      easy to bump into. Maybe backpatch later, after this has had some field
      testing.
      
      Reviewed by Kevin Grittner and Peter Geoghegan.
    • Prevent interrupts while reporting non-ERROR elog messages. · 6c461cb9
      Tom Lane authored
      This should eliminate the risk of recursive entry to syslog(3), which
      appears to be the cause of the hang reported in bug #9551 from James
      Morton.
      
      Arguably, the real problem here is auth.c's willingness to turn on
      ImmediateInterruptOK while executing fairly wide swaths of backend code.
      We may well need to work at narrowing the code ranges in which the
      authentication_timeout interrupt is enabled.  For the moment, though,
      this is a cheap and reasonably noninvasive fix for a field-reported
      failure; the other approach would be complex and not necessarily
      bug-free itself.
      
      Back-patch to all supported branches.