  1. 18 Mar, 2014 6 commits
    • Rewrite comment for shm_mq_receive_bytes. · 250f8a7b
      Robert Haas authored
      The comment and the code diverged at some point before the initial
      commit of this feature, and I failed to notice.
      
      Noted by Tom Lane.
    • Fix relcache reference leak in refresh_by_match_merge(). · f7271c44
      Tom Lane authored
      One path through the loop over indexes forgot to do index_close().  Rather
      than adding a fourth call, restructure slightly so that there's only one.
      
      In passing, get rid of an unnecessary syscache lookup: the pg_index struct
      for the index is already available from its relcache entry.
      
      Per report from YAMAMOTO Takashi, though this is a bit different from his
      suggested patch.  This is new code in HEAD, so no need for back-patch.
    • Improve shm_mq portability around MAXIMUM_ALIGNOF and sizeof(Size). · 3bd261ca
      Robert Haas authored
      Revise the original decision to expose a uint64-based interface and
      use Size everywhere possible.  Avoid assuming that MAXIMUM_ALIGNOF is
      8, or making any assumption about the relationship between that value
      and sizeof(Size).  If MAXIMUM_ALIGNOF is bigger, we'll now insert
      padding after the length word; if it's smaller, we are now prepared
      to read and write the length word in chunks.
      
      Per discussion with Tom Lane.
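
      The alignment arithmetic described above can be sketched as follows. This
      is a minimal illustration, not the shm_mq code: the function names and the
      example sizes are assumptions chosen for the sketch, and only the
      round-up-to-alignment formula mirrors the usual power-of-two idiom.

```python
def type_align(alignment: int, length: int) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def padding_after_length_word(size_of_size: int, maximum_alignof: int) -> int:
    """Bytes of padding needed so data after the length word starts aligned."""
    return type_align(maximum_alignof, size_of_size) - size_of_size


def length_word_may_be_split(size_of_size: int, maximum_alignof: int) -> bool:
    """If alignment is finer than the length word, the word can span chunks."""
    return maximum_alignof < size_of_size


# Hypothetical platforms: (sizeof(Size), MAXIMUM_ALIGNOF)
for size_of_size, maximum_alignof in [(8, 16), (8, 8), (8, 4)]:
    print(
        size_of_size,
        maximum_alignof,
        padding_after_length_word(size_of_size, maximum_alignof),
        length_word_may_be_split(size_of_size, maximum_alignof),
    )
```

      With an alignment larger than the length word, the sketch reports padding;
      with a smaller one, it reports that the word may have to be handled in
      pieces.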
    • Fix pg_dumpall option parsing: -i doesn't take an argument. · 19f2d6cd
      Tom Lane authored
      This used to work properly, but got fat-fingered in commit
      3dee636e.  Per bug #9620 from
      Nicolas Payart.
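
      The class of mistake is easy to reproduce with any getopt-style parser. The
      sketch below uses Python's getopt module; the option strings are
      hypothetical and are not pg_dumpall's actual ones. A stray colon after a
      flag makes the parser swallow the next argument as that flag's value.

```python
import getopt

argv = ["-i", "-v"]

# Hypothetical option strings for illustration only.
# "i:" declares that -i takes an argument, so "-v" is consumed as its value.
buggy_opts, buggy_rest = getopt.getopt(argv, "i:v")

# "i" declares -i as a plain flag, so "-v" is parsed as its own option.
fixed_opts, fixed_rest = getopt.getopt(argv, "iv")

print(buggy_opts)
print(fixed_opts)
```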
    • Fix help message and document in pg_receivexlog. · e726e59d
      Fujii Masao authored
      Add SLOTNAME placeholder to --slot option in help message and
      documentation.
    • Make it easy to detach completely from shared memory. · 79a4d24f
      Robert Haas authored
      The new function dsm_detach_all() can be used either by postmaster
      children that don't wish to take any risk of accidentally corrupting
      shared memory; or by forked children of regular backends with
      the same need.  This patch also updates the postmaster children that
      already do PGSharedMemoryDetach() to do dsm_detach_all() as well.
      
      Per discussion with Tom Lane.
  2. 17 Mar, 2014 11 commits
  3. 16 Mar, 2014 1 commit
    • Cleanups from the remove-native-krb5 patch · 0294023a
      Magnus Hagander authored
      krb_srvname is actually no longer available as a server-side parameter, since
      with gssapi we accept all principals in our keytab. It's still used in libpq for
      client-side specification.
      
      In passing remove declaration of krb_server_hostname, where all the functionality
      was already removed.
      
      Noted by Stephen Frost, though this is a different solution from the one he
      suggested.
  4. 15 Mar, 2014 2 commits
  5. 14 Mar, 2014 2 commits
    • Fix race condition in B-tree page deletion. · efada2b8
      Heikki Linnakangas authored
      In short, we don't allow a page to be deleted if it's the rightmost child
      of its parent, but that situation can change after we check for it.
      
      Problem
      -------
      
      We check that the page to be deleted is not the rightmost child of its
      parent, and then lock its left sibling, the page itself, its right sibling,
      and the parent, in that order. However, if the parent page is split after
      the check but before acquiring the locks, the target page might become the
      rightmost child, if the split happens at the right place. That leads to an
      error in vacuum (I reproduced this by setting a breakpoint in a debugger):
      
      ERROR:  failed to delete rightmost child 41 of block 3 in index "foo_pkey"
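
      The check-then-lock window can be modelled with a toy parent page. This is
      only a sketch: the list-of-children representation, the helper names, and
      the block numbers other than 41 are assumptions made for illustration, and
      nothing here reflects the real nbtree data structures or locking.

```python
def is_rightmost_child(parent, page):
    """True if page is the last downlink on the (toy) parent page."""
    return parent[-1] == page


def split_parent(parent, split_at):
    """Split the toy parent; returns (left half, new right sibling)."""
    return parent[:split_at], parent[split_at:]


parent = [40, 41, 42, 43]  # child block numbers, left to right
target = 41

# Step 1: the deletion code checks the target before taking any locks.
check_passed = not is_rightmost_child(parent, target)

# Step 2: a concurrent insertion splits the parent right after the target.
parent, new_right_sibling = split_parent(parent, 2)

# Step 3: locks are now held, but the earlier check is stale.
still_deletable = not is_rightmost_child(parent, target)

print(check_passed, still_deletable)
```

      The first check succeeds, yet by the time the locks are held the target has
      become the rightmost child of the left half.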
      
      We currently re-check that the page is still not the rightmost child, and
      throw the above error if it is. We could easily just give up rather than throw
      an error, but that approach doesn't scale to half-dead pages. To recap,
      although we don't normally allow deleting the rightmost child, if the page
      is the *only* child of its parent, we delete the child page and mark the
      parent page as half-dead in one atomic operation. But before we do that, we
      check that the parent can later be deleted, by checking that it in turn is
      not the rightmost child of the grandparent (potentially recursing all the
      way up to the root). But the same situation can arise there - the
      grandparent can be split while we're not holding the locks. We end up with
      a half-dead page that we cannot delete.
      
      To make things worse, the keyspace of the deleted page has already been
      transferred to its right sibling. As the README points out, the keyspace at
      the grandparent level is "out-of-whack" until the half-dead page is deleted,
      and if enough tuples with keys in the transferred keyspace are inserted, the
      page might get split and a downlink might be inserted into the grandparent
      that is out-of-order. That might not cause any serious problem if it's
      transient (as the README ponders), but is surely bad if it stays that way.
      
      Solution
      --------
      
      This patch changes the page deletion algorithm to avoid that problem. After
      checking that the topmost page in the chain of to-be-deleted pages is not
      the rightmost child of its parent, and then deleting the pages from bottom
      up, unlink the pages from top to bottom. This way, the intermediate stages
      are similar to the intermediate stages in page splitting, and there is no
      transient stage where the keyspace is "out-of-whack". The topmost page in
      the to-be-deleted chain doesn't have a downlink pointing to it, like a page
      split before the downlink has been inserted.
      
      This also allows us to get rid of the cleanup step after WAL recovery, if we
      crash during page deletion. The deletion will be continued at next VACUUM,
      but the tree is consistent for searches and insertions at every step.
      
      This bug is old and affects all supported versions, but this patch is too
      big to back-patch (it also changes the WAL record formats of related records).
      We have not heard any reports of the bug from users, so clearly it's not
      easy to bump into. Maybe backpatch later, after this has had some field
      testing.
      
      Reviewed by Kevin Grittner and Peter Geoghegan.
    • Prevent interrupts while reporting non-ERROR elog messages. · 6c461cb9
      Tom Lane authored
      This should eliminate the risk of recursive entry to syslog(3), which
      appears to be the cause of the hang reported in bug #9551 from James
      Morton.
      
      Arguably, the real problem here is auth.c's willingness to turn on
      ImmediateInterruptOK while executing fairly wide swaths of backend code.
      We may well need to work at narrowing the code ranges in which the
      authentication_timeout interrupt is enabled.  For the moment, though,
      this is a cheap and reasonably noninvasive fix for a field-reported
      failure; the other approach would be complex and not necessarily
      bug-free itself.
      
      Back-patch to all supported branches.
  6. 13 Mar, 2014 5 commits
    • Allow psql to print COPY command status in more cases. · f70a78bc
      Tom Lane authored
      Previously, psql would print the "COPY nnn" command status only for COPY
      commands executed server-side.  Now it will print that for frontend copies
      too (including \copy).  However, we continue to suppress the command status
      for COPY TO STDOUT, since in that case the copy data has been routed to the
      same place that the command status would go, and there is a risk of the
      status line being mistaken for another line of COPY data.  Printing it there
      would break existing scripts, and it doesn't seem worth the benefit --- this case
      seems fairly analogous to SELECT, for which we also suppress the command
      status.
      
      Kumar Rajeev Rastogi, with substantial review by Amit Khandekar
    • Avoid transaction-commit race condition while receiving a NOTIFY message. · 7bae0284
      Tom Lane authored
      Use TransactionIdIsInProgress, then TransactionIdDidCommit, to distinguish
      whether a NOTIFY message's originating transaction is in progress,
      committed, or aborted.  The previous coding could accept a message from a
      transaction that was still in-progress according to the PGPROC array;
      if the client were fast enough at starting a new transaction, it might fail
      to see table rows added/updated by the message-sending transaction, which
      of course would usually be the point of receiving the message.  We noted
      this type of race condition long ago in tqual.c, but async.c overlooked it.
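
      Why the order of the two tests matters can be shown with a toy model. The
      sets below stand in for the PGPROC array and the commit log, and the
      function names are hypothetical; the sketch assumes, as the message
      implies, that a committing transaction is recorded as committed before it
      disappears from the PGPROC array.

```python
def classify_in_progress_first(xid, proc_array, committed_log):
    """Ordering described in the commit: in-progress test, then commit test."""
    if xid in proc_array:
        return "in progress"
    if xid in committed_log:
        return "committed"
    return "aborted"


def classify_commit_first(xid, proc_array, committed_log):
    """Illustrative racy ordering: consult the commit log first."""
    if xid in committed_log:
        return "committed"
    if xid in proc_array:
        return "in progress"
    return "aborted"


# Inside the race window: already marked committed, still listed as running.
xid = 100
proc_array = {100}
committed_log = {100}

safe_answer = classify_in_progress_first(xid, proc_array, committed_log)
racy_answer = classify_commit_first(xid, proc_array, committed_log)
print(safe_answer, racy_answer)
```

      In the window, only the in-progress-first ordering holds the message back
      until the sender's changes are visible to new snapshots.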
      
      The race condition probably cannot occur unless there are multiple NOTIFY
      senders in action, since an individual backend doesn't send NOTIFY signals
      until well after it's done committing.  But if two senders commit in close
      succession, it's certainly possible that we could see the second sender's
      message within the race condition window while responding to the signal
      from the first one.
      
      Per bug #9557 from Marko Tiikkaja.  This patch is slightly more invasive
      than what he proposed, since it removes the now-redundant
      TransactionIdDidAbort call.
      
      Back-patch to 9.0, where the current NOTIFY implementation was introduced.
    • Fix a couple of typos in docs. · 16ff08b7
      Heikki Linnakangas authored
      Thom Brown
    • C comments: remove odd blank lines after #ifdef WIN32 lines · 242c2737
      Bruce Momjian authored
      A few more
  7. 12 Mar, 2014 8 commits
  8. 10 Mar, 2014 4 commits
    • Fix tracking of psql script line numbers during \copy from another place. · e85a5ffb
      Tom Lane authored
      Commit 08146775 changed do_copy() to
      temporarily scribble on pset.cur_cmd_source.  That was a mighty ugly bit of
      code in any case, but in particular it broke handleCopyIn's ability to tell
      whether it was reading from the current script source file (in which case
      pset.lineno should be incremented for each line of COPY data), or from
      someplace else (in which case it shouldn't).  The former case still worked,
      the latter not so much.  The visible effect was that line numbers reported
      for errors in a script file would be wrong if there were an earlier \copy
      that was reading anything other than inline-in-the-script-file data.
      
      To fix, introduce another pset field that holds the file do_copy wants the
      COPY code to use.  This is a little bit ugly, but less so than passing the
      file down explicitly through several layers that aren't COPY-specific.
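
      The distinction handleCopyIn has to make can be sketched as below. This is
      a toy model in Python, not psql's code: the function signature and the
      stream objects are assumptions, and the only point is that the line counter
      advances solely when the COPY data comes from the script being read.

```python
import io


def handle_copy_in(copy_stream, cur_cmd_source, lineno):
    """Consume COPY data; count lines only if reading from the script itself."""
    for line in copy_stream:
        if copy_stream is cur_cmd_source:
            lineno += 1
        if line.rstrip("\n") == "\\.":
            break  # end-of-data marker for inline COPY data
    return lineno


# Inline data: the COPY rows are part of the script, so they count as lines.
script = io.StringIO("1\n2\n\\.\n")
lineno_inline = handle_copy_in(script, script, 10)

# Data from another file: the script's line counter must not move.
other_file = io.StringIO("a\nb\n")
lineno_other = handle_copy_in(other_file, script, 10)

print(lineno_inline, lineno_other)
```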
      
      Extracted from a larger patch by Kumar Rajeev Rastogi; that patch also
      changes printing of COPY command tags, which is not a bug fix and shouldn't
      get back-patched.  This particular idea was from a suggestion by Amit
      Khandekar, if I'm reading the thread correctly.
      
      Back-patch to 9.2 where the faulty code was introduced.
    • Allow dynamic shared memory segments to be kept until shutdown. · 8722017b
      Robert Haas authored
      Amit Kapila, reviewed by Kyotaro Horiguchi, with some further
      changes by me.
    • Allow logical decoding via the walsender interface. · 5a991ef8
      Robert Haas authored
      In order for this to work, walsenders need the optional ability to
      connect to a database, so the "replication" keyword now allows true
      or false, for backward-compatibility, and the new value "database"
      (which causes the "dbname" parameter to be respected).
      
      walsender needs to loop not only when idle but also when sending
      decoded data to the user and when waiting for more xlog data to decode.
      This means that there are now three separate loops inside walsender.c;
      although some refactoring has been done here, this is still a bit ugly.
      
      Andres Freund, with contributions from Álvaro Herrera, and further
      review by me.
    • Teach on_exit_reset() to discard pending cleanups for dsm. · cb9a0c79
      Robert Haas authored
      If a postmaster child invokes fork() and then calls on_exit_reset, that
      should be sufficient to let it exit() without breaking anything, but
      dynamic shared memory broke that by not updating on_exit_reset() to
      discard callbacks registered with dynamic shared memory segments.
      
      Per investigation of a complaint from Tom Lane.
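
      The intended behaviour can be pictured with a toy callback registry. The
      class and attribute names here are hypothetical and do not correspond to
      the backend's data structures; the sketch only shows that a reset which
      skips one category of callbacks would leave the forked child running
      cleanup that belongs to its parent.

```python
class ExitCallbacks:
    """Toy registry of cleanup callbacks inherited across fork()."""

    def __init__(self):
        self.on_exit = []        # ordinary exit callbacks
        self.dsm_callbacks = []  # callbacks tied to dynamic shared memory

    def on_exit_reset(self):
        # Discard everything inherited from the parent, including the
        # dsm-related callbacks that the commit says were being missed.
        self.on_exit.clear()
        self.dsm_callbacks.clear()

    def run_exit(self):
        # Run whatever is still registered, newest first; report what ran.
        ran = []
        for callback in reversed(self.on_exit + self.dsm_callbacks):
            ran.append(callback())
        return ran


callbacks = ExitCallbacks()
callbacks.on_exit.append(lambda: "release parent lock")
callbacks.dsm_callbacks.append(lambda: "detach parent dsm segment")

callbacks.on_exit_reset()  # what a forked child would call before exit()
ran_in_child = callbacks.run_exit()
print(ran_in_child)
```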
  9. 09 Mar, 2014 1 commit