1. 13 Feb, 2015 3 commits
    • Heikki Linnakangas's avatar
      Fix broken #ifdef for __sparcv8 · 33e879c4
      Heikki Linnakangas authored
      Rob Rowan. Backpatch to all supported versions, like the patch that added
      the broken #ifdef.
      33e879c4
    • Heikki Linnakangas's avatar
      Simplify waiting logic in reading from / writing to client. · 80788a43
      Heikki Linnakangas authored
      The client socket is always in non-blocking mode, and if we actually want
      blocking behaviour, we emulate it by sleeping and retrying. But we have
      retry loops at different layers for reads and writes, which was confusing.
      To simplify, remove all the sleeping and retrying code from the lower
      levels, from be_tls_read and secure_raw_read and secure_raw_write, and put
      all the logic in secure_read() and secure_write().
      80788a43
    • Heikki Linnakangas's avatar
      Simplify the way OpenSSL renegotiation is initiated in server. · 272923a0
      Heikki Linnakangas authored
      At least in all modern versions of OpenSSL, it is enough to call
      SSL_renegotiate() once, and then forget about it. Subsequent SSL_write()
      and SSL_read() calls will finish the handshake.
      
      The SSL_set_session_id_context() call is unnecessary too. We only have
      one SSL context, and the SSL session was created with that to begin with.
      272923a0
  2. 12 Feb, 2015 6 commits
    • Bruce Momjian's avatar
      pg_upgrade: improve checksum mismatch error message · dc01efa5
      Bruce Momjian authored
      Patch by Greg Sabino Mullane, slight adjustments by me
      dc01efa5
    • Bruce Momjian's avatar
      pg_upgrade: quote directory names in delete_old_cluster script · 056764b1
      Bruce Momjian authored
      This allows the delete script to properly function when special
      characters appear in directory paths, e.g. spaces.
      
      Backpatch through 9.0
      056764b1
    • Bruce Momjian's avatar
      pg_upgrade: preserve freeze info for postgres/template1 dbs · 866f3017
      Bruce Momjian authored
      pg_database.datfrozenxid and pg_database.datminmxid were not preserved
      for the 'postgres' and 'template1' databases.  This could cause missing
      clog file errors on access to user tables and indexes after upgrades in
      these databases.
      
      Backpatch through 9.0
      866f3017
    • Andres Freund's avatar
      Fix typo in logicaldecoding.sgml. · 8785e6e3
      Andres Freund authored
      Author: Tatsuo Ishii
      
      Backpatch to 9.4, where logicaldecoding was introduced.
      8785e6e3
    • Tom Lane's avatar
      Fix missing PQclear() in libpqrcv_endstreaming(). · 4f38a281
      Tom Lane authored
      This omission leaked one PGresult per WAL streaming cycle, which possibly
      would never be enough to notice in the real world, but it's still a leak.
      
      Per Coverity.  Back-patch to 9.3 where the error was introduced.
      4f38a281
    • Tom Lane's avatar
      Fix minor memory leak in ident_inet(). · 58146d35
      Tom Lane authored
      We'd leak the ident_serv data structure if the second pg_getaddrinfo_all
      (the one for the local address) failed.  This is not of great consequence
      because a failure return here just leads directly to backend exit(), but
      if this function is going to try to clean up after itself at all, it should
      not have such holes in the logic.  Try to fix it in a future-proof way by
      having all the failure exits go through the same cleanup path, rather than
      "optimizing" some of them.
      
      Per Coverity.  Back-patch to 9.2, which is as far back as this patch
      applies cleanly.
      58146d35
  3. 11 Feb, 2015 3 commits
    • Tom Lane's avatar
      Fix more memory leaks in failure path in buildACLCommands. · 9179444d
      Tom Lane authored
      We already had one go at this issue in commit d73b7f97, but we
      failed to notice that buildACLCommands also leaked several PQExpBuffers
      along with a simply malloc'd string.  This time let's try to make the
      fix a bit more future-proof by eliminating the separate exit path.
      
      It's still not exactly critical because pg_dump will curl up and die on
      failure; but since the amount of the potential leak is now several KB,
      it seems worth back-patching as far as 9.2 where the previous fix landed.
      
      Per Coverity, which evidently is smarter than clang's static analyzer.
      9179444d
    • Tom Lane's avatar
      Fix pg_dump's heuristic for deciding which casts to dump. · 9feefedf
      Tom Lane authored
      Back in 2003 we had a discussion about how to decide which casts to dump.
      At the time pg_dump really only considered an object's containing schema
      to decide what to dump (ie, dump whatever's not in pg_catalog), and so
      we chose a complicated idea involving whether the underlying types were to
      be dumped (cf commit a6790ce8).  But users
      are allowed to create casts between built-in types, and we failed to dump
      such casts.  Let's get rid of that heuristic, which has accreted even more
      ugliness since then, in favor of just looking at the cast's OID to decide
      if it's a built-in cast or not.
      
      In passing, also fix some really ancient code that supposed that it had to
      manufacture a dependency for the cast on its cast function; that's only
      true when dumping from a pre-7.3 server.  This just resulted in some wasted
      cycles and duplicate dependency-list entries with newer servers, but we
      might as well improve it.
      
      Per gripes from a number of people, most recently Greg Sabino Mullane.
      Back-patch to all supported branches.
      9feefedf
    • Tom Lane's avatar
      Fix GEQO to not assume its join order heuristic always works. · 1a179f36
      Tom Lane authored
      Back in commit 400e2c93 I rewrote GEQO's
      gimme_tree function to improve its heuristic for modifying the given tour
      into a legal join order.  In what can only be called a fit of hubris,
      I supposed that this new heuristic would *always* find a legal join order,
      and ripped out the old logic that allowed gimme_tree to sometimes fail.
      
      The folly of this is exposed by bug #12760, in which the "greedy" clumping
      behavior of merge_clump() can lead it into a dead end which could only be
      recovered from by un-clumping.  We have no code for that and wouldn't know
      exactly what to do with it if we did.  Rather than try to improve the
      heuristic rules still further, let's just recognize that it *is* a
      heuristic and probably must always have failure cases.  So, put back the
      code removed in the previous commit to allow for failure (but comment it
      a bit better this time).
      
      It's possible that this code was actually fully correct at the time and
      has only been broken by the introduction of LATERAL.  But having seen this
      example I no longer have much faith in that proposition, so back-patch to
      all supported branches.
      1a179f36
  4. 10 Feb, 2015 2 commits
    • Michael Meskes's avatar
      Fixed array handling in ecpg. · 1f393fc9
      Michael Meskes authored
      When ecpg was rewritten to the new protocol version not all variable types
      were corrected. This patch rewrites the code for these types to fix that. It
      also fixes the documentation to correctly tell the status of array handling.
      1f393fc9
    • Heikki Linnakangas's avatar
      Speed up CRC calculation using slicing-by-8 algorithm. · 025c0242
      Heikki Linnakangas authored
      This speeds up WAL generation and replay. The new algorithm is
      significantly faster with large inputs, like full-page images or when
      inserting wide rows. It is slower with tiny inputs, i.e. less than 10 bytes
      or so, but the speedup with longer inputs more than make up for that. Even
      small WAL records at least have 24 byte header in the front.
      
      The output is identical to the current byte-at-a-time computation, so this
      does not affect compatibility. The new algorithm is only used for the
      CRC-32C variant, not the legacy version used in tsquery or the
      "traditional" CRC-32 used in hstore and ltree. Those are not as performance
      critical, and are usually only applied over small inputs, so it seems
      better to not carry around the extra lookup tables to speed up those rare
      cases.
      
      Abhijit Menon-Sen
      025c0242
  5. 09 Feb, 2015 4 commits
    • Heikki Linnakangas's avatar
      Fix MSVC build. · cc761b17
      Heikki Linnakangas authored
      When I moved pg_crc.c from src/port to src/common, I forgot to modify MSVC
      build script accordingly.
      cc761b17
    • Tom Lane's avatar
      Minor cleanup/code review for "indirect toast" stuff. · bc4de01d
      Tom Lane authored
      Fix some issues I noticed while fooling with an extension to allow an
      additional kind of toast pointer.  Much of this is just comment
      improvement, but there are a couple of actual bugs, which might or might
      not be reachable today depending on what can happen during logical
      decoding.  An example is that toast_flatten_tuple() failed to cover the
      possibility of an indirection pointer in its input.  Back-patch to 9.4
      just in case that is reachable now.
      
      In HEAD, also correct some really minor issues with recent compression
      reorganization, such as dangerously underparenthesized macros.
      bc4de01d
    • Heikki Linnakangas's avatar
      Move pg_crc.c to src/common, and remove pg_crc_tables.h · c619c235
      Heikki Linnakangas authored
      To get CRC functionality in a client program, you now need to link with
      libpgcommon instead of libpgport. The CRC code has nothing to do with
      portability, so libpgcommon is a better home. (libpgcommon didn't exist
      when pg_crc.c was originally moved to src/port.)
      
      Remove the possibility to get CRC functionality by just #including
      pg_crc_tables.h. I'm not aware of any extensions that actually did that and
      couldn't simply link with libpgcommon.
      
      This also moves the pg_crc.h header file from src/include/utils to
      src/include/common, which will require changes to any external programs
      that currently does #include "utils/pg_crc.h". That seems acceptable, as
      include/common is clearly the right home for it now, and the change needed
      to any such programs is trivial.
      c619c235
    • Fujii Masao's avatar
      Move pg_lzcompress.c to src/common. · 40bede54
      Fujii Masao authored
      The meta data of PGLZ symbolized by PGLZ_Header is removed, to make
      the compression and decompression code independent on the backend-only
      varlena facility. PGLZ_Header is being used to store some meta data
      related to the data being compressed like the raw length of the uncompressed
      record or some varlena-related data, making it unpluggable once PGLZ is
      stored in src/common as it contains some backend-only code paths with
      the management of varlena structures. The APIs of PGLZ are reworked
      at the same time to do only compression and decompression of buffers
      without the meta-data layer, simplifying its use for a more general usage.
      
      On-disk format is preserved as well, so there is no incompatibility with
      previous major versions of PostgreSQL for TOAST entries.
      
      Exposing compression and decompression APIs of pglz makes possible its
      use by extensions and contrib modules. Especially this commit is required
      for upcoming WAL compression feature so that the WAL reader facility can
      decompress the WAL data by using pglz_decompress.
      
      Michael Paquier, reviewed by me.
      40bede54
  6. 07 Feb, 2015 2 commits
    • Noah Misch's avatar
      Check DCH_MAX_ITEM_SIZ limits with <=, not <. · 237795a7
      Noah Misch authored
      We reserve space for the full amount, not one less.  The affected checks
      deal with localized month and day names.  Today's DCH_MAX_ITEM_SIZ value
      would suffice for a 60-byte day name, while the longest known is the
      49-byte mn_CN.utf-8 word for "Saturday."  Thus, the upshot of this
      change is merely to avoid misdirecting future readers of the code; users
      are not expected to see errors either way.
      237795a7
    • Noah Misch's avatar
      Assert(PqCommReadingMsg) in pq_peekbyte(). · a7a4adcf
      Noah Misch authored
      Interrupting pq_recvbuf() can break protocol sync, so its callers all
      deserve this assertion.  The one pq_peekbyte() caller suffices already.
      a7a4adcf
  7. 06 Feb, 2015 1 commit
    • Heikki Linnakangas's avatar
      Report WAL flush, not insert, position in replication IDENTIFY_SYSTEM · ff16b40f
      Heikki Linnakangas authored
      When beginning streaming replication, the client usually issues the
      IDENTIFY_SYSTEM command, which used to return the current WAL insert
      position. That's not suitable for the intended purpose of that field,
      however. pg_receivexlog uses it to start replication from the reported
      point, but if it hasn't been flushed to disk yet, it will fail. Change
      IDENTIFY_SYSTEM to report the flush position instead.
      
      Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.
      ff16b40f
  8. 05 Feb, 2015 1 commit
  9. 04 Feb, 2015 7 commits
    • Heikki Linnakangas's avatar
      Use a separate memory context for GIN scan keys. · d88976cf
      Heikki Linnakangas authored
      It was getting tedious to track and release all the different things that
      form a scan key. We were leaking at least the queryCategories array, and
      possibly more, on a rescan. That was visible if a GIN index was used in a
      nested loop join. This also protects from leaks in extractQuery method.
      
      No backpatching, given the lack of complaints from the field. Maybe later,
      after this has received more field testing.
      d88976cf
    • Heikki Linnakangas's avatar
      Fix reference-after-free when waiting for another xact due to constraint. · 57fe2468
      Heikki Linnakangas authored
      If an insertion or update had to wait for another transaction to finish,
      because there was another insertion with conflicting key in progress,
      we would pass a just-free'd item pointer to XactLockTableWait().
      
      All calls to XactLockTableWait() and MultiXactIdWait() had similar issues.
      Some passed a pointer to a buffer in the buffer cache, after already
      releasing the lock. The call in EvalPlanQualFetch had already released the
      pin too. All but the call in execUtils.c would merely lead to reporting a
      bogus ctid, however (or an assertion failure, if enabled).
      
      All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus
      anyway: if the tuple was updated (again) in the same transaction, its ctid
      field would point to the next tuple in the chain, not the tuple itself.
      
      Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added
      (in commit f88d4cfc)
      57fe2468
    • Robert Haas's avatar
      pgcrypto: Code cleanup for decrypt_internal. · 370b3a46
      Robert Haas authored
      Remove some unnecessary null-tests, and replace a goto-label construct
      with an "if" block.
      
      Michael Paquier, reviewed by me.
      370b3a46
    • Heikki Linnakangas's avatar
      Fix memory leaks on OOM in ecpg. · c31b5d9d
      Heikki Linnakangas authored
      These are fairly obscure cases, but let's keep Coverity happy.
      
      Michael Paquier with some further fixes by me.
      c31b5d9d
    • Andres Freund's avatar
      Add missing float.h include to snprintf.c. · ff8ca3b0
      Andres Freund authored
      On windows _isnan() (which isnan() is redirected to in port/win32.h)
      is declared in float.h, not math.h.
      
      Per buildfarm animal currawong.
      
      Backpatch to all supported branches.
      ff8ca3b0
    • Fujii Masao's avatar
      doc: Fix markup · c036edb7
      Fujii Masao authored
      Ian Barwick
      c036edb7
    • Heikki Linnakangas's avatar
      Add dummy PQsslAttributes function for non-SSL builds. · 302262d5
      Heikki Linnakangas authored
      All the other new SSL information functions had dummy versions in
      be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, the linker
      did not complain about the missing function on most platforms represented in
      the buildfarm, even though it is exported, except for a few Windows systems.
      302262d5
  10. 03 Feb, 2015 11 commits
    • Andres Freund's avatar
      Remove ill-conceived Assertion in ProcessClientWriteInterrupt(). · 3a54f4a4
      Andres Freund authored
      It's perfectly fine to have blocked interrupts when
      ProcessClientWriteInterrupt() is called. In fact it's commonly the
      case when emitting error reports. And we deal with that correctly.
      
      Even if that'd not be the case, it'd be a bad location for such a
      assertion. Because ProcessClientWriteInterrupt() is only called when
      the socket is blocked it's hard to hit.
      
      Per Heikki and buildfarm animals nightjar and dunlin.
      3a54f4a4
    • Andres Freund's avatar
      Remove remnants of ImmediateInterruptOK handling. · 2505ce0b
      Andres Freund authored
      Now that nothing sets ImmediateInterruptOK to true anymore, we can
      remove all the supporting code.
      
      Reviewed-By: Heikki Linnakangas
      2505ce0b
    • Andres Freund's avatar
      Remove the option to service interrupts during PGSemaphoreLock(). · d0699571
      Andres Freund authored
      The remaining caller (lwlocks) doesn't need that facility, and we plan
      to remove ImmedidateInterruptOK entirely. That means that interrupts
      can't be serviced race-free and portably anyway, so there's little
      reason for keeping the feature.
      
      Reviewed-By: Heikki Linnakangas
      d0699571
    • Andres Freund's avatar
      Move deadlock and other interrupt handling in proc.c out of signal handlers. · 6753333f
      Andres Freund authored
      Deadlock checking was performed inside signal handlers up to
      now. While it's a remarkable feat to have made this work reliably,
      it's quite complex to understand why that is the case. Partially it
      worked due to the assumption that semaphores are signal safe - which
      is not actually documented to be the case for sysv semaphores.
      
      The reason we had to rely on performing this work inside signal
      handlers is that semaphores aren't guaranteed to be interruptable by
      signals on all platforms. But now that latches provide a somewhat
      similar API, which actually has the guarantee of being interruptible,
      we can avoid doing so.
      
      Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and
      ProcSendSignal is now done using latches. This increases the
      likelihood of spurious wakeups. As spurious wakeup already were
      possible and aren't likely to be frequent enough to be an actual
      problem, this seems acceptable.
      
      This change would allow for further simplification of the deadlock
      checking, now that it doesn't have to run in a signal handler. But
      even if I were motivated to do so right now, it would still be better
      to do that separately. Such a cleanup shouldn't have to be reviewed a
      the same time as the more fundamental changes in this commit.
      
      There is one possible usability regression due to this commit. Namely
      it is more likely than before that log_lock_waits messages are output
      more than once.
      
      Reviewed-By: Heikki Linnakangas
      6753333f
    • Andres Freund's avatar
      Don't allow immediate interrupts during authentication anymore. · 6647248e
      Andres Freund authored
      We used to handle authentication_timeout by setting
      ImmediateInterruptOK to true during large parts of the authentication
      phase of a new connection.  While that happens to work acceptably in
      practice, it's not particularly nice and has ugly corner cases.
      
      Previous commits converted the FE/BE communication to use latches and
      implemented support for interrupt handling during both
      send/recv. Building on top of that work we can get rid of
      ImmediateInterruptOK during authentication, by immediately treating
      timeouts during authentication as a reason to die. As die interrupts
      are handled immediately during client communication that provides a
      sensibly quick reaction time to authentication timeout.
      
      Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex
      authentication methods. More could be added, but this already should
      provides a reasonable coverage.
      
      While it this overall increases the maximum time till a timeout is
      reacted to, it greatly reduces complexity and increases
      reliability. That seems like a overall win. If the increase proves to
      be noticeable we can deal with those cases by moving to nonblocking
      network code and add interrupt checking there.
      
      Reviewed-By: Heikki Linnakangas
      6647248e
    • Tom Lane's avatar
      Remove unused "m" field in LSEG. · cec916f3
      Tom Lane authored
      This field has been unreferenced since 1998, and does not appear in lseg
      values stored on disk (since sizeof(lseg) is only 32 bytes according to
      pg_type).  There was apparently some idea of maintaining it just in values
      appearing in memory, but the bookkeeping required to make that work would
      surely far outweigh the cost of recalculating the line's slope when needed.
      Remove it to (a) simplify matters and (b) suppress some uninitialized-field
      whining from Coverity.
      cec916f3
    • Andres Freund's avatar
      Process 'die' interrupts while reading/writing from the client socket. · 4fe384bd
      Andres Freund authored
      Up to now it was impossible to terminate a backend that was trying to
      send/recv data to/from the client when the socket's buffer was already
      full/empty. While the send/recv calls itself might have gotten
      interrupted by signals on some platforms, we just immediately retried.
      
      That could lead to situations where a backend couldn't be terminated ,
      after a client died without the connection being closed, because it
      was blocked in send/recv.
      
      The problem was far more likely to be hit when sending data than when
      reading. That's because while reading a command from the client, and
      during authentication, we processed interrupts immediately . That
      primarily left COPY FROM STDIN as being problematic for recv.
      
      Change things so that that we process 'die' events immediately when
      the appropriate signal arrives. We can't sensibly react to query
      cancels at that point, because we might loose sync with the client as
      we could be in the middle of writing a message.
      
      We don't interrupt writes if the write buffer isn't full, as indicated
      by write() returning EWOULDBLOCK, as that would lead to fewer error
      messages reaching clients.
      
      Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas
      
      Discussion: 20140927191243.GD5423@alap3.anarazel.de
      4fe384bd
    • Andres Freund's avatar
      Introduce and use infrastructure for interrupt processing during client reads. · 4f85fde8
      Andres Freund authored
      Up to now large swathes of backend code ran inside signal handlers
      while reading commands from the client, to allow for speedy reaction to
      asynchronous events. Most prominently shared invalidation and NOTIFY
      handling. That means that complex code like the starting/stopping of
      transactions is run in signal handlers...  The required code was
      fragile and verbose, and is likely to contain bugs.
      
      That approach also severely limited what could be done while
      communicating with the client. As the read might be from within
      openssl it wasn't safely possible to trigger an error, e.g. to cancel
      a backend in idle-in-transaction state. We did that in some cases,
      namely fatal errors, nonetheless.
      
      Now that FE/BE communication in the backend employs non-blocking
      sockets and latches to block, we can quite simply interrupt reads from
      signal handlers by setting the latch. That allows us to signal an
      interrupted read, which is supposed to be retried after returning from
      within the ssl library.
      
      As signal handlers now only need to set the latch to guarantee timely
      interrupt processing, remove a fair amount of complicated & fragile
      code from async.c and sinval.c.
      
      We could now actually start to process some kinds of interrupts, like
      sinval ones, more often that before, but that seems better done
      separately.
      
      This work will hopefully allow to handle cases like being blocked by
      sending data, interrupting idle transactions and similar to be
      implemented without too much effort.  In addition to allowing getting
      rid of ImmediateInterruptOK, that is.
      
      Author: Andres Freund
      Reviewed-By: Heikki Linnakangas
      4f85fde8
    • Andres Freund's avatar
      Use a nonblocking socket for FE/BE communication and block using latches. · 387da188
      Andres Freund authored
      This allows to introduce more elaborate handling of interrupts while
      reading from a socket.  Currently some interrupt handlers have to do
      significant work from inside signal handlers, and it's very hard to
      correctly write code to do so.  Generic signal handler limitations,
      combined with the fact that we can't safely jump out of a signal
      handler while reading from the client have prohibited implementation
      of features like timeouts for idle-in-transaction.
      
      Additionally we use the latch code to wait in a couple places where we
      previously only had waiting code on windows as other platforms just
      busy looped.
      
      This can increase the number of systemcalls happening during FE/BE
      communication. Benchmarks so far indicate that the impact isn't very
      high, and there's room for optimization in the latch code. The chance
      of cleaning up the usage of latches gives us, seem to outweigh the
      risk of small performance regressions.
      
      This commit theoretically can't used without the next patch in the
      series, as WaitLatchOrSocket is not defined to be fully signal
      safe. As we already do that in some cases though, it seems better to
      keep the commits separate, so they're easier to understand.
      
      Author: Andres Freund
      Reviewed-By: Heikki Linnakangas
      387da188
    • Tom Lane's avatar
      Fix breakage in GEODEBUG debug code. · 778d498c
      Tom Lane authored
      LINE doesn't have an "m" field (anymore anyway).  Also fix unportable
      assumption that %x can print the result of pointer subtraction.
      
      In passing, improve single_decode() in minor ways:
      * Remove unnecessary leading-whitespace skip (strtod does that already).
      * Make GEODEBUG message more intelligible.
      * Remove entirely-useless test to see if strtod returned a silly pointer.
      * Don't bother computing trailing-whitespace skip unless caller wants
        an ending pointer.
      
      This has been broken since 261c7d4b.
      Although it's only debug code, might as well fix the 9.4 branch too.
      778d498c
    • Heikki Linnakangas's avatar
      Add API functions to libpq to interrogate SSL related stuff. · 91fa7b47
      Heikki Linnakangas authored
      This makes it possible to query for things like the SSL version and cipher
      used, without depending on OpenSSL functions or macros. That is a good
      thing if we ever get another SSL implementation.
      
      PQgetssl() still works, but it should be considered as deprecated as it
      only works with OpenSSL. In particular, PQgetSslInUse() should be used to
      check if a connection uses SSL, because as soon as we have another
      implementation, PQgetssl() will return NULL even if SSL is in use.
      91fa7b47