1. 04 Jun, 2016 1 commit
  2. 03 Jun, 2016 12 commits
    • Fix grammar's AND/OR flattening to work with operator_precedence_warning. · 05104f69
      Tom Lane authored
      It'd be good for "(x AND y) AND z" to produce a three-child AND node
      whether or not operator_precedence_warning is on, but that failed to
      happen when it's on because makeAndExpr() didn't look through the added
      AEXPR_PAREN node.  This has no effect on generated plans because prepqual.c
      would flatten the AND nest anyway; but it does affect the number of parens
      printed in ruleutils.c, for example.  I'd already fixed some similar
      hazards in parse_expr.c in commit abb16465, but didn't think to search
      gram.y for problems of this ilk.  Per gripe from Jean-Pierre Pelletier.
      
      Report: <fa0535ec6d6428cfec40c7e8a6d11156@mail.gmail.com>
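      The flattening rule described above can be sketched with toy expression nodes; EXPR_PAREN stands in for the parser's AEXPR_PAREN wrapper, and none of these types are PostgreSQL's actual List/BoolExpr structures:

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { EXPR_VAR, EXPR_AND, EXPR_PAREN } ExprKind;

typedef struct Expr {
    ExprKind kind;
    struct Expr *child;     /* EXPR_PAREN: the wrapped expression */
    struct Expr *args[8];   /* EXPR_AND: children (8 is enough for a sketch) */
    int nargs;
} Expr;

static Expr *new_expr(ExprKind kind)
{
    Expr *e = calloc(1, sizeof(Expr));
    e->kind = kind;
    return e;
}

static Expr *make_paren(Expr *child)
{
    Expr *e = new_expr(EXPR_PAREN);
    e->child = child;
    return e;
}

/* Look through parenthesis markers, as makeAndExpr must look through the
 * AEXPR_PAREN nodes added when operator_precedence_warning is on. */
static Expr *strip_paren(Expr *e)
{
    while (e->kind == EXPR_PAREN)
        e = e->child;
    return e;
}

/* Build "lexpr AND rexpr", splicing an existing AND on the left into one
 * N-child node instead of nesting two-child ANDs. */
static Expr *make_and(Expr *lexpr, Expr *rexpr)
{
    Expr *l = strip_paren(lexpr);
    Expr *result = new_expr(EXPR_AND);
    if (l->kind == EXPR_AND)
        for (int i = 0; i < l->nargs; i++)
            result->args[result->nargs++] = l->args[i];
    else
        result->args[result->nargs++] = l;
    result->args[result->nargs++] = strip_paren(rexpr);
    return result;
}
```

      Without the strip_paren() call, the same input would yield a two-child AND whose left child is another AND -- exactly the nesting the commit avoids.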
    • Inline the easy cases in MakeExpandedObjectReadOnly(). · d50183c5
      Tom Lane authored
      This attempts to buy back some of whatever performance we lost from fixing
      bug #14174 by inlining the initial checks in MakeExpandedObjectReadOnly()
      into the callers.  We can do that in a macro without creating multiple-
      evaluation hazards, so it's pretty much free notationally; and the amount
      of code added to callers should be minimal as well.  (Testing a value can't
      take many more instructions than passing it to a subroutine.)
      
      Might as well inline DatumIsReadWriteExpandedObject() while we're at it.
      
      This is an ABI break for callers, so it doesn't seem safe to put into 9.5,
      but I see no reason not to do it in HEAD.
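      The macro shape being described can be sketched as follows; the names are stand-ins for MakeExpandedObjectReadOnly and its out-of-line helper, and the checks mimic the "is it null or not a varlena?" fast path:

```c
#include <assert.h>

static int slow_path_calls = 0;

/* Out-of-line slow path, standing in for MakeExpandedObjectReadOnlyInternal. */
static long make_read_only_internal(long d)
{
    slow_path_calls++;
    return d;
}

/* The cheap tests run inline at the call site.  Although d appears in both
 * arms of ?:, exactly one arm is evaluated, so every argument is evaluated
 * once -- no multiple-evaluation hazard -- and the common case costs only
 * a comparison or two instead of a function call. */
#define MAKE_READ_ONLY(d, isnull, typlen) \
    (((isnull) || (typlen) != -1) ? (d) : make_read_only_internal(d))
```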
    • Mark read/write expanded values as read-only in ValuesNext(), too. · 9eaf5be5
      Tom Lane authored
      Further thought about bug #14174 motivated me to try the case of a
      R/W datum being returned from a VALUES list, and sure enough it was
      broken.  Fix that.
      
      Also add a regression test case exercising the same scenario for
      FunctionScan.  That's not broken right now, because the function's
      result will get shoved into a tuplestore between generation and use;
      but it could easily become broken whenever we get around to optimizing
      FunctionScan better.
      
      There don't seem to be any other places where we put the result of
      expression evaluation into a virtual tuple slot that could then be
      the source for Vars of further expression evaluation, so I think
      this is the end of this bug.
    • Mark read/write expanded values as read-only in ExecProject(). · 69f526aa
      Tom Lane authored
      If a plan node output expression returns an "expanded" datum, and that
      output column is referenced in more than one place in upper-level plan
      nodes, we need to ensure that what is returned is a read-only reference
      not a read/write reference.  Otherwise one of the referencing sites could
      scribble on or even delete the expanded datum before we have evaluated the
      others.  Commit 1dc5ebc9, which introduced this feature, supposed
      that it'd be sufficient to make SubqueryScan nodes force their output
      columns to read-only state.  The folly of that was revealed by bug #14174
      from Andrew Gierth, and really should have been immediately obvious
      considering that the planner will happily optimize SubqueryScan nodes
      out of the plan without any regard for this issue.
      
      The safest fix seems to be to make ExecProject() force its results into
      read-only state; that will cover every case where a plan node returns
      expression results.  Actually we can delegate this to ExecTargetList()
      since we can recursively assume that plain Vars will not reference
      read-write datums.  That should keep the extra overhead down to something
      minimal.  We no longer need ExecMakeSlotContentsReadOnly(), which was
      introduced only in support of the idea that just a few plan node types
      would need to do this.
      
      In the future it would be nice to have the planner account for this problem
      and inject force-to-read-only expression evaluation nodes into only the
      places where there's a risk of multiple evaluation.  That's not a suitable
      solution for 9.5 or even 9.6 at this point, though.
      
      Report: <20160603124628.9932.41279@wrigleys.postgresql.org>
    • Remove bogus code to apply PathTargets to partial paths. · 04ae11f6
      Robert Haas authored
      The partial paths that get modified may already have been used as
      part of a GatherPath which appears in the path list, so modifying
      them is not a good idea at this stage - especially because this
      code has no check that the PathTarget is in fact parallel-safe.
      
      When partial aggregation is being performed, this is actually
      harmless because we'll end up replacing the pathtargets here with
      the correct ones within create_grouping_paths().  But if we've got
      a query tree containing only scan/join operations then this can
      result in incorrectly pushing down parallel-restricted target
      list entries.  If those are, for example, references to subqueries,
      that can crash the server; but it's wrong in any event.
      
      Amit Kapila
    • Mark PostmasterPid as PGDLLIMPORT. · cac83219
      Robert Haas authored
      This is so that extensions can use it.
      
      Michael Paquier
    • Add new snapshot fields to serialize/deserialize functions. · 370a46fc
      Kevin Grittner authored
      The "snapshot too old" condition was not being recognized when
      using a copied snapshot, since the original timestamp and lsn were
      not being passed along.  Noticed when testing the combination of
      "snapshot too old" with parallel query execution.
    • Fix comment to be more accurate. · 6436a853
      Robert Haas authored
      Now that we skip vacuuming all-frozen pages, this comment needs
      updating.
      
      Masahiko Sawada
    • Suppress -Wunused-result warnings about write(), again. · 6c72a28e
      Tom Lane authored
      Adopt the same solution as in commit aa90e148, but this time
      let's put the ugliness inside the write_stderr() macro, instead of
      expecting each call site to deal with it.  Back-port that decision
      into psql/common.c where I got the macro from in the first place.
      
      Per gripe from Peter Eisentraut.
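      The general shape of such a macro (a sketch of the approach, not the exact PostgreSQL code) wraps the write() call and explicitly discards its result in one place, so no call site triggers -Wunused-result:

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Swallow write()'s result inside the macro so the ugliness lives in one
 * place.  There is nothing useful to do if writing to stderr fails. */
#define write_stderr(str) \
    do { \
        const char *str_ = (str); \
        long rc_ = (long) write(STDERR_FILENO, str_, strlen(str_)); \
        (void) rc_; \
    } while (0)
```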
    • Fix various common mispellings. · e1623c39
      Greg Stark authored
      Mostly these are just comments but there are a few in documentation
      and a handful in code and tests. Hopefully this doesn't cause too much
      unnecessary pain for backpatching. I relented from some of the most
      common like "thru" for that reason. The rest don't seem numerous
      enough to cause problems.
      
      Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
    • Measure Bloom index signature-length reloption in bits, not words. · ee4af347
      Tom Lane authored
      Per discussion, this is a more understandable and future-proof way of
      exposing the setting to users.  On-disk, we can still store it in words,
      so as to not break on-disk compatibility with beta1.
      
      Along the way, clean up the code associated with Bloom reloptions.
      Provide explicit macros for default and maximum lengths rather than
      having magic numbers buried in multiple places in the code.  Drop
      the adjustBloomOptions() code altogether: it was useless in view of
      the fact that reloptions.c already performed default-substitution and
      range checking for the options.  Rename a couple of macros and types
      for more clarity.
      
      Discussion: <23767.1464926580@sss.pgh.pa.us>
    • Cosmetic improvements to freeze map code. · fdfaccfa
      Robert Haas authored
      Per post-commit review comments from Andres Freund, improve variable
      names, comments, and in one place, slightly improve the code structure.
      
      Masahiko Sawada
  3. 02 Jun, 2016 4 commits
    • Be conservative about alignment requirements of struct epoll_event. · a3b30763
      Greg Stark authored
      Use MAXALIGN size/alignment to guarantee that later uses of memory are
      aligned correctly. E.g. epoll_event might need 8 byte alignment on some
      platforms, but earlier allocations like WaitEventSet and WaitEvent might
      not be sized to guarantee that when purely using sizeof().
      
      Found by myself while testing on a Sun Ultra 5 (Sparc IIi) with some
      editorializing by Andres Freund.
      
      In passing, fix a couple of typos in the area.
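      The TYPEALIGN/MAXALIGN idiom referred to here rounds a size up to the strictest alignment boundary; hard-coding 8 below is an assumption, whereas PostgreSQL derives MAXIMUM_ALIGNOF at configure time, and the sizing function is an illustrative stand-in for the WaitEventSet allocation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXIMUM_ALIGNOF 8    /* assumption; configure-derived in PostgreSQL */
#define TYPEALIGN(ALIGNVAL, LEN) \
    (((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))
#define MAXALIGN(LEN) TYPEALIGN(MAXIMUM_ALIGNOF, LEN)

/* Size a header followed by an array of events so that every element
 * starts on a MAXALIGN boundary, instead of trusting sizeof() alone. */
static size_t waitset_alloc_size(size_t header_size, size_t event_size,
                                 int nevents)
{
    return MAXALIGN(header_size) + (size_t) nevents * MAXALIGN(event_size);
}
```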
    • C comment improvement & typo fix. · 4edb7bd2
      Kevin Grittner authored
      Thomas Munro
    • Redesign handling of SIGTERM/control-C in parallel pg_dump/pg_restore. · e652273e
      Tom Lane authored
      Formerly, Unix builds of pg_dump/pg_restore would trap SIGINT and similar
      signals and set a flag that was tested in various data-transfer loops.
      This was prone to errors of omission (cf commit 3c8aa665); and even if
      the client-side response was prompt, we did nothing that would cause
      long-running SQL commands (e.g. CREATE INDEX) to terminate early.
      Also, the master process would effectively do nothing at all upon receipt
      of SIGINT; the only reason it seemed to work was that in typical scenarios
      the signal would also be delivered to the child processes.  We should
      support termination when a signal is delivered only to the master process,
      though.
      
      Windows builds had no console interrupt handler, so they would just fall
      over immediately at control-C, again leaving long-running SQL commands to
      finish unmolested.
      
      To fix, remove the flag-checking approach altogether.  Instead, allow the
      Unix signal handler to send a cancel request directly and then exit(1).
      In the master process, also have it forward the signal to the children.
      On Windows, add a console interrupt handler that behaves approximately
      the same.  The main difference is that a single execution of the Windows
      handler can send all the cancel requests since all the info is available
      in one process, whereas on Unix each process sends a cancel only for its
      own database connection.
      
      In passing, fix an old problem that DisconnectDatabase tends to send a
      cancel request before exiting a parallel worker, even if nothing went
      wrong.  This is at least a waste of cycles, and could lead to unexpected
      log messages, or maybe even data loss if it happened in pg_restore (though
      in the current code the problem seems to affect only pg_dump).  The cause
      was that after a COPY step, pg_dump was leaving libpq in PGASYNC_BUSY
      state, causing PQtransactionStatus() to report PQTRANS_ACTIVE.  That's
      normally harmless because the next PQexec() will silently clear the
      PGASYNC_BUSY state; but in a parallel worker we might exit without any
      additional SQL commands after a COPY step.  So add an extra PQgetResult()
      call after a COPY to allow libpq to return to PGASYNC_IDLE state.
      
      This is a bug fix, IMO, so back-patch to 9.3 where parallel dump/restore
      were introduced.
      
      Thanks to Kyotaro Horiguchi for Windows testing and code suggestions.
      
      Original-Patch: <7005.1464657274@sss.pgh.pa.us>
      Discussion: <20160602.174941.256342236.horiguchi.kyotaro@lab.ntt.co.jp>
    • Fix btree mark/restore bug. · 7392eed7
      Kevin Grittner authored
      Commit 2ed5b87f introduced a bug in
      mark/restore, in an attempt to optimize repeated restores to the
      same page.  This caused an assertion failure during a merge join
      which fed directly from an index scan, although the impact would
      not be limited to that case.  Revert the bad chunk of code from
      that commit.
      
      While investigating this bug it was discovered that a particular
      "paranoia" set of the mark position field would not prevent bad
      behavior; it would just make it harder to diagnose.  Change that
      into an assertion, which will draw attention to any future problem
      in that area more directly.
      
      Backpatch to 9.5, where the bug was introduced.
      
      Bug #14169 reported by Shinta Koyanagi.
      Preliminary analysis by Tom Lane identified which commit caused
      the bug.
  4. 01 Jun, 2016 1 commit
    • Clean up some minor inefficiencies in parallel dump/restore. · 763eec6b
      Tom Lane authored
      Parallel dump did a totally pointless query to find out the name of each
      table to be dumped, which it already knows.  Parallel restore runs issued
      lots of redundant SET commands because _doSetFixedOutputState() was invoked
      once per TOC item rather than just once at connection start.  While the
      extra queries are insignificant if you're dumping or restoring large
      tables, it still seems worth getting rid of them.
      
      Also, give the responsibility for selecting the right client_encoding for
      a parallel dump worker to setup_connection() where it naturally belongs,
      instead of having ad-hoc code for that in CloneArchive().  And fix some
      minor bugs like use of strdup() where pg_strdup() would be safer.
      
      Back-patch to 9.3, mostly to keep the branches in sync in an area that
      we're still finding bugs in.
      
      Discussion: <5086.1464793073@sss.pgh.pa.us>
  5. 31 May, 2016 5 commits
    • doc: Update version() and current_date output in tutorial · 9ee56dfe
      Peter Eisentraut authored
      While the version number is automatically updated in the example output,
      the other details looked a bit dated.
      
      suggested by mike2.schneider@gmail.com
    • Avoid useless closely-spaced writes of statistics files. · 22b27b4c
      Tom Lane authored
      The original intent in the stats collector was that we should not write out
      stats data oftener than every PGSTAT_STAT_INTERVAL msec.  Backends will not
      make requests at all if they see the existing data is newer than that, and
      the stats collector is supposed to disregard requests having a cutoff_time
      older than its most recently written data, so that close-together requests
      don't result in multiple writes.  But the latter part of that got broken
      in commit 187492b6, so that if two backends concurrently decide
      the existing stats are too old, the collector would write the data twice.
      (In principle the collector's logic would still merge requests as long as
      the second one arrives before we've actually written data ... but since
      the message collection loop would write data immediately after processing
      a single inquiry message, that never happened in practice, and in any case
      the window in which it might work would be much shorter than
      PGSTAT_STAT_INTERVAL.)
      
      To fix, improve pgstat_recv_inquiry so that it checks whether the cutoff
      time is too old, and doesn't add a request to the queue if so.  This means
      that we do not need DBWriteRequest.request_time, because the decision is
      taken before making a queue entry.  And that means that we don't really
      need the DBWriteRequest data structure at all; an OID list of database
      OIDs will serve and allow removal of some rather verbose and crufty code.
      
      In passing, improve the comments in this area, which have been rather
      neglected.  Also change backend_read_statsfile so that it's not silently
      relying on MyDatabaseId to have some particular value in the autovacuum
      launcher process.  It accidentally worked as desired because MyDatabaseId
      is zero in that process; but that does not seem like a dependency we want,
      especially with no documentation about it.
      
      Although this patch is mine, it turns out I'd rediscovered a known bug,
      for which Tomas Vondra had already submitted a patch that's functionally
      equivalent to the non-cosmetic aspects of this patch.  Thanks to Tomas
      for reviewing this version.
      
      Back-patch to 9.3 where the bug was introduced.
      
      Prior-Discussion: <1718942738eb65c8407fcd864883f4c8@fuzzy.cz>
      Patch: <4625.1464202586@sss.pgh.pa.us>
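      The improved pgstat_recv_inquiry check can be sketched like this (a simplified single-database model, not the collector's actual code, which tracks write times per database):

```c
#include <assert.h>

typedef long TimestampTz;          /* microseconds, as in PostgreSQL */

static int writes_performed = 0;
static TimestampTz last_statwrite = 0;

/* Act on an inquiry only if the requester's cutoff time is newer than the
 * last write; otherwise the existing stats file already satisfies the
 * request, so close-together inquiries collapse into a single write. */
static void recv_inquiry(TimestampTz cutoff_time)
{
    if (cutoff_time <= last_statwrite)
        return;                    /* existing stats file is new enough */
    writes_performed++;            /* model the loop writing immediately */
    last_statwrite = cutoff_time;
}
```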
    • Fix whitespace · aa14bc41
      Peter Eisentraut authored
    • Fix typo in CREATE DATABASE syntax synopsis. · 6d69ea33
      Tom Lane authored
      Misplaced "]", evidently a thinko in commit 213335c1.
    • Mirror struct Aggref field order in _copyAggref(). · 2195c5af
      Noah Misch authored
      This is cosmetic, and no supported release has the affected fields.
  6. 30 May, 2016 2 commits
    • Move memory barrier in UnlockBufHdr to before releasing the lock. · 87a3023c
      Andres Freund authored
      This bug appears to have been introduced late in the development of
      48354581 ("Allow Pin/UnpinBuffer to operate in a lockfree
      manner.").
      
      Found while debugging a bug which turned out to be independent of the
      commit mentioned above.
      
      Backpatch: -
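      The ordering constraint can be sketched with C11 atomics; the flag value and struct are illustrative, not PostgreSQL's actual buffer header (which uses its own barrier primitives):

```c
#include <assert.h>
#include <stdatomic.h>

#define BM_LOCKED (1u << 31)       /* illustrative lock bit */

typedef struct {
    atomic_uint state;             /* lock bit lives in the state word */
    unsigned    refcount;          /* modified only while holding the lock */
} BufHdr;

/* Publish changes made under the lock, then clear the lock bit.  Release
 * ordering on the clearing store keeps the refcount write from being
 * reordered past the unlock -- i.e. the barrier takes effect before the
 * lock is released, which is the point of the fix above. */
static void unlock_buf_hdr(BufHdr *buf, unsigned new_refcount)
{
    buf->refcount = new_refcount;
    atomic_fetch_and_explicit(&buf->state, ~BM_LOCKED, memory_order_release);
}
```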
    • Fix PageAddItem BRIN bug · 975ad4e6
      Alvaro Herrera authored
      BRIN was relying on the ability to remove a tuple from an index page,
      then putting another tuple in the same line pointer.  But PageAddItem
      refuses to add a tuple beyond the first free item past the last used
      item, and in particular, it rejects an attempt to add an item to an
      empty page anywhere other than the first line pointer.  PageAddItem
      issues a WARNING and indicates to the caller that it failed, which in
      turn causes the BRIN calling code to issue a PANIC, so the whole
      sequence looks like this:
      	WARNING:  specified item offset is too large
      	PANIC:  failed to add BRIN tuple
      
      To fix, create a new function PageAddItemExtended which is like
      PageAddItem except that the two boolean arguments become a flags bitmap;
      the "overwrite" and "is_heap" boolean flags in PageAddItem become
      PAI_OVERWRITE and PAI_IS_HEAP flags in the new function, and a new flag
      PAI_ALLOW_FAR_OFFSET enables the behavior required by BRIN.
      PageAddItem() retains its original signature, for compatibility with
      third-party modules (other callers in core code are not modified,
      either).
      
      Also, in the belt-and-suspenders spirit, I added a new sanity check in
      brinGetTupleForHeapBlock to raise an error if a TID found in the revmap
      is not marked as live by the page header.  This causes it to react with
      "ERROR: corrupted BRIN index" to the bug at hand, rather than a hard
      crash.
      
      Backpatch to 9.5.
      
      Bug reported by Andreas Seltenreich as detected by his handy sqlsmith
      fuzzer.
      Discussion: https://www.postgresql.org/message-id/87mvni77jh.fsf@elite.ansel.ydns.eu
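      The booleans-to-flags conversion with a compatibility wrapper can be sketched like so; the stand-in page_add_item_extended() merely echoes its flags so the wrapper can be exercised, and the bit values are assumptions:

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits in the style of PageAddItemExtended (values are assumptions). */
#define PAI_OVERWRITE        (1 << 0)
#define PAI_IS_HEAP          (1 << 1)
#define PAI_ALLOW_FAR_OFFSET (1 << 2)

/* Stand-in for the extended function; here it just returns the flags so
 * the wrapper below has observable behavior. */
static int page_add_item_extended(int flags)
{
    return flags;
}

/* The original two-boolean signature survives as a thin wrapper, keeping
 * third-party callers source-compatible. */
static int page_add_item(bool overwrite, bool is_heap)
{
    return page_add_item_extended((overwrite ? PAI_OVERWRITE : 0) |
                                  (is_heap ? PAI_IS_HEAP : 0));
}
```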
  7. 29 May, 2016 2 commits
    • Fix missing abort checks in pg_backup_directory.c. · 3c8aa665
      Tom Lane authored
      Parallel restore from directory format failed to respond to control-C
      in a timely manner, because there were no checkAborting() calls in the
      code path that reads data from a file and sends it to the backend.
      If any worker was in the midst of restoring data for a large table,
      you'd just have to wait.
      
      This fix doesn't do anything for the problem of aborting a long-running
      server-side command, but at least it fixes things for data transfers.
      
      Back-patch to 9.3 where parallel restore was introduced.
    • Remove pg_dump/parallel.c's useless "aborting" flag. · 210981a4
      Tom Lane authored
      This was effectively dead code, since the places that tested it could not
      be reached after we entered the on-exit-cleanup routine that would set it.
      It seems to have been a leftover from a design in which error abort would
      try to send fresh commands to the workers --- a design which could never
      have worked reliably, of course.  Since the flag is not cross-platform, it
      complicates reasoning about the code's behavior, which we could do without.
      
      Although this is effectively just cosmetic, back-patch anyway, because
      there are some actual bugs in the vicinity of this behavior.
      
      Discussion: <15583.1464462418@sss.pgh.pa.us>
  8. 28 May, 2016 1 commit
    • Lots of comment-fixing, and minor cosmetic cleanup, in pg_dump/parallel.c. · 6b3094c2
      Tom Lane authored
      The commentary in this file was in extremely sad shape.  The author(s)
      had clearly never heard of the project convention that a function header
      comment should provide an API spec of some sort for that function.  Much
      of it was flat out wrong, too --- maybe it was accurate when written, but
      if so it had not been updated to track subsequent code revisions.  Rewrite
      and rearrange to try to bring it up to speed, and annotate some of the
      places where more work is needed.  (I've refrained from actually fixing
      anything of substance ... yet.)
      
      Also, rename a couple of functions for more clarity as to what they do,
      do some very minor code rearrangement, remove some pointless Asserts,
      fix an incorrect Assert in readMessageFromPipe, and add a missing socket
      close in one error exit from pgpipe().  The last would be a bug if we
      tried to continue after pgpipe() failure, but since we don't, it's just
      cosmetic at present.
      
      Although this is only cosmetic, back-patch to 9.3 where parallel.c was
      added.  It's sufficiently invasive that it'll pose a hazard for future
      back-patching if we don't.
      
      Discussion: <25239.1464386067@sss.pgh.pa.us>
  9. 27 May, 2016 4 commits
    • Clean up thread management in parallel pg_dump for Windows. · 807b4537
      Tom Lane authored
      Since we start the worker threads with _beginthreadex(), we should use
      _endthreadex() to terminate them.  We got this right in the normal-exit
      code path, but not so much during an error exit from a worker.
      In addition, be sure to apply CloseHandle to the thread handle after
      each thread exits.
      
      It's not clear that these oversights cause any user-visible problems,
      since the pg_dump run is about to terminate anyway.  Still, it's clearly
      better to follow Microsoft's API specifications than ignore them.
      
      Also a few cosmetic cleanups in WaitForTerminatingWorkers(), including
      being a bit less random about where to cast between uintptr_t and HANDLE,
      and being sure to clear the worker identity field for each dead worker
      (not that false matches should be possible later, but let's be careful).
      
      Original observation and patch by Armin Schöffmann, cosmetic improvements
      by Michael Paquier and me.  (Armin's patch also included closing sockets
      in ShutdownWorkersHard(), but that's been dealt with already in commit
      df8d2d8c.)  Back-patch to 9.3 where parallel pg_dump was introduced.
      
      Discussion: <zarafa.570306bd.3418.074bf1420d8f2ba2@root.aegaeon.de>
    • Fix release-note typo. · d81ecb9b
      Tom Lane authored
      Léonard Benedetti
    • Fix DROP ACCESS METHOD IF EXISTS. · 83dbde94
      Tom Lane authored
      The IF EXISTS option was documented, and implemented in the grammar, but
      it didn't actually work for lack of support in does_not_exist_skipping().
      Per bug #14160.
      
      Report and patch by Kouhei Sutou
      
      Report: <20160527070433.19424.81712@wrigleys.postgresql.org>
    • Be more predictable about reporting "lock timeout" vs "statement timeout". · 9dd4178c
      Tom Lane authored
      If both timeout indicators are set when we arrive at ProcessInterrupts,
      we've historically just reported "lock timeout".  However, some buildfarm
      members have been observed to fail isolationtester's timeouts test by
      reporting "lock timeout" when the statement timeout was expected to fire
      first.  The cause seems to be that the process is allowed to sleep longer
      than expected (probably due to heavy machine load) so that the lock
      timeout happens before we reach the point of reporting the error, and
      then this arbitrary tiebreak rule does the wrong thing.  We can improve
      matters by comparing the scheduled timeout times to decide which error
      to report.
      
      I had originally proposed greatly reducing the 1-second window between
      the two timeouts in the test cases.  On reflection that is a bad idea,
      at least for the case where the lock timeout is expected to fire first,
      because that would assume that it takes negligible time to get from
      statement start to the beginning of the lock wait.  Thus, this patch
      doesn't completely remove the risk of test failures on slow machines.
      Empirically, however, the case this handles is the one we are seeing
      in the buildfarm.  The explanation may be that the other case requires
      the scheduler to take the CPU away from a busy process, whereas the
      case fixed here only requires the scheduler to not give the CPU back
      right away to a process that has been woken from a multi-second sleep
      (and, perhaps, has been swapped out meanwhile).
      
      Back-patch to 9.3 where the isolationtester timeouts test was added.
      
      Discussion: <8693.1464314819@sss.pgh.pa.us>
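      The tiebreak being described reduces to comparing the scheduled deadlines (a sketch; the real code consults PostgreSQL's timeout-manager state):

```c
#include <assert.h>

typedef long TimestampTz;   /* microseconds, as in PostgreSQL */

enum { REPORT_LOCK_TIMEOUT, REPORT_STATEMENT_TIMEOUT };

/* When both indicators are set by the time interrupts are processed,
 * blame whichever timeout was scheduled to fire first, rather than
 * applying a fixed "lock timeout wins" rule. */
static int pick_timeout_to_report(TimestampTz lock_deadline,
                                  TimestampTz stmt_deadline)
{
    return (lock_deadline < stmt_deadline) ? REPORT_LOCK_TIMEOUT
                                           : REPORT_STATEMENT_TIMEOUT;
}
```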
  10. 26 May, 2016 5 commits
    • Make pg_dump error cleanly with -j against hot standby · d74048de
      Magnus Hagander authored
      Getting a synchronized snapshot is not supported on a hot standby node,
      and is by default taken when using -j with multiple sessions. Trying to
      do so still failed, but with a server error that would also go in the
      log. Instead, properly detect this case and give a better error message.
    • Disable physical tlist if any Var would need multiple sortgroupref labels. · aeb9ae64
      Tom Lane authored
      As part of upper planner pathification (commit 3fc6e2d7) I redid
      createplan.c's approach to the physical-tlist optimization, in which scan
      nodes are allowed to return exactly the underlying table's columns so as
      to save doing a projection step at runtime.  The logic was intentionally
      more aggressive than before about applying the optimization, which is
      generally a good thing, but Andres Freund found a case in which it got
      too aggressive.  Namely, if any column is referenced more than once in
      the parent plan node's sorting or grouping column list, we can't optimize
      because then that column would need to have more than one ressortgroupref
      label, and we only have space for one.
      
      Add logic to detect this situation in use_physical_tlist(), and also add
      some error checking in apply_pathtarget_labeling_to_tlist(), which this
      example proves was being overly cavalier about whether what it was doing
      made any sense.
      
      The added test case exposes the problem only because we do not eliminate
      duplicate grouping keys.  That might be something to fix someday, but it
      doesn't seem like appropriate post-beta work.
      
      Report: <20160526021235.w4nq7k3gnheg7vit@alap3.anarazel.de>
    • Fix typo in 9.5 release notes · d7ef3572
      Alvaro Herrera authored
      Noted by 星合 拓馬 (HOSHIAI Takuma)
    • Make pg_dump behave more sanely when built without HAVE_LIBZ. · cae2bb19
      Tom Lane authored
      For some reason the code to emit a warning and switch to uncompressed
      output was placed down in the guts of pg_backup_archiver.c.  This is
      definitely too late in the case of parallel operation (and I rather
      wonder if it wasn't too late for other purposes as well).  Put it in
      pg_dump.c's option-processing logic, which seems a much saner place.
      
      Also, the default behavior with custom or directory output format was
      to emit the warning telling you the output would be uncompressed.  This
      seems unhelpful, so silence that case.
      
      Back-patch to 9.3 where parallel dump was introduced.
      
      Kyotaro Horiguchi, adjusted a bit by me
      
      Report: <20160526.185551.242041780.horiguchi.kyotaro@lab.ntt.co.jp>
    • In Windows pg_dump, ensure idle workers will shut down during error exit. · df8d2d8c
      Tom Lane authored
      The Windows coding of ShutdownWorkersHard() thought that setting termEvent
      was sufficient to make workers exit after an error.  But that only helps
      if a worker is busy and passes through checkAborting().  An idle worker
      will just sit, resulting in pg_dump failing to exit until the user gives up
      and hits control-C.  We should close the write end of the command pipe
      so that idle workers will see socket EOF and exit, as the Unix coding was
      already doing.
      
      Back-patch to 9.3 where parallel pg_dump was introduced.
      
      Kyotaro Horiguchi
  11. 25 May, 2016 3 commits
    • Remove option to write USING before opclass name in CREATE INDEX. · b898eb63
      Tom Lane authored
      Dating back to commit f10b6392, our grammar has allowed "USING" to
      optionally appear before an opclass name in CREATE INDEX (and, lately,
      some related places such as ON CONFLICT specifications).  Nikolay Shaplov
      noticed that this syntax existed but wasn't documented, and proposed
      documenting it.  But what seems like a better idea is to remove the
      production, thereby making the code match the docs not vice versa.
      This isn't our usual modus operandi for such cases, but there are a
      couple of good reasons to proceed this way:
      
      * So far as I can find, this syntax has never been documented anywhere.
      It isn't relied on by any of our own code or test cases, and there seems
      little reason to suppose that it's been used in the wild either.
      
      * Documenting it would mean that there would be two separate uses of
      USING in the CREATE INDEX syntax, the other being "USING access_method".
      That can lead to nothing but confusion.
      
      So, let's just remove it.  On the off chance that somebody somewhere
      is using it, this isn't something to back-patch, but we can fix it
      in HEAD.
      
      Discussion: <1593237.l7oKHRpxSe@nataraj-amd64>
    • Ensure that backends see up-to-date statistics for shared catalogs. · 52e8fc3e
      Tom Lane authored
      Ever since we split the statistics collector's reports into per-database
      files (commit 187492b6), backends have been seeing stale statistics
      for shared catalogs.  This is because the inquiry message only prompts the
      collector to write the per-database file for the requesting backend's own
      database.  Stats for shared catalogs are in a separate file for "DB 0",
      which didn't get updated.
      
      In normal operation this was partially masked by the fact that the
      autovacuum launcher would send an inquiry message at least once per
      autovacuum_naptime that asked for "DB 0"; so the shared-catalog stats would
      never be more than a minute out of date.  However the problem becomes very
      obvious with autovacuum disabled, as reported by Peter Eisentraut.
      
      To fix, redefine the semantics of inquiry messages so that both the
      specified DB and DB 0 will be dumped.  (This might seem a bit inefficient,
      but we have no good way to know whether a backend's transaction will look
      at shared-catalog stats, so we have to read both groups of stats whenever
      we request stats.  Sending two inquiry messages would definitely not be
      better.)
      
      Back-patch to 9.3 where the bug was introduced.
      
      Report: <56AD41AC.1030509@gmx.net>
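      The redefined inquiry semantics amount to expanding one requested database into the set of per-database stats files to write (a sketch; the names and OID value are illustrative, not the collector's actual code):

```c
#include <assert.h>

typedef unsigned int Oid;
#define InvalidOid 0   /* "DB 0": the shared-catalog stats file */

/* Expand an inquiry for one database into the list of stats files to
 * dump: the requested DB plus DB 0 for shared catalogs.  Returns the
 * number of entries written to out[]. */
static int dbs_to_write(Oid databaseid, Oid out[2])
{
    out[0] = InvalidOid;           /* shared catalogs, always */
    if (databaseid == InvalidOid)
        return 1;                  /* the request was for DB 0 itself */
    out[1] = databaseid;
    return 2;
}
```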
    • Fix broken error handling in parallel pg_dump/pg_restore. · 9abd64ec
      Tom Lane authored
      In the original design for parallel dump, worker processes reported errors
      by sending them up to the master process, which would print the messages.
      This is unworkably fragile for a couple of reasons: it risks deadlock if a
      worker sends an error at an unexpected time, and if the master has already
      died for some reason, the user will never get to see the error at all.
      Revert that idea and go back to just always printing messages to stderr.
      This approach means that if all the workers fail for similar reasons (eg,
      bad password or server shutdown), the user will see N copies of that
      message, not only one as before.  While that's slightly annoying, it's
      certainly better than not seeing any message; not to mention that we
      shouldn't assume that only the first failure is interesting.
      
      An additional problem in the same area was that the master failed to
      disable SIGPIPE (at least until much too late), which meant that sending a
      command to an already-dead worker would cause the master to crash silently.
      That was bad enough in itself but was made worse by the total reliance on
      the master to print errors: even if the worker had reported an error, you
      would probably not see it, depending on timing.  Instead disable SIGPIPE
      right after we've forked the workers, before attempting to send them
      anything.
      
      Additionally, the master relies on seeing socket EOF to realize that a
      worker has exited prematurely --- but on Windows, there would be no EOF
      since the socket is attached to the process that includes both the master
      and worker threads, so it remains open.  Make archive_close_connection()
      close the worker end of the sockets so that this acts more like the Unix
      case.  It's not perfect, because if a worker thread exits without going
      through exit_nicely() the closures won't happen; but that's not really
      supposed to happen.
      
      This has been wrong all along, so back-patch to 9.3 where parallel dump
      was introduced.
      
      Report: <2458.1450894615@sss.pgh.pa.us>
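      On the Unix side, the SIGPIPE part of the fix boils down to this (the function name is illustrative; the real code does this in the master right after forking the workers):

```c
#include <assert.h>
#include <signal.h>

/* Ignore SIGPIPE so that writing a command to an already-dead worker
 * fails visibly with EPIPE instead of silently killing the master. */
static void master_ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}
```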