1. 17 Dec, 2019 5 commits
    • Amit Kapila's avatar
      Change overly strict Assert in TransactionGroupUpdateXidStatus. · af3290f5
      Amit Kapila authored
      This Assert thought that an overflowed transaction can never get registered
      for the group update.  But that is not true, because even when the number
      of children for a transaction got reduced, the overflow flag is not
      changed.  And, for group update, we only care about the current number of
      children for a transaction that is being committed.
      
      Based on comments by Andres Freund, remove a redundant Assert in
      TransactionIdSetPageStatus as we already had a static Assert for the same
      condition a few lines earlier.
      
      Reported-by: Vignesh C
      Author: Dilip Kumar
      Reviewed-by: Amit Kapila
      Backpatch-through: 11
      Discussion: https://postgr.es/m/CAFiTN-s5=uJw-Z6JC9gcqtBSjXsrHnU63PXBrA=pnBjqnkm5UA@mail.gmail.com
      af3290f5
    • Peter Geoghegan's avatar
      Rename nbtree tuple macros. · fcf3b691
      Peter Geoghegan authored
      Rename two function-style macros, removing the word "inner".  This makes
      things more consistent.
      fcf3b691
    • Michael Paquier's avatar
      Fix query cancellation handling in psql · 5d43c3c5
      Michael Paquier authored
      The refactoring done in a4fd3aa7 for query cancellation has messed up
      with the logic in psql by mixing CancelRequested and cancel_pressed,
      breaking for example \watch.  The former would be switched to true if a
      cancellation request has been attempted and that it actually succeeded,
      and the latter tracks if a cancellation attempt has been done.
      
      This commit brings back the code of psql to a state consistent to what
      it was before a4fd3aa7, without giving up on the refactoring pieces
      introduced.  It should be actually possible to merge more both flags as
      their concepts are close enough, however note that psql's --single-step
      mode relies on cancel_pressed to be always set, so this requires more
      careful analysis left for later.
      
      While on it, fix the declarations of CancelRequested (in cancel.c) and
      cancel_pressed (in psql) to be volatile sig_atomic_t.  Previously,
      both were declared as booleans, which should be fine on modern
      platforms, but the C standard recommends the use of sig_atomic_t for
      variables used in signal handlers.  Note that since its introduction in
      a1792320, CancelRequested declaration was not volatile.
      
      Reported-by: Jeff Janes
      Author: Michael Paquier
      Discussion: https://postgr.es/m/CAMkU=1zpoUDGKqWKuMWkj7t-bOCaJDx0r=5te_-d0B2HVLABXg@mail.gmail.com
      5d43c3c5
    • Tom Lane's avatar
      Fix "force_parallel_mode = regress" to work with ANALYZE + VERBOSE. · b925a00f
      Tom Lane authored
      force_parallel_mode = regress is supposed to force use of a Gather
      node without having any impact on EXPLAIN output.  But it failed to
      accomplish that if both ANALYZE and VERBOSE are given, because that
      enables per-worker output data that you wouldn't see if the Gather
      hadn't been inserted.  Improve the logic so that we suppress the
      per-worker data too.
      
      This allows putting the new test case added by commit 5935917c
      back into the originally intended form (cf. 776a2c88, 22864f6e).
      We can also get rid of a kluge in subselect.sql, which previously
      had to clean up after force_parallel_mode's failure to do what it
      said on the tin.
      
      Discussion: https://postgr.es/m/18445.1576177309@sss.pgh.pa.us
      b925a00f
    • Peter Geoghegan's avatar
      Update nbtree README's "Scans during Recovery". · 9067b839
      Peter Geoghegan authored
      get_actual_variable_range() hasn't used a dirty snapshot since commit
      3ca930fc, which invented a new snapshot type specifically to meet
      selfuncs.c's requirements (HeapTupleSatisfiesNonVacuumable() type
      snapshots were added).
      
      Discussion: https://postgr.es/m/CAH2-Wzn2pSqEOcBDAA40CnO82oEy-EOpE2bNh_XL_cfFoA86jw@mail.gmail.com
      9067b839
  2. 16 Dec, 2019 6 commits
  3. 15 Dec, 2019 1 commit
  4. 14 Dec, 2019 5 commits
    • Tom Lane's avatar
      Try to stabilize results of new tuplesort regression test. · baa32ce2
      Tom Lane authored
      It appears that a concurrent autovacuum/autoanalyze run can cause
      changes in the plans expected by this test.  To prevent that, change
      the tables it uses to be temp tables --- there's no need for them
      to be permanent, and this should save a few cycles too.
      
      Discussion: https://postgr.es/m/3244.1576160824@sss.pgh.pa.us
      baa32ce2
    • Tom Lane's avatar
      Prevent overly-aggressive collapsing of joins to RTE_RESULT relations. · 6ea364e7
      Tom Lane authored
      The RTE_RESULT simplification logic added by commit 4be058fe had a
      flaw: it would collapse out a RTE_RESULT that is due to compute a
      PlaceHolderVar, and reassign the PHV to the parent join level, even if
      another input relation of the join contained a lateral reference to
      the PHV.  That can't work because the PHV would be computed too late.
      In practice it led to failures of internal sanity checks later in
      planning (either assertion failures or errors such as "failed to
      construct the join relation").
      
      To fix, add code to check for the presence of such PHVs in relevant
      portions of the query tree.  Notably, this required refactoring
      range_table_walker so that a caller could ask to walk individual RTEs
      not the whole list.  (It might be a good idea to refactor
      range_table_mutator in the same way, if only to keep those functions
      looking similar; but I didn't do so here as it wasn't necessary for
      the bug fix.)
      
      This exercise also taught me that find_dependent_phvs(), as it stood,
      could only safely be used on the entire Query, not on subtrees.
      Adjust its API to reflect that; which in passing allows it to have
      a fast path for the common case of no PHVs anywhere.
      
      Per report from Will Leinweber.  Back-patch to v12 where the bug
      was introduced.
      
      Discussion: https://postgr.es/m/CALLb-4xJMd4GZt2YCecMC95H-PafuWNKcmps4HLRx2NHNBfB4g@mail.gmail.com
      6ea364e7
    • Michael Paquier's avatar
      Fix memory leak when initializing DH parameters in backend · e0e569e1
      Michael Paquier authored
      When loading DH parameters used for the generation of ephemeral DH keys
      in the backend, the code has never bothered releasing the memory used
      for the DH information loaded from a file or from libpq's default.  This
      commit makes sure that the information is properly free()'d.
      
      Note that as SSL parameters can be reloaded, this can cause an accumulation
      of memory leaked.  As the leak is minor, no backpatch is done.
      
      Reported-by: Dmitry Uspenskiy
      Discussion: https://postgr.es/m/16160-18367e56e9a28264@postgresql.org
      e0e569e1
    • Thomas Munro's avatar
      Fix mdsyncfiletag(), take II. · 7c85be08
      Thomas Munro authored
      The previous commit failed to consider that FileGetRawDesc() might
      not return a valid fd, as discovered on the build farm.  Switch to
      using the File interface only.
      
      Back-patch to 12, like the previous commit.
      7c85be08
    • Thomas Munro's avatar
      Don't use _mdfd_getseg() in mdsyncfiletag(). · 7bb3102c
      Thomas Munro authored
      _mdfd_getseg() opens all segments up to the requested one.  That
      causes problems for mdsyncfiletag(), if mdunlinkfork() has
      already unlinked other segment files.  Open the file we want
      directly by name instead, if it's not already open.
      
      The consequence of this bug was a rare panic in the checkpointer,
      made more likely if you saturated the sync request queue so that
      the SYNC_FORGET_REQUEST messages for a given relation were more
      likely to be absorbed in separate cycles by the checkpointer.
      
      Back-patch to 12.  Defect in commit 3eb77eba.
      
      Author: Thomas Munro
      Reported-by: Justin Pryzby
      Discussion: https://postgr.es/m/20191119115759.GI30362%40telsasoft.com
      7bb3102c
  5. 13 Dec, 2019 2 commits
    • Heikki Linnakangas's avatar
      Fix crash when a page was split during GiST index creation. · a7ee7c85
      Heikki Linnakangas authored
      The bug was similar to the one that was fixed in commit 22251686. When
      we split page X and insert the downlink for the new page, the parent page
      might also need to be split. When that happens, the downlink offset number
      we remembered for X is no longer valid. We correctly called
      gistFindCorrectParent() to re-find it, but gistFindCorrectParent() doesn't
      do anything if the LSN of the page hasn't changed, and we stopped updating
      LSNs during index build in commit 9155580f. The buggy codepath was taken
      if the page was split into three or more pages, and inserting the downlink
      caused the parent page to split. To fix, explicitly mark the downlink
      offset number as invalid, to force gistFindCorrectParent() to re-find it.
      
      Fixes bug #16134 reported by Alexander Lakhin, reported again as #16162 by
      Andreas Kunert. Thanks to Jeff Janes, Tom Lane and Tomas Vondra for
      debugging. Backpatch to v12, where we stopped WAL-logging during index
      build.
      
      Discussion: https://www.postgresql.org/message-id/16134-0423f729671dec64%40postgresql.org
      Discussion: https://www.postgresql.org/message-id/16162-45d21b7b6c1a3105%40postgresql.org
      a7ee7c85
    • Tom Lane's avatar
      Modernize our readline API a tad. · 5e7bedc5
      Tom Lane authored
      Prefer to call "rl_filename_completion_function" and
      "rl_completion_matches", rather than using the names without the rl_
      prefix.  This matches Readline's documentation, and makes our code
      a little clearer about which names are external.  On platforms that
      only have the un-prefixed names (just some very ancient versions of
      libedit, AFAICT), reverse the direction of the compatibility macro
      definitions to match.
      
      Also, remove our extern declaration of "filename_completion_function";
      whatever libedit versions may have failed to declare that are surely
      dead and buried.
      
      Discussion: https://postgr.es/m/23608.1576248145@sss.pgh.pa.us
      5e7bedc5
  6. 12 Dec, 2019 6 commits
  7. 11 Dec, 2019 8 commits
    • Tom Lane's avatar
      Remove unstable test case added in commit 5935917c. · 776a2c88
      Tom Lane authored
      The buildfarm says this produces some unexpected output with
      force_parallel_mode = regress.  There's probably a bug underneath
      that, but for the moment just delete the test case to make the
      buildfarm green again.
      
      (I now notice that the case had also failed to get updated to follow
      commit d52eaa09, which made plan_cache_mode = force_generic_plan
      prevail throughout partition_prune.sql; it was thereby managing to
      break a later test.  When/if we put this back in, *don't* include the
      SET and RESET commands.)
      776a2c88
    • Tom Lane's avatar
      Allow executor startup pruning to prune all child nodes. · 5935917c
      Tom Lane authored
      Previously, if the startup pruning logic proved that all child nodes
      of an Append or MergeAppend could be pruned, we still kept one, just
      to keep EXPLAIN from failing.  The previous commit removed the
      ruleutils.c limitation that required this kluge, so drop it.  That
      results in less-confusing EXPLAIN output, as per a complaint from
      Yuzuko Hosoya.
      
      David Rowley
      
      Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
      5935917c
    • Tom Lane's avatar
      Further adjust EXPLAIN's choices of table alias names. · 6ef77cf4
      Tom Lane authored
      This patch causes EXPLAIN to always assign a separate table alias to the
      parent RTE of an append relation (inheritance set); before, such RTEs
      were ignored if not actually scanned by the plan.  Since the child RTEs
      now always have that same alias to start with (cf. commit 55a1954d),
      the net effect is that the parent RTE usually gets the alias used or
      implied by the query text, and the children all get that alias with "_N"
      appended.  (The exception to "usually" is if there are duplicate aliases
      in different subtrees of the original query; then some of those original
      RTEs will also have "_N" appended.)
      
      This results in more uniform output for partitioned-table plans than
      we had before: the partitioned table itself gets the original alias,
      and all child tables have aliases with "_N", rather than the previous
      behavior where one of the children would get an alias without "_N".
      
      The reason for giving the parent RTE an alias, even if it isn't scanned
      by the plan, is that we now use the parent's alias to qualify Vars that
      refer to an appendrel output column and appear above the Append or
      MergeAppend that computes the appendrel.  But below the append, Vars
      refer to some one of the child relations, and are displayed that way.
      This seems clearer than the old behavior where a Var that could carry
      values from any child relation was displayed as if it referred to only
      one of them.
      
      While at it, change ruleutils.c so that the code paths used by EXPLAIN
      deal in Plan trees not PlanState trees.  This effectively reverts a
      decision made in commit 1cc29fe7, which seemed like a good idea at
      the time to make ruleutils.c consistent with explain.c.  However,
      it's problematic because we'd really like to allow executor startup
      pruning to remove all the children of an append node when possible,
      leaving no child PlanState to resolve Vars against.  (That's not done
      here, but will be in the next patch.)  This requires different handling
      of subplans and initplans than before, but is otherwise a pretty
      straightforward change.
      
      Discussion: https://postgr.es/m/001001d4f44b$2a2cca50$7e865ef0$@lab.ntt.co.jp
      6ef77cf4
    • Alvaro Herrera's avatar
      Emit parameter values during query bind/execute errors · ba79cb5d
      Alvaro Herrera authored
      This makes such log entries more useful, since the cause of the error
      can be dependent on the parameter values.
      
      Author: Alexey Bashtanov, Álvaro Herrera
      Discussion: https://postgr.es/m/0146a67b-a22a-0519-9082-bc29756b93a2@imap.cc
      Reviewed-by: Peter Eisentraut, Andres Freund, Tom Lane
      ba79cb5d
    • Tom Lane's avatar
      Use only one thread to handle incoming signals on Windows. · 16114f2e
      Tom Lane authored
      Since its inception, our Windows signal emulation code has worked by
      running a main signal thread that just watches for incoming signal
      requests, and then spawns a new thread to handle each such request.
      That design is meant for servers in which requests can take substantial
      effort to process, and it's worth parallelizing the handling of
      requests.  But those assumptions are just bogus for our signal code.
      It's not much more than pg_queue_signal(), which is cheap and can't
      parallelize at all, plus we don't really expect lots of signals to
      arrive at the same backend at once.  More importantly, this approach
      creates failure modes that we could do without: either inability to
      spawn a new thread or inability to create a new pipe handle will risk
      loss of signals.  Hence, dispense with the separate per-signal threads
      and just service each request in-line in the main signal thread.  This
      should be a bit faster (for the normal case of one signal at a time)
      as well as more robust.
      
      Patch by me; thanks to Andrew Dunstan for testing and Amit Kapila
      for review.
      
      Discussion: https://postgr.es/m/4412.1575748586@sss.pgh.pa.us
      16114f2e
    • Peter Eisentraut's avatar
      Remove ATPrepSetStatistics · 105eb360
      Peter Eisentraut authored
      It was once possible to do ALTER TABLE ... SET STATISTICS on system
      tables without allow_sytem_table_mods.  This was changed apparently by
      accident between PostgreSQL 9.1 and 9.2, but a code comment still
      claimed this was possible.  Without that functionality, having a
      separate ATPrepSetStatistics() is useless, so use the generic
      ATSimplePermissions() instead and move the remaining custom code into
      ATExecSetStatistics().
      Reviewed-by: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://www.postgresql.org/message-id/flat/cc8d2648-a0ec-7a86-13e5-db473484e19e%402ndquadrant.com
      105eb360
    • Peter Eisentraut's avatar
      Fix output of Unicode normalization test · b8024121
      Peter Eisentraut authored
      Several off-by-more-than-one errors caused the output in case of a
      test failure to be truncated and unintelligible.
      Reviewed-by: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://www.postgresql.org/message-id/flat/6a7a8516-7d11-8fbd-0e8b-eadb4f0679eb%402ndquadrant.com
      b8024121
    • Michael Paquier's avatar
      Fix some compiler warnings with timestamp parsing in formatting.c · c341c7d3
      Michael Paquier authored
      gcc-7 used with a sufficient optimization level complains about warnings
      around do_to_timestamp() regarding the initialization and handling of
      some of its variables.  Recent commits 66c74f8b and d589f944 made things
      made the interface more confusing, so document which variables are
      always expected and initialize properly the optional ones when they are
      set.
      
      Author: Andrey Lepikhov, Michael Paquier
      Discussion: https://postgr.es/m/a7e28b83-27b1-4e1c-c76b-4268c4b785bc@postgrespro.ru
      c341c7d3
  8. 10 Dec, 2019 5 commits
    • Tom Lane's avatar
      Fix tuple column count in pg_control_init(). · 8729fa72
      Tom Lane authored
      Oversight in commit 2e4db241.
      
      Nathan Bossart
      
      Discussion: https://postgr.es/m/1B616360-396A-4482-AA28-375566C86160@amazon.com
      8729fa72
    • Peter Eisentraut's avatar
      Cosmetic cleaning of pg_config.h.win32 · 877b61e9
      Peter Eisentraut authored
      Clean up some comments (some generated by old versions of autoconf)
      and some random ordering differences, so it's easier to diff this
      against the default pg_config.h or pg_config.h.in.  Remove LOCALEDIR
      handling from pg_config.h.win32 altogether because it's already in
      pg_config_paths.h.
      877b61e9
    • Alvaro Herrera's avatar
      Add backend-only appendStringInfoStringQuoted · 6cafde1b
      Alvaro Herrera authored
      This provides a mechanism to emit literal values in informative
      messages, such as query parameters.  The new code is more complex than
      what it replaces, primarily because it wants to be more efficient.
      It also has the (currently unused) additional optional capability of
      specifying a maximum size to print.
      
      The new function lives out of common/stringinfo.c so that frontend users
      of that file need not pull in unnecessary multibyte-encoding support
      code.
      
      Author: Álvaro Herrera and Alexey Bashtanov, after a suggestion from Andres Freund
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/20190920203905.xkv5udsd5dxfs6tr@alap3.anarazel.de
      6cafde1b
    • Tom Lane's avatar
      In pg_ctl, work around ERROR_SHARING_VIOLATION on the postmaster log file. · 0da33c76
      Tom Lane authored
      On Windows, we use CMD.EXE to redirect the postmaster's stdout/stderr
      into a log file.  CMD.EXE will open that file with non-sharing-friendly
      parameters, and the file will remain open for a short time after the
      postmaster has removed postmaster.pid.  This can result in an
      ERROR_SHARING_VIOLATION failure if we attempt to start a new postmaster
      immediately with the same log file (e.g. during "pg_ctl restart").
      This seems to explain intermittent buildfarm failures we've been seeing
      on Windows machines.
      
      To fix, just open and close the log file using our own pgwin32_open(),
      which will wait if necessary to avoid the failure.  (Perhaps someday
      we should stop using CMD.EXE, but that would be a far more complex
      patch, and it doesn't seem worth the trouble ... yet.)
      
      Back-patch to v12.  This only solves the problem when frontend fopen()
      is redirected to pgwin32_fopen(), which has only been true since commit
      0ba06e0b.  Hence, no point in back-patching further, unless we care
      to back-patch that change too.
      
      Diagnosis and patch by Alexander Lakhin (bug #16154).
      
      Discussion: https://postgr.es/m/16154-1ccf0b537b24d5e0@postgresql.org
      0da33c76
    • Etsuro Fujita's avatar
      Fix handling of multiple AFTER ROW triggers on a foreign table. · 5a20b021
      Etsuro Fujita authored
      AfterTriggerExecute() retrieves a fresh tuple or pair of tuples from a
      tuplestore and then stores the tuple(s) in the passed-in slot(s) if
      AFTER_TRIGGER_FDW_FETCH, while it uses the most-recently-retrieved
      tuple(s) stored in the slot(s) if AFTER_TRIGGER_FDW_REUSE.  This was
      done correctly before 12, but commit ff11e7f4 broke it by mistakenly
      clearing the tuple(s) stored in the slot(s) in that function, leading to
      an assertion failure as reported in bug #16139 from Alexander Lakhin.
      
      Also, fix some other issues with the aforementioned commit in passing:
      
      * For tg_newslot, which is a slot added to the TriggerData struct by the
        commit to store new updated tuples, it didn't ensure the slot was NULL
        if there was no such tuple.
      * The commit failed to update the documentation about the trigger
        interface.
      
      Author: Etsuro Fujita
      Backpatch-through: 12
      Discussion: https://postgr.es/m/16139-94f9ccf0db6119ec%40postgresql.org
      5a20b021
  9. 09 Dec, 2019 2 commits
    • Tom Lane's avatar
      Fix race condition in our Windows signal emulation. · 28e6a2fd
      Tom Lane authored
      pg_signal_dispatch_thread() responded to the client (signal sender)
      and disconnected the pipe before actually setting the shared variables
      that make the signal visible to the backend process's main thread.
      In the worst case, it seems, effective delivery of the signal could be
      postponed for as long as the machine has any other work to do.
      
      To fix, just move the pg_queue_signal() call so that we do it before
      responding to the client.  This essentially makes pgkill() synchronous,
      which is a stronger guarantee than we have on Unix.  That may be
      overkill, but on the other hand we have not seen comparable timing bugs
      on any Unix platform.
      
      While at it, add some comments to this sadly underdocumented code.
      
      Problem diagnosis and fix by Amit Kapila; I just added the comments.
      
      Back-patch to all supported versions, as it appears that this can cause
      visible NOTIFY timing oddities on all of them, and there might be
      other misbehavior due to slow delivery of other signals.
      
      Discussion: https://postgr.es/m/32745.1575303812@sss.pgh.pa.us
      28e6a2fd
    • Tom Lane's avatar
      Improve isolationtester's timeout management. · 99351a8b
      Tom Lane authored
      isolationtester.c had a hard-wired limit of 3 minutes per test step.
      It now emerges that this isn't quite enough for some of the slowest
      buildfarm animals.  This isn't the first time we've had to raise
      this limit (cf. 1db439ad), so let's make it configurable.  This
      patch raises the default to 5 minutes, and introduces an environment
      variable PGISOLATIONTIMEOUT that can be set if more time is needed,
      following the precedent of PGCTLTIMEOUT.
      
      Also, modify isolationtester so that when the timeout is hit,
      it explicitly reports having sent a cancel.  This makes the regression
      failure log considerably more intelligible.  (In the worst case, a
      timed-out test might actually be reported as "passing" without this
      extra output, so arguably this is a bug fix in itself.)
      
      In passing, update the README file, which had apparently not gotten
      touched when we added "make check" support here.
      
      Back-patch to 9.6; older versions don't have comparable timeout logic.
      
      Discussion: https://postgr.es/m/22964.1575842935@sss.pgh.pa.us
      99351a8b