1. 13 Jan, 2019 3 commits
  2. 11 Jan, 2019 6 commits
    • Andrew Dunstan's avatar
      Free pre-modification HeapTuple in ALTER TABLE ... TYPE ... · e33884d4
      Andrew Dunstan authored
      This was an oversight in commit 3b174b1a.
      
      Per offline gripe from Alvaro Herrera
      
      Backpatch to release 11.
      e33884d4
    • Tom Lane's avatar
      Avoid sharing PARAM_EXEC slots between different levels of NestLoop. · 1db5667b
      Tom Lane authored
      Up to now, createplan.c attempted to share PARAM_EXEC slots for
      NestLoopParams across different plan levels, if the same underlying Var
      was being fed down to different righthand-side subplan trees by different
      NestLoops.  This was, I think, more of an artifact of using subselect.c's
      PlannerParamItem infrastructure than an explicit design goal, but anyway
      that was the end result.
      
      This works well enough as long as the plan tree is executing synchronously,
      but the feature whereby Gather can execute the parallelized subplan locally
      breaks it.  An upper NestLoop node might execute for a row retrieved from
      a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
      while the leader's copy of the parallelized subplan is suspended with a
      different active value of the row the Var comes from.  When control
      eventually returns to the leader's subplan, it gets the wrong answers if
      the same PARAM_EXEC slot is being used within the subplan, as reported
      in bug #15577 from Bartosz Polnik.
      
      This is pretty reminiscent of the problem fixed in commit 46c508fb, and
      the proper fix seems to be the same: don't try to share PARAM_EXEC slots
      across different levels of controlling NestLoop nodes.
      
      This requires decoupling NestLoopParam handling from PlannerParamItem
      handling, although the logic remains somewhat similar.  To avoid bizarre
      division of labor between subselect.c and createplan.c, I decided to move
      all the param-slot-assignment logic for both cases out of those files
      and put it into a new file paramassign.c.  Hopefully it's a bit better
      documented now, too.
      
      A regression test case for this might be nice, but we don't know a
      test case that triggers the problem with a suitably small amount
      of data.
      
      Back-patch to 9.6 where we added Gather nodes.  It's conceivable that
      related problems exist in older branches; but without some evidence
      for that, I'll leave the older branches alone.
      
      Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
      1db5667b
    • Peter Eisentraut's avatar
      doc: Correct documentation of install-time environment variables · 8b89a886
      Peter Eisentraut authored
      Since approximately PostgreSQL 10, it is no longer required that
      environment variables at installation time such as PERL, PYTHON, TCLSH
      be "full path names", so change that phrasing in the installation
      instructions.  (The exact time of change appears to differ for PERL
      and the others, but it works consistently in PostgreSQL 10.)
      
      Also while we're here document the defaults for PERL and PYTHON, but
      since the search list for TCLSH is so long, let's leave that out so we
      don't need to maintain a copy of that list in the installation
      instructions.
      8b89a886
    • Peter Eisentraut's avatar
      Create INSTALL file using Pandoc · 96b8b8b6
      Peter Eisentraut authored
      Replace using lynx with using pandoc.  Pandoc creates better looking
      output and it avoids the delicate locale/encoding issues of lynx because
      it always uses UTF-8 for both input and output.
      
      Note: requires Pandoc >=1.13
      
      Discussion: https://www.postgresql.org/message-id/flat/dcfaa74d-8037-bb32-f9e0-3fea7ccf4551@2ndquadrant.com/Reviewed-by: default avatarMi Tar <mmitar@gmail.com>
      96b8b8b6
    • Peter Eisentraut's avatar
      Add value 'current' for recovery_target_timeline · ff853060
      Peter Eisentraut authored
      This value represents the default behavior of using the current
      timeline.  Previously, this was represented by an empty string.
      
      (Before the removal of recovery.conf, this setting could not be chosen
      explicitly but was used when recovery_target_timeline was not
      mentioned at all.)
      
      Discussion: https://www.postgresql.org/message-id/flat/6dd2c23a-4162-8469-410f-bfe146e28c0c@2ndquadrant.com/Reviewed-by: default avatarDavid Steele <david@pgmasters.net>
      Reviewed-by: default avatarMichael Paquier <michael@paquier.xyz>
      ff853060
    • Amit Kapila's avatar
      Extend pg_stat_statements_reset to reset statistics specific to a · 43cbedab
      Amit Kapila authored
      particular user/db/query.
      
      The function pg_stat_statements_reset() is extended to accept userid, dbid,
      and queryid as input parameters.  Now, it can discard the statistics
      gathered so far by pg_stat_statements corresponding to the specified
      userid, dbid, and queryid.  If no parameter is specified or all the
      specified parameters have default value aka 0, it will discard all
      statistics as per the old behavior.
      
      The new behavior is useful to get the fresh statistics for a specific
      user/database/query without resetting all the existing statistics.
      
      Author: Haribabu Kommi, with few additional changes by me
      Reviewed-by: Michael Paquier, Amit Kapila and Fujii Masao
      Discussion: https://postgr.es/m/CAJrrPGcyh-gkFswyc6C661K6cknL0XkNqVT0sQt2mFNMR4HRKA@mail.gmail.com
      43cbedab
  3. 10 Jan, 2019 10 commits
  4. 09 Jan, 2019 2 commits
    • Tom Lane's avatar
      Reduce the size of the fmgr_builtin_oid_index[] array. · 8ff5f824
      Tom Lane authored
      This index array was originally defined to have 10000 entries (ranging
      up to FirstGenbkiObjectId), but we really only need entries up to the
      last existing builtin function OID, currently 6121.  That saves close
      to 8K of never-accessed space in the server executable, at the small
      price of one more fetch in fmgr_isbuiltin().
      
      We could reduce the array size still further by renumbering a few of
      the highest-numbered builtin functions; but there's a small risk of
      breaking clients that have chosen to hardwire those function OIDs,
      so it's not clear if it'd be worth the trouble.  (We should, however,
      discourage future patches from choosing function OIDs above 6K as long
      as there's still lots of space below that.)
      
      Discussion: https://postgr.es/m/12359.1547063064@sss.pgh.pa.us
      8ff5f824
    • Tom Lane's avatar
      Update docs & tests to reflect that unassigned OLD/NEW are now NULL. · 59029b6f
      Tom Lane authored
      For a long time, plpgsql has allowed trigger functions to parse
      references to OLD and NEW even if the current trigger event type didn't
      assign a value to one or the other variable; but actually executing such
      a reference would fail.  The v11 changes to use "expanded records" for
      DTYPE_REC variables changed the behavior so that the unassigned variable
      now reads as a null composite value.  While this behavioral change was
      more or less unintentional, it seems that leaving it like this is better
      than adding code and complexity to be bug-compatible with the old way.
      The change doesn't break any code that worked before, and it eliminates
      a gotcha that often required extra code to work around.
      
      Hence, update the docs to say that these variables are "null" not
      "unassigned" when not relevant to the event type.  And add a regression
      test covering the behavior, so that we'll notice if we ever break it
      again.
      
      Per report from Kristjan Tammekivi.
      
      Discussion: https://postgr.es/m/CAABK7uL-uC9ZxKBXzo_68pKt7cECfNRv+c35CXZpjq6jCAzYYA@mail.gmail.com
      59029b6f
  5. 08 Jan, 2019 3 commits
  6. 07 Jan, 2019 4 commits
  7. 06 Jan, 2019 1 commit
    • Tom Lane's avatar
      Replace the data structure used for keyword lookup. · afb0d071
      Tom Lane authored
      Previously, ScanKeywordLookup was passed an array of string pointers.
      This had some performance deficiencies: the strings themselves might
      be scattered all over the place depending on the compiler (and some
      quick checking shows that at least with gcc-on-Linux, they indeed
      weren't reliably close together).  That led to very cache-unfriendly
      behavior as the binary search touched strings in many different pages.
      Also, depending on the platform, the string pointers might need to
      be adjusted at program start, so that they couldn't be simple constant
      data.  And the ScanKeyword struct had been designed with an eye to
      32-bit machines originally; on 64-bit it requires 16 bytes per
      keyword, making it even more cache-unfriendly.
      
      Redesign so that the keyword strings themselves are allocated
      consecutively (as part of one big char-string constant), thereby
      eliminating the touch-lots-of-unrelated-pages syndrome.  And get
      rid of the ScanKeyword array in favor of three separate arrays:
      uint16 offsets into the keyword array, uint16 token codes, and
      uint8 keyword categories.  That reduces the overhead per keyword
      to 5 bytes instead of 16 (even less in programs that only need
      one of the token codes and categories); moreover, the binary search
      only touches the offsets array, further reducing its cache footprint.
      This also lets us put the token codes somewhere else than the
      keyword strings are, which avoids some unpleasant build dependencies.
      
      While we're at it, wrap the data used by ScanKeywordLookup into
      a struct that can be treated as an opaque type by most callers.
      That doesn't change things much right now, but it will make it
      less painful to switch to a hash-based lookup method, as is being
      discussed in the mailing list thread.
      
      Most of the change here is associated with adding a generator
      script that can build the new data structure from the same
      list-of-PG_KEYWORD header representation we used before.
      The PG_KEYWORD lists that plpgsql and ecpg used to embed in
      their scanner .c files have to be moved into headers, and the
      Makefiles have to be taught to invoke the generator script.
      This work is also necessary if we're to consider hash-based lookup,
      since the generator script is what would be responsible for
      constructing a hash table.
      
      Aside from saving a few kilobytes in each program that includes
      the keyword table, this seems to speed up raw parsing (flex+bison)
      by a few percent.  So it's worth doing even as it stands, though
      we think we can gain even more with a follow-on patch to switch
      to hash-based lookup.
      
      John Naylor, with further hacking by me
      
      Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
      afb0d071
  8. 05 Jan, 2019 1 commit
    • Tom Lane's avatar
      Fix program build rule in src/bin/scripts/Makefile. · c5c7fa26
      Tom Lane authored
      Commit 69ae9dcb added a globally-visible "%: %.o" rule, but we failed
      to notice that src/bin/scripts/Makefile already had such a rule.
      Apparently, the later occurrence of the same rule wins in nearly all
      versions of gmake ... but not in the one used by buildfarm member jacana.
      jacana is evidently using the global rule, which says to link "$<",
      ie just the first dependency.  But the scripts makefile needs to
      link "$^", ie all the dependencies listed for the target.
      
      There is, fortunately, no good reason not to use "$^" in the global
      version of the rule, so we can just do that and get rid of the local
      version.
      c5c7fa26
  9. 04 Jan, 2019 6 commits
  10. 03 Jan, 2019 2 commits
    • Tom Lane's avatar
      Use symbolic references for pg_language OIDs in the bootstrap data. · 814c9019
      Tom Lane authored
      This patch teaches genbki.pl to replace pg_language names by OIDs
      in much the same way as it already does for pg_am names etc, and
      converts pg_proc.dat to use such symbolic references in the prolang
      column.
      
      Aside from getting rid of a few more magic numbers in the initial
      catalog data, this means that Gen_fmgrtab.pl no longer needs to read
      pg_language.dat, since it doesn't have to know the OID of the "internal"
      language; now it's just looking for the string "internal".
      
      No need for a catversion bump, since the contents of postgres.bki
      don't actually change at all.
      
      John Naylor
      
      Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
      814c9019
    • Tom Lane's avatar
      Improve ANALYZE's handling of concurrent-update scenarios. · 7170268e
      Tom Lane authored
      This patch changes the rule for whether or not a tuple seen by ANALYZE
      should be included in its sample.
      
      When we last touched this logic, in commit 51e1445f, we weren't
      thinking very hard about tuples being UPDATEd by a long-running
      concurrent transaction.  In such a case, we might see the pre-image as
      either LIVE or DELETE_IN_PROGRESS depending on timing; and we might see
      the post-image not at all, or as INSERT_IN_PROGRESS.  Since the existing
      code will not sample either DELETE_IN_PROGRESS or INSERT_IN_PROGRESS
      tuples, this leads to concurrently-updated rows being omitted from the
      sample entirely.  That's not very helpful, and it's especially the wrong
      thing if the concurrent transaction ends up rolling back.
      
      The right thing seems to be to sample DELETE_IN_PROGRESS rows just as if
      they were live.  This makes the "sample it" and "count it" decisions the
      same, which seems good for consistency.  It's clearly the right thing
      if the concurrent transaction ends up rolling back; in effect, we are
      sampling as though IN_PROGRESS transactions haven't happened yet.
      Also, this combination of choices ensures maximum robustness against
      the different combinations of whether and in which state we might see the
      pre- and post-images of an update.
      
      It's slightly annoying that we end up recording immediately-out-of-date
      stats in the case where the transaction does commit, but on the other
      hand the stats are fine for columns that didn't change in the update.
      And the alternative of sampling INSERT_IN_PROGRESS rows instead seems
      like a bad idea, because then the sampling would be inconsistent with
      the way rows are counted for the stats report.
      
      Per report from Mark Chambers; thanks to Jeff Janes for diagnosing
      what was happening.  Back-patch to all supported versions.
      
      Discussion: https://postgr.es/m/CAFh58O_Myr6G3tcH3gcGrF-=OExB08PJdWZcSBcEcovaiPsrHA@mail.gmail.com
      7170268e
  11. 02 Jan, 2019 2 commits