1. 09 Feb, 2019 9 commits
    • Tom Lane's avatar
      Build out the planner support function infrastructure. · a391ff3c
      Tom Lane authored
      Add support function requests for estimating the selectivity, cost,
      and number of result rows (if a SRF) of the target function.
      
      The lack of a way to estimate selectivity of a boolean-returning
      function in WHERE has been a recognized deficiency of the planner
      since Berkeley days.  This commit finally fixes it.
      
      In addition, non-constant estimates of cost and number of output
      rows are now possible.  We still fall back to looking at procost
      and prorows if the support function doesn't service the request,
      of course.
      
      To make concrete use of the possibility of estimating output rowcount
      for SRFs, this commit adds support functions for array_unnest(anyarray)
      and the integer variants of generate_series; the lack of plausible
      rowcount estimates for those, even when it's obvious to a human,
      has been a repeated subject of complaints.  Obviously, much more
      could now be done in this line, but I'm mostly just trying to get
      the infrastructure in place.
      
      Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
      a391ff3c
    • Tom Lane's avatar
      Create the infrastructure for planner support functions. · 1fb57af9
      Tom Lane authored
      Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
      still that it names an internal function that provides knowledge to
      the planner about the behavior of the function it's attached to;
      but redesign the API specification so that it's not limited to doing
      just one thing, but can support an extensible set of requests.
      
      The original purpose of simplifying a function call is handled by
      the first request type to be invented, SupportRequestSimplify.
      Adjust all the existing transform functions to handle this API,
      and rename them fron "xxx_transform" to "xxx_support" to reflect
      the potential generalization of what they do.  (Since we never
      previously provided any way for extensions to add transform functions,
      this change doesn't create an API break for them.)
      
      Also add DDL and pg_dump support for attaching a support function to a
      user-defined function.  Unfortunately, DDL access has to be restricted
      to superusers, at least for now; but seeing that support functions
      will pretty much have to be written in C, that limitation is just
      theoretical.  (This support is untested in this patch, but a follow-on
      patch will add cases that exercise it.)
      
      Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
      1fb57af9
    • Tom Lane's avatar
      Refactor the representation of indexable clauses in IndexPaths. · 1a8d5afb
      Tom Lane authored
      In place of three separate but interrelated lists (indexclauses,
      indexquals, and indexqualcols), an IndexPath now has one list
      "indexclauses" of IndexClause nodes.  This holds basically the same
      information as before, but in a more useful format: in particular, there
      is now a clear connection between an indexclause (an original restriction
      clause from WHERE or JOIN/ON) and the indexquals (directly usable index
      conditions) derived from it.
      
      We also change the ground rules a bit by mandating that clause commutation,
      if needed, be done up-front so that what is stored in the indexquals list
      is always directly usable as an index condition.  This gets rid of repeated
      re-determination of which side of the clause is the indexkey during costing
      and plan generation, as well as repeated lookups of the commutator
      operator.  To minimize the added up-front cost, the typical case of
      commuting a plain OpExpr is handled by a new special-purpose function
      commute_restrictinfo().  For RowCompareExprs, generating the new clause
      properly commuted to begin with is not really any more complex than before,
      it's just different --- and we can save doing that work twice, as the
      pretty-klugy original implementation did.
      
      Tracking the connection between original and derived clauses lets us
      also track explicitly whether the derived clauses are an exact or lossy
      translation of the original.  This provides a cheap solution to getting
      rid of unnecessary rechecks of boolean index clauses, which previously
      seemed like it'd be more expensive than it was worth.
      
      Another pleasant (IMO) side-effect is that EXPLAIN now always shows
      index clauses with the indexkey on the left; this seems less confusing.
      
      This commit leaves expand_indexqual_conditions() and some related
      functions in a slightly messy state.  I didn't bother to change them
      any more than minimally necessary to work with the new data structure,
      because all that code is going to be refactored out of existence in
      a follow-on patch.
      
      Discussion: https://postgr.es/m/22182.1549124950@sss.pgh.pa.us
      1a8d5afb
    • Tom Lane's avatar
      Call set_rel_pathlist_hook before generate_gather_paths, not after. · 64015838
      Tom Lane authored
      The previous ordering of these steps satisfied the nominal requirement
      that set_rel_pathlist_hook could editorialize on the whole set of Paths
      constructed for a base relation.  In practice, though, trying to change
      the set of partial paths was impossible.  Adding one didn't work because
      (a) it was too late to be included in Gather paths made by the core code,
      and (b) calling add_partial_path after generate_gather_paths is unsafe,
      because it might try to delete a path it thinks is dominated, but that
      is already embedded in some Gather path(s).  Nor could the hook safely
      remove partial paths, for the same reason that they might already be
      embedded in Gathers.
      
      Better to call extensions first, let them add partial paths as desired,
      and then gather.  In v11 and up, we already doubled down on that ordering
      by postponing gathering even further for single-relation queries; so even
      if the hook wished to editorialize on Gather path construction, it could
      not.
      
      Report and patch by KaiGai Kohei.  Back-patch to 9.6 where Gather paths
      were added.
      
      Discussion: https://postgr.es/m/CAOP8fzahwpKJRTVVTqo2AE=mDTz_efVzV6Get_0=U3SO+-ha1A@mail.gmail.com
      64015838
    • Peter Eisentraut's avatar
      Use better comment marker in Autoconf input · 4446565d
      Peter Eisentraut authored
      The comment marker "#" is copied to the output, so it's only
      appropriate for comments that make sense in the shell output.  For
      comments about the Autoconf language, "dnl" should be used.
      4446565d
    • Andres Freund's avatar
      Reset, not recreate, execGrouping.c style hashtables. · 356687bd
      Andres Freund authored
      This uses the facility added in the preceding commit to fix
      performance issues caused by rebuilding the hashtable (with its
      comparator expression being the most expensive bit), after every
      reset. That's especially important when the comparator is JIT
      compiled.
      
      Bug: #15592 #15486
      Reported-By: Jakub Janeček, Dmitry Marakasov
      Author: Andres Freund
      Discussion:
          https://postgr.es/m/15486-05850f065da42931@postgresql.org
          https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
      Backpatch: 11, where I broke this in bf6c614a
      356687bd
    • Andres Freund's avatar
      Allow to reset execGrouping.c style tuple hashtables. · 317ffdfe
      Andres Freund authored
      This has the advantage that the comparator expression, the table's
      slot, etc do not have to be rebuilt. Additionally the simplehash.h
      hashtable within the tuple hashtable now keeps its previous size and
      doesn't need to be reallocated. That both reduces allocator overhead,
      and improves performance in cases where the input estimation was off
      by a significant factor.
      
      To avoid an API/ABI break, the new parameter is exposed via the new
      BuildTupleHashTableExt(), and BuildTupleHashTable() now is a wrapper
      around the former, that continues to allocate the table itself in the
      tablecxt.
      
      Using this fixes performance issues discovered in the two bugs
      referenced. This commit however has not converted the callers, that's
      done in a separate commit.
      
      Bug: #15592 #15486
      Reported-By: Jakub Janeček, Dmitry Marakasov
      Author: Andres Freund
      Discussion:
          https://postgr.es/m/15486-05850f065da42931@postgresql.org
          https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
      Backpatch: 11, this is a prerequisite for other fixes
      317ffdfe
    • Andres Freund's avatar
      simplehash: Add support for resetting a hashtable's contents. · 3b632a58
      Andres Freund authored
      A hashtable reset just reset the hashtable entries, but does not free
      memory.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
      Bug: #15592 #15486
      Backpatch: 11, this is a prerequisite for other fixes
      3b632a58
    • Andres Freund's avatar
      Plug leak in BuildTupleHashTable by creating ExprContext in correct context. · 5567d12c
      Andres Freund authored
      In bf6c614a I added a expr context to evaluate the grouping
      expression. Unfortunately the code I added initialized them while in
      the calling context, rather the table context.  Additionally, I used
      CreateExprContext() rather than CreateStandaloneExprContext(), which
      creates the econtext in the estate's query context.
      
      Fix that by using CreateStandaloneExprContext when in the table's
      tablecxt. As we rely on the memory being freed by a memory context
      reset that means that the econtext's shutdown callbacks aren't being
      called, but that seems ok as the expressions are tightly controlled
      due to ExecBuildGroupingEqual().
      
      Bug: #15592
      Reported-By: Dmitry Marakasov
      Author: Andres Freund
      Discussion: https://postgr.es/m/20190114222838.h6r3fuyxjxkykf6t@alap3.anarazel.de
      Backpatch: 11, where I broke this in bf6c614a
      5567d12c
  2. 08 Feb, 2019 4 commits
  3. 07 Feb, 2019 10 commits
  4. 06 Feb, 2019 7 commits
    • Peter Geoghegan's avatar
      Avoid amcheck inline compression false positives. · eba77534
      Peter Geoghegan authored
      The previous tacit assumption that index_form_tuple() hides differences
      in the TOAST state of its input datums was wrong.  Normalize input
      varlena datums by decompressing compressed values, and forming a new
      index tuple for fingerprinting using uncompressed inputs.  The final
      normalized representation may actually be compressed once again within
      index_form_tuple(), though that shouldn't matter.  When the original
      tuple is found to have no datums that are compressed inline, fingerprint
      the original tuple directly.
      
      Normalization avoids false positive reports of corruption in certain
      cases.  For example, the executor can apply toasting with some inline
      compression to an entire heap tuple because its input has a single
      external TOAST pointer.  Varlena datums for other attributes that are
      not particularly good candidates for inline compression can be
      compressed in the heap tuple in passing, without the representation of
      the same values in index tuples ever receiving concomitant inline
      compression.
      
      Add a test case to recreate the issue in a simpler though less realistic
      way: by exploiting differences in pg_attribute.attstorage between heap
      and index relations.
      
      This bug was discovered by me during testing of an upcoming set of nbtree
      enhancements.  It was also independently reported by Andreas Kunert, as
      bug #15597.  His test case was rather more realistic than the one I
      ended up using.
      
      Bug: #15597
      Discussion: https://postgr.es/m/CAH2-WznrVd9ie+TTJ45nDT+v2nUt6YJwQrT9SebCdQKtAvfPZw@mail.gmail.com
      Discussion: https://postgr.es/m/15597-294e5d3e7f01c407@postgresql.org
      Backpatch: 11-, where heapallindexed verification was introduced.
      eba77534
    • Peter Eisentraut's avatar
      Hide cascade messages in collate tests · 727921f4
      Peter Eisentraut authored
      These are not relevant to the tests and would just uselessly bloat
      patches.
      727921f4
    • Tom Lane's avatar
      Propagate lateral-reference information to indirect descendant relations. · bdd9a99a
      Tom Lane authored
      create_lateral_join_info() computes a bunch of information about lateral
      references between base relations, and then attempts to propagate those
      markings to appendrel children of the original base relations.  But the
      original coding neglected the possibility of indirect descendants
      (grandchildren etc).  During v11 development we noticed that this was
      wrong for partitioned-table cases, but failed to realize that it was just
      as wrong for any appendrel.  While the case can't arise for appendrels
      derived from traditional table inheritance (because we make a flat
      appendrel for that), nested appendrels can arise from nested UNION ALL
      subqueries.  Failure to mark the lower-level relations as having lateral
      references leads to confusion in add_paths_to_append_rel about whether
      unparameterized paths can be built.  It's not very clear whether that
      leads to any user-visible misbehavior; the lack of field reports suggests
      that it may cause nothing worse than minor cost misestimation.  Still,
      it's a bug, and it leads to failures of Asserts that I intend to add
      later.
      
      To fix, we need to propagate information from all appendrel parents,
      not just those that are RELOPT_BASERELs.  We can still do it in one
      pass, if we rely on the append_rel_list to be ordered with ancestor
      relationships before descendant ones; add assertions checking that.
      While fixing this, we can make a small performance improvement by
      traversing the append_rel_list just once instead of separately for
      each appendrel parent relation.
      
      Noted while investigating bug #15613, though this patch does not fix
      that (which is why I'm not committing the related Asserts yet).
      
      Discussion: https://postgr.es/m/3951.1549403812@sss.pgh.pa.us
      bdd9a99a
    • Andrew Dunstan's avatar
      Unify searchpath and do file logic in MSVC build scripts. · 592123ef
      Andrew Dunstan authored
      Commit f83419b7 failed to notice that mkvcbuild.pl and build.pl use
      different searchpath and do-file logic, breaking the latter, so it is
      adjusted to use the same logic as mkvcbuild.pl.
      592123ef
    • Andres Freund's avatar
      Fix heap_getattr() handling of fast defaults. · 171e0418
      Andres Freund authored
      Previously heap_getattr() returned NULL for attributes with a fast
      default value (c.f. 16828d5c), as it had no handling whatsoever
      for that case.
      
      A previous fix, 7636e5c6, attempted to fix issues caused by this
      oversight, but just expanding OLD tuples for triggers doesn't actually
      solve the underlying issue.
      
      One known consequence of this bug is that the check for HOT updates
      can return the wrong result, when a previously fast-default'ed column
      is set to NULL. Which in turn means that an index over a column with
      fast default'ed columns might be corrupt if the underlying column(s)
      allow NULLs.
      
      Fix by handling fast default columns in heap_getattr(), remove now
      superfluous expansion in GetTupleForTrigger().
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20190201162404.onngi77f26baem4g@alap3.anarazel.de
      Backpatch: 11, where fast defaults were introduced
      171e0418
    • Michael Paquier's avatar
      Tighten some regexes with proper character escaping in pg_dump TAP tests · d07fb681
      Michael Paquier authored
      Some tests have been using regular expressions which have been lax in
      escaping dots, which may cause tests to pass when they should not.  This
      make the whole set of tests more robust where needed.
      
      Author: David Rowley
      Reviewed-by: Daniel Gustafsson, Michael Paquier
      Discussion: https://postgr.es/m/CAKJS1f9jD8aVo1BTH+Vgwd=f-ynbuRVrS90XbWMT6UigaOQJTA@mail.gmail.com
      d07fb681
    • Andrew Dunstan's avatar
      Fix included file path for modern perl · f83419b7
      Andrew Dunstan authored
      Contrary to the comment on 772d4b76, only paths starting with "./" or
      "../" are considered relative to the current working directory by perl's
      "do" function. So this patch converts all the relevant cases to use "./"
      paths. This only affects MSVC.
      
      Backpatch to all live branches.
      f83419b7
  5. 05 Feb, 2019 5 commits
    • Andrew Dunstan's avatar
      Keep perl style checker happy · 8916b33e
      Andrew Dunstan authored
      It doesn't like code before "use strict;".
      8916b33e
    • Tom Lane's avatar
      Update time zone data files to tzdata release 2018i. · d63dc0aa
      Tom Lane authored
      DST law changes in Kazakhstan, Metlakatla, and São Tomé and Príncipe.
      Kazakhstan's Qyzylorda zone is split in two, creating a new zone
      Asia/Qostanay, as some areas did not change UTC offset.
      Historical corrections for Hong Kong and numerous Pacific islands.
      d63dc0aa
    • Andrew Dunstan's avatar
      Fix searchpath for modern Perl for genbki.pl · f884a968
      Andrew Dunstan authored
      This was fixed for MSVC tools by commit 1df92eea, but per
      buildfarm member bowerbird genbki.pl needs the same treatment.
      
      Backpatch to all live branches.
      f884a968
    • Tom Lane's avatar
      Remove unnecessary "inline" marker introduced in commit 4be058fe. · 24114e8b
      Tom Lane authored
      Some of our older buildfarm members bleat about this coding,
      along the lines of
      
      prepjointree.c:112: warning: 'get_result_relid' declared inline after being called
      prepjointree.c:112: warning: previous declaration of 'get_result_relid' was here
      
      Modern compilers will probably inline this function without being
      prompted, so rather than move the function, let's just drop the
      marking.
      24114e8b
    • Tom Lane's avatar
      Doc: in each release branch, keep only that branch's own release notes. · 527b5ed1
      Tom Lane authored
      Historically we've had each release branch include all prior branches'
      notes, including minor-release changes, back to the beginning of the
      project.  That's basically an O(N^2) proposition, and it was starting to
      catch up with us: as of HEAD the back-branch release notes alone accounted
      for nearly 30% of the documentation.  While there's certainly some value
      in easy access to back-branch notes, this is getting out of hand.
      
      Hence, switch over to the rule that each branch contains only its own
      release notes.  So as to not make older notes too hard to find, each
      branch will provide URLs for the immediately preceding branches'
      release notes on the project website.
      
      There might be value in providing aggregated notes across all branches
      somewhere on the website, but that's a task for another day.
      
      Discussion: https://postgr.es/m/cbd4aeb5-2d9c-8b84-e968-9e09393d4c83@postgresql.org
      527b5ed1
  6. 04 Feb, 2019 5 commits
    • Tom Lane's avatar
      Fix dumping of matviews with indirect dependencies on primary keys. · 6e4d45b5
      Tom Lane authored
      Commit 62215de2 turns out to have been not quite on-the-mark.
      When we are forced to postpone dumping of a materialized view into
      the dump's post-data section (because it depends on a unique index
      that isn't created till that section), we may also have to postpone
      dumping other matviews that depend on said matview.  The previous fix
      didn't reliably work for such cases: it'd break the dependency loops
      properly, producing a workable object ordering, but it didn't
      necessarily mark all the matviews as "postponed_def".  This led to
      harmless bleating about "archive items not in correct section order",
      as reported by Tom Cassidy in bug #15602.  Less harmlessly,
      selective-restore options such as --section might misbehave due to
      the matview dump objects not being properly labeled.
      
      The right way to fix it is to consider that each pre-data dependency
      we break amounts to moving the no-longer-dependent object into
      post-data, and hence we should mark that object if it's a matview.
      
      Back-patch to all supported versions, since the issue's been there
      since matviews were introduced.
      
      Discussion: https://postgr.es/m/15602-e895445f73dc450b@postgresql.org
      6e4d45b5
    • Peter Eisentraut's avatar
      Remove unused macro · f602cf49
      Peter Eisentraut authored
      Use was removed in 6d46f478 but
      definition was forgotten.
      f602cf49
    • Andrew Gierth's avatar
      Move port-specific parts of with_temp_install to port makefile. · 54f5f887
      Andrew Gierth authored
      Rather than define ld_library_path_ver with a big nested $(if), just
      put the overriding values in the makefiles for the relevant ports.
      
      Also add a variable for port makefiles to append their own stuff to
      with_temp_install, and use it to set LD_LIBRARY_PATH_RPATH=1 on
      FreeBSD which is needed to make LD_LIBRARY_PATH override DT_RPATH
      if DT_RUNPATH is not set (which seems to depend in unpredictable ways
      on the choice of compiler, at least on my system).
      
      Backpatch for the benefit of anyone doing regression tests on FreeBSD.
      (For other platforms there should be no functional change.)
      54f5f887
    • Amit Kapila's avatar
      Make FSM test portable. · 08ecdfe7
      Amit Kapila authored
      In b0eaa4c5, we allow FSM to be created only after 4 pages.  One of the
      tests check the FSM contents and to do that it populates many tuples in
      the relation.  The FSM contents depend on the availability of freespace in
      the page and it could vary because of the alignment of tuples.
      
      This commit removes the dependency on FSM contents.
      
      Author: Amit Kapila
      Discussion: https://postgr.es/m/CAA4eK1KADF6K1bagr0--mGv3dMcZ%3DH_Z-Qtvdfbp5PjaC6PJJA%40mail.gmail.com
      08ecdfe7
    • Amit Kapila's avatar
      Avoid creation of the free space map for small heap relations, take 2. · b0eaa4c5
      Amit Kapila authored
      Previously, all heaps had FSMs. For very small tables, this means that the
      FSM took up more space than the heap did. This is wasteful, so now we
      refrain from creating the FSM for heaps with 4 pages or fewer. If the last
      known target block has insufficient space, we still try to insert into some
      other page before giving up and extending the relation, since doing
      otherwise leads to table bloat. Testing showed that trying every page
      penalized performance slightly, so we compromise and try every other page.
      This way, we visit at most two pages. Any pages with wasted free space
      become visible at next relation extension, so we still control table bloat.
      As a bonus, directly attempting one or two pages can even be faster than
      consulting the FSM would have been.
      
      Once the FSM is created for a heap we don't remove it even if somebody
      deletes all the rows from the corresponding relation.  We don't think it is
      a useful optimization as it is quite likely that relation will again grow
      to the same size.
      
      Author: John Naylor, Amit Kapila
      Reviewed-by: Amit Kapila
      Tested-by: Mithun C Y
      Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
      b0eaa4c5