1. 27 Jun, 2016 3 commits
    • Teodor Sigaev's avatar
      Change predecence of phrase operator. · 6734a1ca
      Teodor Sigaev authored
      <-> operator now have higher predecence than & (AND) operator. This change
      was motivated by unexpected difference of similar queries:
      'a & b <-> c'::tsquery and 'b <-> c & a'. Before first query means
      (a & b) <-> c and second one - '(b <-> c) & a', now phrase operator evaluates
      first.
      
      Per suggestion from Tom Lane 32260.1465402409@sss.pgh.pa.us
      6734a1ca
    • Teodor Sigaev's avatar
      Do not fallback to AND for FTS phrase operator. · 3dbbd0f0
      Teodor Sigaev authored
      If there is no positional information of lexemes then phrase operator will not
      fallback to AND operator. This change makes needing to modify TS_execute()
      interface, because somewhere (in indexes, for example) positional information
      is unaccesible and in this cases we need to force fallback to AND.
      
      Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com
      3dbbd0f0
    • Teodor Sigaev's avatar
      Make exact distance match for FTS phrase operator · 028350f6
      Teodor Sigaev authored
      Phrase operator now requires exact distance betweens lexems instead of
      less-or-equal.
      
      Per discussion c19fcfec308e6ccd952cdde9e648b505@mail.gmail.com
      028350f6
  2. 26 Jun, 2016 3 commits
    • Tom Lane's avatar
      Avoid making a separate pass over the query to check for partializability. · f1993038
      Tom Lane authored
      It's rather silly to make a separate pass over the tlist + HAVING qual,
      and a separate set of visits to the syscache, when get_agg_clause_costs
      already has all the required information in hand.  This nets out as less
      code as well as fewer cycles.
      f1993038
    • Tom Lane's avatar
      Rethink node-level representation of partial-aggregation modes. · 19e972d5
      Tom Lane authored
      The original coding had three separate booleans representing partial
      aggregation behavior, which was confusing, unreadable, and error-prone,
      not least because the booleans weren't always listed in the same order.
      It was also inadequate for the allegedly-desirable future extension to
      support intermediate partial aggregation, because we'd need separate
      markers for serialization and deserialization in such a case.
      
      Merge these bools into an enum "AggSplit" to provide symbolic names for
      the supported operating modes (and document what those are).  By assigning
      the values of the enum constants carefully, we can treat AggSplit values
      as options bitmasks so that tests of what to do aren't noticeably more
      expensive than before.
      
      While at it, get rid of Aggref.aggoutputtype.  That's not needed since
      commit 59a3795c got rid of setrefs.c's special-purpose Aggref comparison
      code, and it likewise seemed more confusing than helpful.
      
      Assorted comment cleanup as well (there's still more that I want to do
      in that line).
      
      catversion bump for change in Aggref node contents.  Should be the last
      one for partial-aggregation changes.
      
      Discussion: <29309.1466699160@sss.pgh.pa.us>
      19e972d5
    • Tom Lane's avatar
      Simplify planner's final setup of Aggrefs for partial aggregation. · 59a3795c
      Tom Lane authored
      Commit e06a3896's original coding for constructing the execution-time
      expression tree for a combining aggregate was rather messy, involving
      duplicating quite a lot of code in setrefs.c so that it could inject
      a nonstandard matching rule for Aggrefs.  Get rid of that in favor of
      explicitly constructing a combining Aggref with a partial Aggref as input,
      then allowing setref's normal matching logic to match the partial Aggref
      to the output of the lower plan node and hence replace it with a Var.
      
      In passing, rename and redocument make_partialgroup_input_target to have
      some connection to what it actually does.
      59a3795c
  3. 24 Jun, 2016 5 commits
    • Alvaro Herrera's avatar
      Fix handling of multixacts predating pg_upgrade · e3ad3ffa
      Alvaro Herrera authored
      After pg_upgrade, it is possible that some tuples' Xmax have multixacts
      corresponding to the old installation; such multixacts cannot have
      running members anymore.  In many code sites we already know not to read
      them and clobber them silently, but at least when VACUUM tries to freeze
      a multixact or determine whether one needs freezing, there's an attempt
      to resolve it to its member transactions by calling GetMultiXactIdMembers,
      and if the multixact value is "in the future" with regards to the
      current valid multixact range, an error like this is raised:
          ERROR:  MultiXactId 123 has not been created yet -- apparent wraparound
      and vacuuming fails.  Per discussion with Andrew Gierth, it is completely
      bogus to try to resolve multixacts coming from before a pg_upgrade,
      regardless of where they stand with regards to the current valid
      multixact range.
      
      It's possible to get from under this problem by doing SELECT FOR UPDATE
      of the problem tuples, but if tables are large, this is slow and
      tedious, so a more thorough solution is desirable.
      
      To fix, we realize that multixacts in xmax created in 9.2 and previous
      have a specific bit pattern that is never used in 9.3 and later (we
      already knew this, per comments and infomask tests sprinkled in various
      places, but we weren't leveraging this knowledge appropriately).
      Whenever the infomask of the tuple matches that bit pattern, we just
      ignore the multixact completely as if Xmax wasn't set; or, in the case
      of tuple freezing, we act as if an unwanted value is set and clobber it
      without decoding.  This guarantees that no errors will be raised, and
      that the values will be progressively removed until all tables are
      clean.  Most callers of GetMultiXactIdMembers are patched to recognize
      directly that the value is a removable "empty" multixact and avoid
      calling GetMultiXactIdMembers altogether.
      
      To avoid changing the signature of GetMultiXactIdMembers() in back
      branches, we keep the "allow_old" boolean flag but rename it to
      "from_pgupgrade"; if the flag is true, we always return an empty set
      instead of looking up the multixact.  (I suppose we could remove the
      argument in the master branch, but I chose not to do so in this commit).
      
      This was broken all along, but the error-facing message appeared first
      because of commit 8e9a16ab8f7f and was partially fixed in a25c2b7c4db3.
      This fix, backpatched all the way back to 9.3, goes approximately in the
      same direction as a25c2b7c4db3 but should cover all cases.
      
      Bug analysis by Andrew Gierth and Álvaro Herrera.
      
      A number of public reports match this bug:
        https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net
        https://www.postgresql.org/message-id/538F3D70.6080902@publicrelay.com
        https://www.postgresql.org/message-id/556439CF.7070109@pscs.co.uk
        https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com
        https://www.postgresql.org/message-id/20160615203829.5798.4594@wrigleys.postgresql.org
      e3ad3ffa
    • Tom Lane's avatar
      Fix building of large (bigger than shared_buffers) hash indexes. · 8cf739de
      Tom Lane authored
      When the index is predicted to need more than NBuffers buckets,
      CREATE INDEX attempts to sort the index entries by hash key before
      insertion, so as to reduce thrashing.  This code path got broken by
      commit 9f03ca91, which overlooked that _hash_form_tuple() is not
      just an alias for index_form_tuple().  The index got built anyway, but
      with garbage data, so that searches for pre-existing tuples always
      failed.  Fix by refactoring to separate construction of the indexable
      data from calling index_form_tuple().
      
      Per bug #14210 from Daniel Newman.  Back-patch to 9.5 where the
      bug was introduced.
      
      Report: <20160623162507.17237.39471@wrigleys.postgresql.org>
      8cf739de
    • Robert Haas's avatar
      postgres_fdw: Fix incorrect NULL handling in join pushdown. · 9e9c38e1
      Robert Haas authored
      something.* IS NOT NULL means that every attribute of the row is not
      NULL, not that the row itself is non-NULL (e.g. because it's coming
      from below an outer join.  Use (somevar.*)::pg_catalog.text IS NOT
      NULL instead.
      
      Ashutosh Bapat, per a report by Rushabh Lathia.  Reviewed by
      Amit Langote and Etsuro Fujita.  Schema-qualification added by me.
      9e9c38e1
    • Robert Haas's avatar
      postgres_fdw: Remove useless return statement. · 267569b2
      Robert Haas authored
      Etsuro Fujita
      267569b2
    • Peter Eisentraut's avatar
      bd406af1
  4. 23 Jun, 2016 2 commits
    • Andrew Dunstan's avatar
      Add tab completion for pager_min_lines to psql. · 562e4497
      Andrew Dunstan authored
      This was inadvertantly omitted from commit
      7655f4cc. Mea culpa.
      
      Backpatched to 9.5 where pager_min_lines was introduced.
      562e4497
    • Tom Lane's avatar
      Fix small memory leak in partial-aggregate deserialization functions. · bd1693e8
      Tom Lane authored
      A deserialize function's result is short-lived data during partial
      aggregation, since we're just going to pass it to the combine function
      and then it's of no use anymore.  However, the built-in deserialize
      functions allocated their results in the aggregate state context,
      resulting in a query-lifespan memory leak.  It's probably not possible for
      this to amount to anything much at present, since the number of leaked
      results would only be the number of worker processes.  But it might become
      a problem in future.  To fix, don't use the same convenience subroutine for
      setting up results that the aggregate transition functions use.
      
      David Rowley
      
      Report: <10050.1466637736@sss.pgh.pa.us>
      bd1693e8
  5. 22 Jun, 2016 7 commits
    • Tom Lane's avatar
      Improve user-facing documentation for partial/parallel aggregation. · 2d673424
      Tom Lane authored
      Add a section to xaggr.sgml, as we have done in the past for other
      extensions to the aggregation functionality.  Assorted wordsmithing
      and other minor improvements.
      
      David Rowley and Tom Lane
      2d673424
    • Tom Lane's avatar
      Update oidjoins regression test for 9.6. · 63ae0523
      Tom Lane authored
      Looks like we had some more catalog drift recently.
      63ae0523
    • Tom Lane's avatar
      Fix type-safety problem with parallel aggregate serial/deserialization. · f8ace547
      Tom Lane authored
      The original specification for this called for the deserialization function
      to have signature "deserialize(serialtype) returns transtype", which is a
      security violation if transtype is INTERNAL (which it always would be in
      practice) and serialtype is not (which ditto).  The patch blithely overrode
      the opr_sanity check for that, which was sloppy-enough work in itself,
      but the indisputable reason this cannot be allowed to stand is that CREATE
      FUNCTION will reject such a signature and thus it'd be impossible for
      extensions to create parallelizable aggregates.
      
      The minimum fix to make the signature type-safe is to add a second, dummy
      argument of type INTERNAL.  But to lock it down a bit more and make misuse
      of INTERNAL-accepting functions less likely, let's get rid of the ability
      to specify a "serialtype" for an aggregate and just say that the only
      useful serialtype is BYTEA --- which, in practice, is the only interesting
      value anyway, due to the usefulness of the send/recv infrastructure for
      this purpose.  That means we only have to allow "serialize(internal)
      returns bytea" and "deserialize(bytea, internal) returns internal" as
      the signatures for these support functions.
      
      In passing fix bogus signature of int4_avg_combine, which I found thanks
      to adding an opr_sanity check on combinefunc signatures.
      
      catversion bump due to removing pg_aggregate.aggserialtype and adjusting
      signatures of assorted built-in functions.
      
      David Rowley and Tom Lane
      
      Discussion: <27247.1466185504@sss.pgh.pa.us>
      f8ace547
    • Tom Lane's avatar
      Make "postgres -C guc" print "" not "(null)" for null-valued GUCs. · e45e990e
      Tom Lane authored
      Commit 0b0baf26 et al made this case print "(null)" on the grounds that
      that's what happened on platforms that didn't crash.  But neither behavior
      was actually intentional.  What we should print is just an empty string,
      for compatibility with the behavior of SHOW and other ways of examining
      string GUCs.  Those code paths don't distinguish NULL from empty strings,
      so we should not here either.  Per gripe from Alain Radix.
      
      Like the previous patch, back-patch to 9.2 where -C option was introduced.
      
      Discussion: <CA+YdpwxPUADrmxSD7+Td=uOshMB1KkDN7G7cf+FGmNjjxMhjbw@mail.gmail.com>
      e45e990e
    • Peter Eisentraut's avatar
      Improve cleanup in rolenames test · 6a9c5181
      Peter Eisentraut authored
      Drop test_schema at the end, because that otherwise interferes with the
      collate.linux.utf8 test.
      6a9c5181
    • Bruce Momjian's avatar
      Update comment about allowing GUCs to change scanning. · 3e9df858
      Bruce Momjian authored
      Reported-by: David G. Johnston
      
      Discussion: CAKFQuwZZvnxwSq9tNtvL+uyuDKGgV91zR_agtPxQHRWMWQRP8g@mail.gmail.com
      3e9df858
    • Tom Lane's avatar
      Document that dependency tracking doesn't consider function bodies. · 34292107
      Tom Lane authored
      If there's anyplace in our SGML docs that explains this behavior, I can't
      find it right at the moment.  Add an explanation in "Dependency Tracking"
      which seems like the authoritative place for such a discussion.  Per
      gripe from Michelle Schwan.
      
      While at it, update this section's example of a dependency-related
      error message: they last looked like that in 8.3.  And remove the
      explanation of dependency updates from pre-7.3 installations, which
      is probably no longer worth anybody's brain cells to read.
      
      The bogus error message example seems like an actual documentation bug,
      so back-patch to all supported branches.
      
      Discussion: <20160620160047.5792.49827@wrigleys.postgresql.org>
      34292107
  6. 21 Jun, 2016 1 commit
    • Tom Lane's avatar
      Refactor planning of projection steps that don't need a Result plan node. · 8b9d323c
      Tom Lane authored
      The original upper-planner-pathification design (commit 3fc6e2d7)
      assumed that we could always determine during Path formation whether or not
      we would need a Result plan node to perform projection of a targetlist.
      That turns out not to work very well, though, because createplan.c still
      has some responsibilities for choosing the specific target list associated
      with sorting/grouping nodes (in particular it might choose to add resjunk
      columns for sorting).  We might not ever refactor that --- doing so would
      push more work into Path formation, which isn't attractive --- and we
      certainly won't do so for 9.6.  So, while create_projection_path and
      apply_projection_to_path can tell for sure what will happen if the subpath
      is projection-capable, they can't tell for sure when it isn't.  This is at
      least a latent bug in apply_projection_to_path, which might think it can
      apply a target to a non-projecting node when the node will end up computing
      something different.
      
      Also, I'd tied the creation of a ProjectionPath node to whether or not a
      Result is needed, but it turns out that we sometimes need a ProjectionPath
      node anyway to avoid modifying a possibly-shared subpath node.  Callers had
      to use create_projection_path for such cases, and we added code to them
      that knew about the potential omission of a Result node and attempted to
      adjust the cost estimates for that.  That was uncertainly correct and
      definitely ugly/unmaintainable.
      
      To fix, have create_projection_path explicitly check whether a Result
      is needed and adjust its cost estimate accordingly, though it creates
      a ProjectionPath in either case.  apply_projection_to_path is now mostly
      just an optimized version that can avoid creating an extra Path node when
      the input is known to not be shared with any other live path.  (There
      is one case that create_projection_path doesn't handle, which is pushing
      parallel-safe expressions below a Gather node.  We could make it do that
      by duplicating the GatherPath, but there seems no need as yet.)
      
      create_projection_plan still has to recheck the tlist-match condition,
      which means that if the matching situation does get changed by createplan.c
      then we'll have made a slightly incorrect cost estimate.  But there seems
      no help for that in the near term, and I doubt it occurs often enough,
      let alone would change planning decisions often enough, to be worth
      stressing about.
      
      I added a "dummypp" field to ProjectionPath to track whether
      create_projection_path thinks a Result is needed.  This is not really
      necessary as-committed because create_projection_plan doesn't look at the
      flag; but it seems like a good idea to remember what we thought when
      forming the cost estimate, if only for debugging purposes.
      
      In passing, get rid of the target_parallel parameter added to
      apply_projection_to_path by commit 54f5c515.  I don't think that's a good
      idea because it involves callers in what should be an internal decision,
      and opens us up to missing optimization opportunities if callers think they
      don't need to provide a valid flag, as most don't.  For the moment, this
      just costs us an extra has_parallel_hazard call when planning a Gather.
      If that starts to look expensive, I think a better solution would be to
      teach PathTarget to carry/cache knowledge of parallel-safety of its
      contents.
      8b9d323c
  7. 20 Jun, 2016 7 commits
  8. 19 Jun, 2016 1 commit
    • Tom Lane's avatar
      Docs: improve description of psql's %R prompt escape sequence. · 705ad7f3
      Tom Lane authored
      Dilian Palauzov pointed out in bug #14201 that the docs failed to mention
      the possibility of %R producing '(' due to an unmatched parenthesis.
      
      He proposed just adding that in the same style as the other options were
      listed; but it seemed to me that the sentence was already nearly
      unintelligible, so I rewrote it a bit more extensively.
      
      Report: <20160619121113.5789.68274@wrigleys.postgresql.org>
      705ad7f3
  9. 18 Jun, 2016 6 commits
    • Tom Lane's avatar
      Improve error message annotation for GRANT/REVOKE on untrusted PLs. · 9bc33323
      Tom Lane authored
      The annotation for "ERROR: language "foo" is not trusted" used to say
      "HINT: Only superusers can use untrusted languages", which was fairly
      poorly thought out.  For one thing, it's not a hint about what to do,
      but a statement of fact, which makes it errdetail.  But also, this
      fails to clarify things much, because there's a missing step in the
      chain of reasoning.  I think it's more useful to say "GRANT and REVOKE
      are not allowed on untrusted languages, because only superusers can use
      untrusted languages".
      
      It's been like this for a long time, but given the lack of previous
      complaints, I don't think this is worth back-patching.
      
      Discussion: <1417.1466289901@sss.pgh.pa.us>
      9bc33323
    • Tom Lane's avatar
      Update 9.6 release notes through today. · a3f42e85
      Tom Lane authored
      a3f42e85
    • Tom Lane's avatar
      Restore foreign-key-aware estimation of join relation sizes. · 100340e2
      Tom Lane authored
      This patch provides a new implementation of the logic added by commit
      137805f8 and later removed by 77ba6108.  It differs from the original
      primarily in expending much less effort per joinrel in large queries,
      which it accomplishes by doing most of the matching work once per query not
      once per joinrel.  Hopefully, it's also less buggy and better commented.
      The never-documented enable_fkey_estimates GUC remains gone.
      
      There remains work to be done to make the selectivity estimates account
      for nulls in FK referencing columns; but that was true of the original
      patch as well.  We may be able to address this point later in beta.
      In the meantime, any error should be in the direction of overestimating
      rather than underestimating joinrel sizes, which seems like the direction
      we want to err in.
      
      Tomas Vondra and Tom Lane
      
      Discussion: <31041.1465069446@sss.pgh.pa.us>
      100340e2
    • Tom Lane's avatar
      Still another try at fixing scanjoin_target insertion into parallel plans. · 598aa194
      Tom Lane authored
      The previous code neglected the fact that the scanjoin_target might
      carry sortgroupref labelings that we need to absorb.  Instead, do
      create_projection_path() unconditionally, and tweak the path's cost
      estimate after the fact.  (I'm now convinced that we ought to refactor
      the way we account for sometimes not needing a separate projection step,
      but right now is not the time for that sort of cleanup.)
      
      Problem identified by Amit Kapila, patch by me.
      598aa194
    • Tom Lane's avatar
      Fix parallel-safety markings for contrib/dblink. · 7e81a18d
      Tom Lane authored
      As shown by buildfarm reports, dblink_build_sql_insert and
      dblink_build_sql_update are *not* parallel safe, because they
      may attempt to access temporary tables of the local session.
      
      Although dblink_build_sql_delete doesn't actually touch the
      contents of the referenced table, it seems consistent and prudent
      to mark it PARALLEL RESTRICTED too.
      7e81a18d
    • Tom Lane's avatar
      Fix handling of argument and result datatypes for partial aggregation. · 915b703e
      Tom Lane authored
      When doing partial aggregation, the args list of the upper (combining)
      Aggref node is replaced by a Var representing the output of the partial
      aggregation steps, which has either the aggregate's transition data type
      or a serialized representation of that.  However, nodeAgg.c blindly
      continued to use the args list as an indication of the user-level argument
      types.  This broke resolution of polymorphic transition datatypes at
      executor startup (though it accidentally failed to fail for the ANYARRAY
      case, which is likely the only one anyone had tested).  Moreover, the
      constructed FuncExpr passed to the finalfunc contained completely wrong
      information, which would have led to bogus answers or crashes for any case
      where the finalfunc examined that information (which is only likely to be
      with polymorphic aggregates using a non-polymorphic transition type).
      
      As an independent bug, apply_partialaggref_adjustment neglected to resolve
      a polymorphic transition datatype before assigning it as the output type
      of the lower-level Aggref node.  This again accidentally failed to fail
      for ANYARRAY but would be unlikely to work in other cases.
      
      To fix the first problem, record the user-level argument types in a
      separate OID-list field of Aggref, and look to that rather than the args
      list when asking what the argument types were.  (It turns out to be
      convenient to include any "direct" arguments in this list too, although
      those are not currently subject to being overwritten.)
      
      Rather than adding yet another resolve_aggregate_transtype() call to fix
      the second problem, add an aggtranstype field to Aggref, and store the
      resolved transition type OID there when the planner first computes it.
      (By doing this in the planner and not the parser, we can allow the
      aggregate's transition type to change from time to time, although no DDL
      support yet exists for that.)  This saves nothing of consequence for
      simple non-polymorphic aggregates, but for polymorphic transition types
      we save a catalog lookup during executor startup as well as several
      planner lookups that are new in 9.6 due to parallel query planning.
      
      In passing, fix an error that was introduced into count_agg_clauses_walker
      some time ago: it was applying exprTypmod() to something that wasn't an
      expression node at all, but a TargetEntry.  exprTypmod silently returned
      -1 so that there was not an obvious failure, but this broke the intended
      sensitivity of aggregate space consumption estimates to the typmod of
      varchar and similar data types.  This part needs to be back-patched.
      
      Catversion bump due to change of stored Aggref nodes.
      
      Discussion: <8229.1466109074@sss.pgh.pa.us>
      915b703e
  10. 17 Jun, 2016 5 commits
    • Tom Lane's avatar
      Docs typo fix. · d30d1acf
      Tom Lane authored
      Guillaume Lelarge
      d30d1acf
    • Alvaro Herrera's avatar
      Finish up XLOG_HINT renaming · 1ca4a1b5
      Alvaro Herrera authored
      Commit b8fd1a09 renamed XLOG_HINT to XLOG_FPI, but neglected two
      places.
      
      Backpatch to 9.3, like that commit.
      1ca4a1b5
    • Robert Haas's avatar
      pg_visibility: Add pg_truncate_visibility_map function. · 71d05a2c
      Robert Haas authored
      This requires some core changes as well so that we can properly
      WAL-log the truncation.  Specifically, it changes the format of the
      XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC.
      
      Patch by me, reviewed but not fully endorsed by Andres Freund.
      71d05a2c
    • Robert Haas's avatar
      Try again to fix the way the scanjoin_target is used with partial paths. · 54f5c515
      Robert Haas authored
      Commit 04ae11f6 removed some broken
      code to apply the scan/join target to partial paths, but its theory
      that this processing step is totally unnecessary turns out to be wrong.
      Put similar code back again, but this time, check for parallel-safety
      and avoid in-place modifications to paths that may already have been
      used as part of some other path.
      
      (This is not an entirely elegant solution to this problem; it might
      be better, for example, to postpone generate_gather_paths for the
      topmost scan/join rel until after the scan/join target has been
      applied.  But this is not the time for such redesign work.)
      
      Amit Kapila and Robert Haas
      54f5c515
    • Robert Haas's avatar
      Add VACUUM (DISABLE_PAGE_SKIPPING) for emergencies. · ede62e56
      Robert Haas authored
      If you really want to vacuum every single page in the relation,
      regardless of apparent visibility status or anything else, you can use
      this option.  In previous releases, this behavior could be achieved
      using VACUUM (FREEZE), but because we can now recognize all-frozen
      pages as not needing to be frozen again, that no longer works.  There
      should be no need for routine use of this option, but maybe bugs or
      disaster recovery will necessitate its use.
      
      Patch by me, reviewed by Andres Freund.
      ede62e56