1. 26 Jan, 2020 1 commit
    • Heikki Linnakangas's avatar
      Refactor XLogReadRecord(), adding XLogBeginRead() function. · 38a95731
      Heikki Linnakangas authored
      The signature of XLogReadRecord() required the caller to pass the starting
      WAL position as argument, or InvalidXLogRecPtr to continue reading at the
      end of previous record. That's slightly awkward to the callers, as most
      of them don't want to randomly jump around in the WAL stream, but start
      reading at one position and then read everything from that point onwards.
      Remove the 'RecPtr' argument and add a new function XLogBeginRead() to
      specify the starting position instead. That's more convenient for the
      callers. Also, xlogreader holds state that is reset when you change the
      starting position, so having a separate function for doing that feels like
      a more natural fit.
      
      This changes XLogFindNextRecord() function so that it doesn't reset the
      xlogreader's state to what it was before the call anymore. Instead, it
      positions the xlogreader to the found record, like XLogBeginRead().
      
      Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera
      Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi
      38a95731
  2. 25 Jan, 2020 2 commits
    • Tom Lane's avatar
      Clean up EXPLAIN's handling of per-worker details. · 10013684
      Tom Lane authored
      Previously, it was possible for EXPLAIN ANALYZE of a parallel query
      to produce several different "Workers" fields for a single plan node,
      because different portions of explain.c independently generated
      per-worker data and wrapped that output in separate fields.  This
      is pretty bogus, especially for the structured output formats: even
      if it's not technically illegal, most programs would have a hard time
      dealing with such data.
      
      To improve matters, add infrastructure that allows redirecting
      per-worker values into a side data structure, and then collect that
      data into a single "Workers" field after we've finished running all
      the relevant code for a given plan node.
      
      There are a few visible side-effects:
      
      * In text format, instead of something like
      
        Sort Method: external merge  Disk: 4920kB
        Worker 0:  Sort Method: external merge  Disk: 5880kB
        Worker 1:  Sort Method: external merge  Disk: 5920kB
        Buffers: shared hit=682 read=10188, temp read=1415 written=2101
        Worker 0:  actual time=130.058..130.324 rows=1324 loops=1
          Buffers: shared hit=337 read=3489, temp read=505 written=739
        Worker 1:  actual time=130.273..130.512 rows=1297 loops=1
          Buffers: shared hit=345 read=3507, temp read=505 written=744
      
      you get
      
        Sort Method: external merge  Disk: 4920kB
        Buffers: shared hit=682 read=10188, temp read=1415 written=2101
        Worker 0:  actual time=130.058..130.324 rows=1324 loops=1
          Sort Method: external merge  Disk: 5880kB
          Buffers: shared hit=337 read=3489, temp read=505 written=739
        Worker 1:  actual time=130.273..130.512 rows=1297 loops=1
          Sort Method: external merge  Disk: 5920kB
          Buffers: shared hit=345 read=3507, temp read=505 written=744
      
      * When JIT is enabled, any relevant per-worker JIT stats are attached
      to the child node of the Gather or Gather Merge node, which is where
      the other per-worker output has always been.  Previously, that info
      was attached directly to a Gather node, or missed entirely for Gather
      Merge.
      
      * A query's summary JIT data no longer includes a bogus
      "Worker Number: -1" field.
      
      A notable code-level change is that indenting for lines of text-format
      output should now be handled by calling "ExplainIndentText(es)",
      instead of hard-wiring how much space to emit.  This seems a good deal
      cleaner anyway.
      
      This patch also adds a new "explain.sql" regression test script that's
      dedicated to testing EXPLAIN.  There is more that can be done in that
      line, certainly, but for now it just adds some coverage of the XML and
      YAML output formats, which had been completely untested.
      
      Although this is surely a bug fix, it's not clear that people would
      be happy with rearranging EXPLAIN output in a minor release, so apply
      to HEAD only.
      
      Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's;
      reviewed by Georgios Kokolatos
      
      Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
      10013684
    • Dean Rasheed's avatar
      Add functions gcd() and lcm() for integer and numeric types. · 13661ddd
      Dean Rasheed authored
      These compute the greatest common divisor and least common multiple of
      a pair of numbers using the Euclidean algorithm.
      
      Vik Fearing, reviewed by Fabien Coelho.
      
      Discussion: https://postgr.es/m/adbd3e0b-e3f1-5bbc-21db-03caf1cef0f7@2ndquadrant.com
      13661ddd
  3. 24 Jan, 2020 7 commits
  4. 23 Jan, 2020 6 commits
    • Tom Lane's avatar
      Add configure probe for rl_completion_suppress_quote. · c3270444
      Tom Lane authored
      I had supposed that all versions of Readline that have filename
      quoting hooks also have the rl_completion_suppress_quote variable.
      But it seems OpenBSD managed to find a version someplace that does
      not, so we'll have to expend a separate configure probe for that.
      
      (Light testing suggests that this version also lacks the bugs that
      make it necessary to frob that variable.  Hooray!)
      
      Per buildfarm.
      c3270444
    • Tom Lane's avatar
      Fix an oversight in commit 4c70098f. · 9a3a75cb
      Tom Lane authored
      I had supposed that the from_char_seq_search() call sites were
      all passing the constant arrays you'd expect them to pass ...
      but on looking closer, the one for DY format was passing the
      days[] array not days_short[].  This accidentally worked because
      the day abbreviations in English are all the same as the first
      three letters of the full day names.  However, once we took out
      the "maximum comparison length" logic, it stopped working.
      
      As penance for that oversight, add regression test cases covering
      this, as well as every other switch case in DCH_from_char() that
      was not reached according to the code coverage report.
      
      Also, fold the DCH_RM and DCH_rm cases into one --- now that
      seq_search is case independent, there's no need to pass different
      comparison arrays for those cases.
      
      Back-patch, as the previous commit was.
      9a3a75cb
    • Tom Lane's avatar
      Clean up formatting.c's logic for matching constant strings. · 4c70098f
      Tom Lane authored
      seq_search(), which is used to match input substrings to constants
      such as month and day names, had a lot of bizarre and unnecessary
      behaviors.  It was mostly possible to avert our eyes from that before,
      but we don't want to duplicate those behaviors in the upcoming patch
      to allow recognition of non-English month and day names.  So it's time
      to clean this up.  In particular:
      
      * seq_search scribbled on the input string, which is a pretty dangerous
      thing to do, especially in the badly underdocumented way it was done here.
      Fortunately the input string is a temporary copy, but that was being made
      three subroutine levels away, making it something easy to break
      accidentally.  The behavior is externally visible nonetheless, in the form
      of odd case-folding in error reports about unrecognized month/day names.
      The scribbling is evidently being done to save a few calls to pg_tolower,
      but that's such a cheap function (at least for ASCII data) that it's
      pretty pointless to worry about.  In HEAD I switched it to be
      pg_ascii_tolower to ensure it is cheap in all cases; but there are corner
      cases in Turkish where this'd change behavior, so leave it as pg_tolower
      in the back branches.
      
      * seq_search insisted on knowing the case form (all-upper, all-lower,
      or initcap) of the constant strings, so that it didn't have to case-fold
      them to perform case-insensitive comparisons.  This likewise seems like
      excessive micro-optimization, given that pg_tolower is certainly very
      cheap for ASCII data.  It seems unsafe to assume that we know the case
      form that will come out of pg_locale.c for localized month/day names, so
      it's better just to define the comparison rule as "downcase all strings
      before comparing".  (The choice between downcasing and upcasing is
      arbitrary so far as English is concerned, but it might not be in other
      locales, so follow citext's lead here.)
      
      * seq_search also had a parameter that'd cause it to report a match
      after a maximum number of characters, even if the constant string were
      longer than that.  This was not actually used because no caller passed
      a value small enough to cut off a comparison.  Replicating that behavior
      for localized month/day names seems expensive as well as useless, so
      let's get rid of that too.
      
      * from_char_seq_search used the maximum-length parameter to truncate
      the input string in error reports about not finding a matching name.
      This leads to rather confusing reports in many cases.  Worse, it is
      outright dangerous if the input string isn't all-ASCII, because we
      risk truncating the string in the middle of a multibyte character.
      That'd lead either to delivering an illegible error message to the
      client, or to encoding-conversion failures that obscure the actual
      data problem.  Get rid of that in favor of truncating at whitespace
      if any (a suggestion due to Alvaro Herrera).
      
      In addition to fixing these things, I const-ified the input string
      pointers of DCH_from_char and its subroutines, to make sure there
      aren't any other scribbling-on-input problems.
      
      The risk of generating a badly-encoded error message seems like
      enough of a bug to justify back-patching, so patch all supported
      branches.
      
      Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us
      4c70098f
    • Tom Lane's avatar
      Improve psql's tab completion for filenames. · cd69ec66
      Tom Lane authored
      The Readline library contains a fair amount of knowledge about how to
      tab-complete filenames, but it turns out that that doesn't work too well
      unless we follow its expectation that we use its filename quoting hooks
      to quote and de-quote filenames.  We were trying to do such quote handling
      within complete_from_files(), and that's still what we have to do if we're
      using libedit, which lacks those hooks.  But for Readline, it works a lot
      better if we tell Readline that single-quote is a quoting character and
      then provide hooks that know the details of the quoting rules for SQL
      and psql meta-commands.
      
      Hence, resurrect the quoting hook functions that existed in the original
      version of tab-complete.c (and were disabled by commit f6689a32 because
      they "didn't work so well yet"), and whack on them until they do seem to
      work well.
      
      Notably, this fixes bug #16059 from Steven Winfield, who pointed out
      that the previous coding would strip quote marks from filenames in SQL
      COPY commands, even though they're syntactically necessary there.
      Now, we not only don't do that, but we'll add a quote mark when you
      tab-complete, even if you didn't type one.
      
      Getting this to work across a range of libedit versions (and, to a
      lesser extent, libreadline versions) was depressingly difficult.
      It will be interesting to see whether the new regression test cases
      pass everywhere in the buildfarm.
      
      Some future patch might try to handle quoted SQL identifiers with
      similar explicit quoting/dequoting logic, but that's for another day.
      
      Patch by me, reviewed by Peter Eisentraut.
      
      Discussion: https://postgr.es/m/16059-8836946734c02b84@postgresql.org
      cd69ec66
    • Michael Paquier's avatar
      Doc: Fix and tweak documentation for ANALYZE reporting · 5ba40b62
      Michael Paquier authored
      The docs had some typos and grammar mistakes, and its indentation was
      inconsistent.
      
      Author: Amit Langote, Justin Pryzby
      Discussion: https://postgr.es/m/20200116151930.GM26045@telsasoft.com
      5ba40b62
    • Michael Paquier's avatar
      Clarify some comments in vacuumlazy.c · f942dfb9
      Michael Paquier authored
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20200113004542.GA26045@telsasoft.com
      f942dfb9
  5. 22 Jan, 2020 4 commits
    • Alvaro Herrera's avatar
      Add BRIN test case · 611ce856
      Alvaro Herrera authored
      This test case was sketched in commit message 4c870109 to explain an
      ancient bug; it translates to a coverage increase, so add it to the BRIN
      regression tests.
      611ce856
    • Fujii Masao's avatar
      Add GUC ignore_invalid_pages. · 41c184bc
      Fujii Masao authored
      Detection of WAL records having references to invalid pages
      during recovery causes PostgreSQL to raise a PANIC-level error,
      aborting the recovery. Setting ignore_invalid_pages to on causes
      the system to ignore those WAL records (but still report a warning),
      and continue recovery. This behavior may cause crashes, data loss,
      propagate or hide corruption, or other serious problems.
      However, it may allow you to get past the PANIC-level error,
      to finish the recovery, and to cause the server to start up.
      
      Author: Fujii Masao
      Reviewed-by: Michael Paquier
      Discussion: https://www.postgresql.org/message-id/CAHGQGwHCK6f77yeZD4MHOnN+PaTf6XiJfEB+Ce7SksSHjeAWtg@mail.gmail.com
      41c184bc
    • Amit Kapila's avatar
      Fix the computation of max dead tuples during the vacuum. · 79a3efb8
      Amit Kapila authored
      In commit 40d964ec, we changed the way memory is allocated for dead
      tuples but forgot to update the place where we compute the maximum
      number of dead tuples.  This could lead to invalid memory requests.
      
      Reported-by: Andres Freund
      Diagnosed-by: Andres Freund
      Author: Masahiko Sawada
      Reviewed-by: Amit Kapila and Dilip Kumar
      Discussion: https://postgr.es/m/20200121060020.e3cr7s7fj5rw4lok@alap3.anarazel.de
      79a3efb8
    • Michael Paquier's avatar
      Fix concurrent indexing operations with temporary tables · a904abe2
      Michael Paquier authored
      Attempting to use CREATE INDEX, DROP INDEX or REINDEX with CONCURRENTLY
      on a temporary relation with ON COMMIT actions triggered unexpected
      errors because those operations use multiple transactions internally to
      complete their work.  Here is for example one confusing error when using
      ON COMMIT DELETE ROWS:
      ERROR:  index "foo" already contains data
      
      Issues related to temporary relations and concurrent indexing are fixed
      in this commit by enforcing the non-concurrent path to be taken for
      temporary relations even if using CONCURRENTLY, transparently to the
      user.  Using a non-concurrent path does not matter in practice as locks
      cannot be taken on a temporary relation by a session different than the
      one owning the relation, and the non-concurrent operation is more
      effective.
      
      The problem exists with REINDEX since v12 with the introduction of
      CONCURRENTLY, and with CREATE/DROP INDEX since CONCURRENTLY exists for
      those commands.  In all supported versions, this caused only confusing
      error messages to be generated.  Note that with REINDEX, it was also
      possible to issue a REINDEX CONCURRENTLY for a temporary relation owned
      by a different session, leading to a server crash.
      
      The idea to enforce transparently the non-concurrent code path for
      temporary relations comes originally from Andres Freund.
      
      Reported-by: Manuel Rigger
      Author: Michael Paquier, Heikki Linnakangas
      Reviewed-by: Andres Freund, Álvaro Herrera, Heikki Linnakangas
      Discussion: https://postgr.es/m/CA+u7OA6gP7YAeCguyseusYcc=uR8+ypjCcgDDCTzjQ+k6S9ksQ@mail.gmail.com
      Backpatch-through: 9.4
      a904abe2
  6. 21 Jan, 2020 3 commits
    • Tom Lane's avatar
      Clarify behavior of adding and altering a column in same ALTER command. · 9b9c5f27
      Tom Lane authored
      The behavior of something like
      
      ALTER TABLE transactions
        ADD COLUMN status varchar(30) DEFAULT 'old',
        ALTER COLUMN status SET default 'current';
      
      is to fill existing table rows with 'old', not 'current'.  That's
      intentional and desirable for a couple of reasons:
      
      * It makes the behavior the same whether you merge the sub-commands
      into one ALTER command or give them separately;
      
      * If we applied the new default while filling the table, there would
      be no way to get the existing behavior in one SQL command.
      
      The same reasoning applies in cases that add a column and then
      manipulate its GENERATED/IDENTITY status in a second sub-command,
      since the generation expression is really just a kind of default.
      However, that wasn't very obvious (at least not to me; earlier in
      the referenced discussion thread I'd thought it was a bug to be
      fixed).  And it certainly wasn't documented.
      
      Hence, add documentation, code comments, and a test case to clarify
      that this behavior is all intentional.
      
      In passing, adjust ATExecAddColumn's defaults-related relkind check
      so that it matches up exactly with ATRewriteTables, instead of being
      effectively (though not literally) the negated inverse condition.
      The reasoning can be explained a lot more concisely that way, too
      (not to mention that the comment now matches the code, which it
      did not before).
      
      Discussion: https://postgr.es/m/10365.1558909428@sss.pgh.pa.us
      9b9c5f27
    • Andres Freund's avatar
      Fix edge case leading to agg transitions skipping ExecAggTransReparent() calls. · affdde2e
      Andres Freund authored
      The code checking whether an aggregate transition value needs to be
      reparented into the current context has always only compared the
      transition return value with the previous transition value by datum,
      i.e. without regard for NULLness.  This normally works, because when
      the transition function returns NULL (via fcinfo->isnull), it'll
      return a value that won't be the same as its input value.
      
      But there's no hard requirement that that's the case. And it turns
      out, it's possible to hit this case (see discussion or reproducers),
      leading to a non-null transition value not being reparented, followed
      by a crash caused by that.
      
      Instead of adding another comparison of NULLness, instead have
      ExecAggTransReparent() ensure that pergroup->transValue ends up as 0
      when the new transition value is NULL. That avoids having to add an
      additional branch to the much more common cases of the transition
      function returning the old transition value (which is a pointer in
      this case), and when the new value is different, but not NULL.
      
      In branches since 69c3936a, also deduplicate the reparenting code
      between the expression evaluation based transitions, and the path for
      ordered aggregates.
      
      Reported-By: Teodor Sigaev, Nikita Glukhov
      Author: Andres Freund
      Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru
      Backpatch: 9.4-, this issue has existed since at least 7.4
      affdde2e
    • Michael Paquier's avatar
      Add GUC variables for stat tracking and timeout as PGDLLIMPORT · 62c9b522
      Michael Paquier authored
      This helps integration of extensions with Windows.  The following
      parameters are changed:
      - idle_in_transaction_session_timeout (9.6 and newer versions)
      - lock_timeout
      - statement_timeout
      - track_activities
      - track_counts
      - track_functions
      
      Author: Pascal Legrand
      Reviewed-by: Amit Kamila, Julien Rouhaud, Michael Paquier
      Discussion: https://postgr.es/m/1579298868581-0.post@n3.nabble.com
      Backpatch-through: 9.4
      62c9b522
  7. 20 Jan, 2020 5 commits
    • Tom Lane's avatar
      Further tweaking of jsonb_set_lax(). · 31f403e9
      Tom Lane authored
      Some buildfarm members were still warning about this, because in
      9c679a08 I'd missed decorating one of the ereport() code paths
      with a dummy return.
      
      Also, adjust the error messages to be more in line with project
      style guide.
      31f403e9
    • Tom Lane's avatar
      Fix pg_dump's sigTermHandler() to use _exit() not exit(). · cd23a201
      Tom Lane authored
      sigTermHandler() tried to be careful to invoke only operations that
      are safe to do in a signal handler.  But for some reason we forgot
      that exit(3) is not among those, because it calls atexit handlers
      that might do various random things.  (pg_dump itself installs no
      atexit handlers, but e.g. OpenSSL does.)  That led to crashes or
      lockups when attempting to terminate a parallel dump or restore
      via a signal.
      
      Fix by calling _exit() instead.
      
      Per bug #16199 from Raúl Marín.  Back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/16199-cb2f121146a96f9b@postgresql.org
      cd23a201
    • Heikki Linnakangas's avatar
      Fix crash in BRIN inclusion op functions, due to missing datum copy. · 4c870109
      Heikki Linnakangas authored
      The BRIN add_value() and union() functions need to make a longer-lived
      copy of the argument, if they want to store it in the BrinValues struct
      also passed as argument. The functions for the "inclusion operator
      classes" used with box, range and inet types didn't take into account
      that the union helper function might return its argument as is, without
      making a copy. Check for that case, and make a copy if necessary. That
      case arises at least with the range_union() function, when one of the
      arguments is an 'empty' range:
      
      CREATE TABLE brintest (n numrange);
      CREATE INDEX brinidx ON brintest USING brin (n);
      INSERT INTO brintest VALUES ('empty');
      INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric));
      INSERT INTO brintest VALUES ('(-1, 0)');
      
      SELECT brin_desummarize_range('brinidx', 0);
      SELECT brin_summarize_range('brinidx', 0);
      
      Backpatch down to 9.5, where BRIN was introduced.
      
      Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi
      Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera
      4c870109
    • Amit Kapila's avatar
      Allow vacuum command to process indexes in parallel. · 40d964ec
      Amit Kapila authored
      This feature allows the vacuum to leverage multiple CPUs in order to
      process indexes.  This enables us to perform index vacuuming and index
      cleanup with background workers.  This adds a PARALLEL option to VACUUM
      command where the user can specify the number of workers that can be used
      to perform the command which is limited by the number of indexes on a
      table.  Specifying zero as a number of workers will disable parallelism.
      This option can't be used with the FULL option.
      
      Each index is processed by at most one vacuum process.  Therefore parallel
      vacuum can be used when the table has at least two indexes.
      
      The parallel degree is either specified by the user or determined based on
      the number of indexes that the table has, and further limited by
      max_parallel_maintenance_workers.  The index can participate in parallel
      vacuum iff it's size is greater than min_parallel_index_scan_size.
      
      Author: Masahiko Sawada and Amit Kapila
      Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
      Mahendra Singh and Sergei Kornilov
      Tested-by: Mahendra Singh and Prabhat Sahu
      Discussion:
      https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
      https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
      40d964ec
    • Tom Lane's avatar
      Fix out-of-memory handling in ecpglib. · 44f1fc8d
      Tom Lane authored
      ecpg_build_params() would crash on a null pointer dereference if
      realloc() failed, due to updating the persistent "stmt" struct
      too aggressively.  (Even without the crash, this would've leaked
      the old storage that we were trying to realloc.)
      
      Per Coverity.  This seems to have been broken in commit 0cc05079,
      so back-patch into v12.
      44f1fc8d
  8. 19 Jan, 2020 3 commits
  9. 18 Jan, 2020 2 commits
    • Tom Lane's avatar
      Doc: rearrange the documentation of binary-string functions. · 34a0a81b
      Tom Lane authored
      Rather than intermixing the discussion of text-string and binary-string
      functions, make a clean break, moving all discussion of binary-string
      operations into section 9.5.  This involves some duplication of
      function descriptions between 9.4 and 9.5, but it seems cleaner on the
      whole since the individual descriptions are clearer (and on the other
      side of the coin, it gets rid of some duplicated descriptions, too).
      
      Move the convert*/encode/decode functions to a separate table, because
      they don't quite seem to fit under the heading of "binary string
      functions".
      
      Also provide full documentation of the textual formats supported by
      encode() and decode() (which was the original goal of this patch
      series, many moons ago).
      
      Also move the table of built-in encoding conversions out of section 9.4,
      where it no longer had any relevance whatsoever, and put it into section
      23.3 about character sets.  I chose to put both that and table 23.2
      (multibyte-translation-table) into a new <sect2> so as not to break up
      the flow of discussion in 23.3.3.
      
      Also do a bunch of minor copy-editing on the function descriptions
      in 9.4 and 9.5.
      
      Karl Pinc, reviewed by Fabien Coelho, further hacking by me
      
      Discussion: https://postgr.es/m/20190304163347.7bca4897@slate.meme.com
      34a0a81b
    • Michael Paquier's avatar
      Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_version · 41aadeeb
      Michael Paquier authored
      Mixing incorrect bounds set in the SSL context leads to confusing error
      messages generated by OpenSSL which are hard to act on.  New checks are
      added within the GUC machinery to improve the user experience as they
      apply to any SSL implementation, not only OpenSSL, and doing the checks
      beforehand avoids the creation of a SSL during a reload (or startup)
      which we know will never be used anyway.
      
      Backpatch down to 12, as those parameters have been introduced by
      e73e67c7.
      
      Author: Michael Paquier
      Reviewed-by: Daniel Gustafsson
      Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz
      Backpatch-through: 12
      41aadeeb
  10. 17 Jan, 2020 7 commits
    • Alexander Korotkov's avatar
      Avoid full scan of GIN indexes when possible · 4b754d6c
      Alexander Korotkov authored
      The strategy of GIN index scan is driven by opclass-specific extract_query
      method.  This method that needed search mode is GIN_SEARCH_MODE_ALL.  This
      mode means that matching tuple may contain none of extracted entries.  Simple
      example is '!term' tsquery, which doesn't need any term to exist in matching
      tsvector.
      
      In order to handle such scan key GIN calculates virtual entry, which contains
      all TIDs of all entries of attribute.  In fact this is full scan of index
      attribute.  And typically this is very slow, but allows to handle some queries
      correctly in GIN.  However, current algorithm calculate such virtual entry for
      each GIN_SEARCH_MODE_ALL scan key even if they are multiple for the same
      attribute.  This is clearly not optimal.
      
      This commit improves the situation by introduction of "exclude only" scan keys.
      Such scan keys are not capable to return set of matching TIDs.  Instead, they
      are capable only to filter TIDs produced by normal scan keys.  Therefore,
      each attribute should contain at least one normal scan key, while rest of them
      may be "exclude only" if search mode is GIN_SEARCH_MODE_ALL.
      
      The same optimization might be applied to the whole scan, not per-attribute.
      But that leads to NULL values elimination problem.  There is trade-off between
      multiple possible ways to do this.  We probably want to do this later using
      some cost-based decision algorithm.
      
      Discussion: https://postgr.es/m/CAOBaU_YGP5-BEt5Cc0%3DzMve92vocPzD%2BXiZgiZs1kjY0cj%3DXBg%40mail.gmail.com
      Author: Nikita Glukhov, Alexander Korotkov, Tom Lane, Julien Rouhaud
      Reviewed-by: Julien Rouhaud, Tomas Vondra, Tom Lane
      4b754d6c
    • Tom Lane's avatar
      Repair more failures with SubPlans in multi-row VALUES lists. · 41c6f9db
      Tom Lane authored
      Commit 9b63c13f turns out to have been fundamentally misguided:
      the parent node's subPlan list is by no means the only way in which
      a child SubPlan node can be hooked into the outer execution state.
      As shown in bug #16213 from Matt Jibson, we can also get short-lived
      tuple table slots added to the outer es_tupleTable list.  At this point
      I have little faith that there aren't other possible connections as
      well; the long time it took to notice this problem shows that this
      isn't a heavily-exercised situation.
      
      Therefore, revert that fix, returning to the coding that passed a
      NULL parent plan pointer down to the transiently-built subexpressions.
      That gives us a pretty good guarantee that they won't hook into the
      outer executor state in any way.  But then we need some other solution
      to make SubPlans work.  Adopt the solution speculated about in the
      previous commit's log message: do expression initialization at plan
      startup for just those VALUES rows containing SubPlans, abandoning the
      goal of reclaiming memory intra-query for those rows.  In practice it
      seems unlikely that queries containing a vast number of VALUES rows
      would be using SubPlans in them, so this should not give up much.
      
      (BTW, this test case also refutes my claim in connection with the prior
      commit that the issue only arises with use of LATERAL.  That was just
      wrong: some variants of SubLink always produce SubPlans.)
      
      As with previous patch, back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org
      41c6f9db
    • Alvaro Herrera's avatar
      Set ReorderBufferTXN->final_lsn more eagerly · 15cac3a5
      Alvaro Herrera authored
      ... specifically, set it incrementally as each individual change is
      spilled down to disk.  This way, it is set correctly when the
      transaction disappears without trace, ie. without leaving an XACT_ABORT
      wal record.  (This happens when the server crashes midway through a
      transaction.)
      
      Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from
      working, since it needs the final_lsn in order to know the endpoint of
      its iteration through spilled files.
      
      Commit df9f682c already tried to fix the problem, but it didn't set
      the final_lsn in all cases.  Revert that, since it's no longer needed.
      
      Author: Vignesh C
      Reviewed-by: Amit Kapila, Dilip Kumar
      Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com
      15cac3a5
    • Tomas Vondra's avatar
      Allocate freechunks bitmap as part of SlabContext · 543852fd
      Tomas Vondra authored
      The bitmap used by SlabCheck to cross-check free chunks in a block used
      to be allocated for each SlabCheck call, and was never freed. The memory
      leak could be fixed by simply adding a pfree call, but it's actually a
      bad idea to do any allocations in SlabCheck at all as it assumes the
      state of the memory management as a whole is sane.
      
      So instead we allocate the bitmap as part of SlabContext, which means
      we don't need to do any allocations in SlabCheck and the bitmap goes
      away together with the SlabContext.
      
      Backpatch to 10, where the Slab context was introduced.
      
      Author: Tomas Vondra
      Reported-by: Andres Freund
      Reviewed-by: Tom Lane
      Backpatch-through: 10
      Discussion: https://www.postgresql.org/message-id/20200116044119.g45f7pmgz4jmodxj%40alap3.anarazel.de
      543852fd
    • Andrew Dunstan's avatar
    • Andrew Dunstan's avatar
      Add a non-strict version of jsonb_set · a83586b5
      Andrew Dunstan authored
      jsonb_set_lax() is the same as jsonb_set, except that it takes and extra
      argument that specifies what to do if the value argument is NULL. The
      default is 'use_json_null'. Other possibilities are 'raise_exception',
      'return_target' and 'delete_key', all these behaviours having been
      suggested as reasonable by various users.
      
      Discussion: https://postgr.es/m/375873e2-c957-3a8d-64f9-26c43c2b16e7@2ndQuadrant.com
      
      Reviewed by: Pavel Stehule
      a83586b5
    • Michael Paquier's avatar
      Move OpenSSL routines for min/max protocol setting to src/common/ · f7cd5896
      Michael Paquier authored
      Two routines have been added in OpenSSL 1.1.0 to set the protocol bounds
      allowed within a given SSL context:
      - SSL_CTX_set_min_proto_version
      - SSL_CTX_set_max_proto_version
      
      As Postgres supports OpenSSL down to 1.0.1 (as of HEAD), equivalent
      replacements exist in the tree, which are only available for the
      backend.  A follow-up patch is planned to add control of the SSL
      protocol bounds for libpq, so move those routines to src/common/ so as
      libpq can use them.
      
      Author: Daniel Gustafsson
      Discussion: https://postgr.es/m/4F246AE3-A7AE-471E-BD3D-C799D3748E03@yesql.se
      f7cd5896