1. 30 Jul, 2020 6 commits
  2. 29 Jul, 2020 8 commits
    • Add hash_mem_multiplier GUC. · d6c08e29
      Peter Geoghegan authored
      Add a GUC that acts as a multiplier on work_mem.  It gets applied when
      sizing executor node hash tables that were previously size constrained
      using work_mem alone.
      
      The new GUC can be used to preferentially give hash-based nodes more
      memory than the generic work_mem limit.  It is intended to enable admin
      tuning of the executor's memory usage.  Overall system throughput and
      system responsiveness can be improved by giving hash-based executor
      nodes more memory (especially over sort-based alternatives, which are
      often much less sensitive to being memory constrained).
      
      The default value for hash_mem_multiplier is 1.0, which is also the
      minimum valid value.  This means that hash-based nodes continue to apply
      work_mem in the traditional way by default.
      
      hash_mem_multiplier is generally useful.  However, it is being added now
      due to concerns about hash aggregate performance stability for users
      that upgrade to Postgres 13 (which added disk-based hash aggregation in
      commit 1f39bce0).  While the old hash aggregate behavior risked
      out-of-memory errors, it is nevertheless likely that many users actually
      benefited.  Hash agg's previous indifference to work_mem during query
      execution was not just faster; it also accidentally made aggregation
      resilient to grouping estimate problems (at least in cases where this
      didn't create destabilizing memory pressure).
      
      hash_mem_multiplier can provide a certain kind of continuity with the
      behavior of Postgres 12 hash aggregates in cases where the planner
      incorrectly estimates that all groups (plus related allocations) will
      fit in work_mem/hash_mem.  This seems necessary because hash-based
      aggregation is usually much slower when only a small fraction of all
      groups can fit.  Even when it isn't possible to totally avoid hash
      aggregates that spill, giving hash aggregation more memory will reliably
      improve performance (the same cannot be said for external sort
      operations, which appear to be almost unaffected by memory availability
      provided it's at least possible to get a single merge pass).
      
      The PostgreSQL 13 release notes should advise users that increasing
      hash_mem_multiplier can help with performance regressions associated
      with hash aggregation.  That can be taken care of by a later commit.
      
      Author: Peter Geoghegan
      Reviewed-By: Álvaro Herrera, Jeff Davis
      Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
      Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
      Backpatch: 13-, where disk-based hash aggregation was introduced.
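As a rough illustration of the sizing rule described in this commit, the memory limit for a hash-based node can be thought of as work_mem scaled by hash_mem_multiplier. The helper name `hash_mem_kb` and the kB units are assumptions made for this sketch; it is not the server's actual code.

```python
def hash_mem_kb(work_mem_kb: int, hash_mem_multiplier: float = 1.0) -> int:
    """Return the memory limit (in kB) applied to hash-based nodes.

    Hypothetical helper: models 'work_mem times hash_mem_multiplier',
    with 1.0 as both the default and the minimum valid multiplier.
    """
    if hash_mem_multiplier < 1.0:
        raise ValueError("hash_mem_multiplier must be at least 1.0")
    return int(work_mem_kb * hash_mem_multiplier)


# work_mem = 4MB (4096 kB)
print(hash_mem_kb(4096))       # 4096: default multiplier, same as work_mem
print(hash_mem_kb(4096, 2.0))  # 8192: hash nodes may use twice work_mem
```

With the default of 1.0 the limit equals work_mem, which matches the statement that hash-based nodes keep their traditional behavior unless the multiplier is raised.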
    • pg_stat_statements: track number of rows processed by some utility commands. · 6023b7ea
      Fujii Masao authored
      This commit makes pg_stat_statements track the total number
      of rows retrieved or affected by CREATE TABLE AS, SELECT INTO,
      CREATE MATERIALIZED VIEW and FETCH commands.
      
      Suggested-by: Pascal Legrand
      Author: Fujii Masao
      Reviewed-by: Asif Rehman
      Discussion: https://postgr.es/m/1584293755198-0.post@n3.nabble.com
    • Remove non-fast promotion. · b5310e4f
      Fujii Masao authored
      When fast promotion was added in 9.3, non-fast promotion became an
      undocumented feature that is basically not available to ordinary users.
      However, we decided not to remove non-fast promotion at that time, keeping
      it for a release or two for debugging purposes or as an emergency method
      in case fast promotion had issues, with the intention of removing it
      later. Several versions have been released since that decision, and
      there is no longer any reason to keep supporting non-fast promotion.
      Therefore this commit removes non-fast promotion.
      
      Author: Fujii Masao
      Reviewed-by: Hamid Akhtar, Kyotaro Horiguchi
      Discussion: https://postgr.es/m/76066434-648f-f567-437b-54853b43398f@oss.nttdata.com
    • HashAgg: use better cardinality estimate for recursive spilling. · 9878b643
      Jeff Davis authored
      Use HyperLogLog to estimate the group cardinality in a spilled
      partition. This estimate is used to choose the number of partitions if
      we recurse.
      
      The previous behavior was to use the number of tuples in a spilled
      partition as the estimate for the number of groups, which led to
      overpartitioning. That could cause the number of batches to be much
      higher than expected (with each batch being very small), which made it
      harder to interpret EXPLAIN ANALYZE results.
      
      Reviewed-by: Peter Geoghegan
      Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
      Backpatch-through: 13
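The difference between counting tuples and estimating distinct groups can be sketched with a small HyperLogLog in Python. This illustrates the general technique only and is not PostgreSQL's C implementation; the class name, the SHA-1 based hash and the register count are assumptions chosen for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p: int = 10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: bytes) -> None:
        # Take a 64-bit hash; the top p bits select a register.
        h = int.from_bytes(hashlib.sha1(item).digest()[:8], "big")
        idx = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


# 100,000 tuples that belong to only 1,000 distinct groups.
hll = HyperLogLog()
ntuples = 100_000
for i in range(ntuples):
    hll.add(str(i % 1000).encode())
print(ntuples, round(hll.estimate()))
```

Using the tuple count here would overstate the number of groups by about a factor of 100, which is the kind of overestimate that leads to overpartitioning.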
    • Fix incorrect print format in json.c · f2130e77
      Michael Paquier authored
      Oid is unsigned, so %u needs to be used and not %d.  The code path
      involved here is not normally reachable, so no backpatch is done.
      
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20200728015523.GA27308@telsasoft.com
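The effect of formatting an unsigned 32-bit value with a signed conversion can be illustrated in Python with ctypes. The sample value and the helper name are hypothetical; the sketch only shows the reinterpretation that %d would apply to a large Oid.

```python
import ctypes


def as_signed_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed integer."""
    return ctypes.c_int32(value).value


oid = 3_000_000_000          # hypothetical Oid above 2**31 - 1
print(oid)                   # what %u would show: 3000000000
print(as_signed_int32(oid))  # what %d would show: -1294967296
```

Values below 2**31 print identically either way, which is why this kind of mistake tends to go unnoticed.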
    • Move syncscan.c to src/backend/access/common. · cb04ad49
      Thomas Munro authored
      Since the tableam.c code needs to make use of the syncscan.c routines
      itself, and since other block-oriented AMs might also want to use it one
      day, it didn't make sense for it to live under src/backend/access/heap.
      Reviewed-by: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/CA%2BhUKGLCnG%3DNEAByg6bk%2BCT9JZD97Y%3DAxKhh27Su9FeGWOKvDg%40mail.gmail.com
    • Rename another "hash_mem" local variable. · c49c74d1
      Peter Geoghegan authored
      Missed by my commit 564ce621.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
    • Correct obsolete UNION hash aggs comment. · b1d79127
      Peter Geoghegan authored
      Oversight in commit 1f39bce0, which added disk-based hash aggregation.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
  3. 28 Jul, 2020 6 commits
    • Doc: Remove obsolete CREATE AGGREGATE note. · f36e8207
      Peter Geoghegan authored
      The planner is in fact willing to use hash aggregation when work_mem is
      not set high enough for everything to fit in memory.  This has been the
      case since commit 1f39bce0, which added disk-based hash aggregation.
      
      There are a few remaining cases in which hash aggregation is avoided as
      a matter of policy when the planner surmises that spilling will be
      necessary.  For example, callers of choose_hashed_setop() still
      conservatively avoid hash aggregation when spilling is anticipated.
      That doesn't seem like a good enough reason to mention hash aggregation
      in this context.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
    • Make EXPLAIN ANALYZE of HashAgg more similar to Hash Join · 0e3e1c4e
      David Rowley authored
      There were various unnecessary differences between Hash Agg's EXPLAIN
      ANALYZE output and Hash Join's.  Here we modify the Hash Agg output so
      that it's better aligned to Hash Join's.
      
      The following changes have been made:
      1. Start batches counter at 1 instead of 0.
      2. Always display the "Batches" property, even when we didn't spill to
         disk.
      3. Use the text "Batches" instead of "HashAgg Batches" for text format.
      4. Use the text "Memory Usage" instead of "Peak Memory Usage" for text
         format.
      5. Include "Batches" before "Memory Usage" in both text and non-text
         formats.
      
      In passing, also modify the "Planned Partitions" property so that we show
      it regardless of whether the value is 0 for non-text EXPLAIN formats.
      This was pointed out by Justin Pryzby and probably should have been part
      of 40efbf87.
      
      Reviewed-by: Justin Pryzby, Jeff Davis
      Discussion: https://postgr.es/m/CAApHDvrshRnA6C0VFnu7Fb9TVvgGo80PUMm5+2DiaS1gEkPvtw@mail.gmail.com
      Backpatch-through: 13, where HashAgg batching was introduced
    • Doc: Improve documentation for pg_jit_available() · d7c8576e
      David Rowley authored
      Per complaint from Scott Ribe. Based on wording suggestion from Tom Lane.
      
      Discussion: https://postgr.es/m/1956E806-1468-4417-9A9D-235AE1D5FE1A@elevated-dev.com
      Backpatch-through: 11, where pg_jit_available() was added
    • Extend the logical decoding output plugin API with stream methods. · 45fdc973
      Amit Kapila authored
      This adds seven methods to the output plugin API, adding support for
      streaming changes of large in-progress transactions.
      
      * stream_start
      * stream_stop
      * stream_abort
      * stream_commit
      * stream_change
      * stream_message
      * stream_truncate
      
      Most of this is a simple extension of the existing methods, with
      the semantic difference that the transaction (or subtransaction)
      is incomplete and may be aborted later (which is something the
      regular API does not really need to deal with).
      
      This also extends the 'test_decoding' plugin, implementing these
      new stream methods.
      
      The stream_start/stream_stop methods are used to demarcate a chunk of changes
      streamed for a particular toplevel transaction.
      
      This commit simply adds these new APIs and the upcoming patch to "allow
      the streaming mode in ReorderBuffer" will use these APIs.
      
      Author: Tomas Vondra, Dilip Kumar, Amit Kapila
      Reviewed-by: Amit Kapila
      Tested-by: Neha Sharma and Mahendra Singh Thalor
      Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
    • Fix some issues with step generation in partition pruning. · 13838740
      Etsuro Fujita authored
      In the case of range partitioning, get_steps_using_prefix() assumes that
      the passed-in prefix list contains at least one clause for each of the
      partition keys earlier than one specified in the passed-in
      step_lastkeyno, but the caller (ie, gen_prune_steps_from_opexps())
      didn't take it into account, which led to a server crash or incorrect
      results when the list contained no clauses for such partition keys, as
      reported in bug #16500 and #16501 from Kobayashi Hisanori.  Update the
      caller to call that function only when the list created there contains
      at least one clause for each of the earlier partition keys in the case
      of range partitioning.
      
      While at it, fix some other issues:
      
      * The list to pass to get_steps_using_prefix() is allowed to contain
        multiple clauses for the same partition key, as described in the
        comment for that function, but that function actually assumed that the
        list contained just a single clause for each of the middle partition keys,
        which led to an assertion failure when the list contained multiple
        clauses for such partition keys.  Update that function to match the
        comment.
      * In the case of hash partitioning, partition keys are allowed to be
        NULL, in which case the list to pass to get_steps_using_prefix()
        contains no clauses for NULL partition keys, but that function treats
        that case like the case of range partitioning, which led to an
        assertion failure.  Update the assertion test to take into account
        NULL partition keys in the case of hash partitioning.
      * Fix a typo in a comment in get_steps_using_prefix_recurse().
      * gen_partprune_steps() failed to detect self-contradiction from
        strict-qual clauses and an IS NULL clause for the same partition key
        in some cases, producing incorrect partition-pruning steps, which led
        to incorrect results of partition pruning, but didn't cause any
        user-visible problems fortunately, as the self-contradiction is
        detected later in the query planning.  Update that function to detect
        the self-contradiction.
      
      Per bug #16500 and #16501 from Kobayashi Hisanori.  Patch by me, initial
      diagnosis for the reported issue and review by Dmitry Dolgov.
      Back-patch to v11, where partition pruning was introduced.
      
      Discussion: https://postgr.es/m/16500-d1613f2a78e1e090%40postgresql.org
      Discussion: https://postgr.es/m/16501-5234a9a0394f6754%40postgresql.org
    • Remove hashagg_avoid_disk_plan GUC. · bcbf9446
      Peter Geoghegan authored
      Note: This GUC was originally named enable_hashagg_disk when it appeared
      in commit 1f39bce0, which added disk-based hash aggregation.  It was
      subsequently renamed in commit 92c58fd9.
      
      Author: Peter Geoghegan
      Reviewed-By: Jeff Davis, Álvaro Herrera
      Discussion: https://postgr.es/m/9d9d1e1252a52ea1bad84ea40dbebfd54e672a0f.camel%40j-davis.com
      Backpatch: 13-, where disk-based hash aggregation was introduced.
  4. 27 Jul, 2020 2 commits
    • Fix corner case with 16kB-long decompression in pgcrypto, take 2 · a3ab7a70
      Michael Paquier authored
      A compressed stream may end with an empty packet.  In this case
      decompression finishes before reading the empty packet and the
      remaining stream packet causes a failure in reading the following
      data.  This commit makes sure to consume such extra data, avoiding a
      failure when decompressing the data.  This corner case was reproducible
      easily with a data length of 16kB, and existed since e94dd6ab.  A cheap
      regression test is added to cover this case based on a random,
      incompressible string.
      
      The first attempt at this patch uncovered an older failure
      within the compression logic of pgcrypto, fixed by b9b61057.  This
      involved SLES 15 with z390 where a custom flavor of libz gets used.
      Bonus thanks to Mark Wong for providing access to the specific
      environment.
      
      Reported-by: Frank Gagnepain
      Author: Kyotaro Horiguchi, Michael Paquier
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
      Backpatch-through: 9.5
    • Fix handling of structure for bytea data type in ECPG · e9713579
      Michael Paquier authored
      Some code paths dedicated to bytea used the structure for varchar.  This
      did not lead to any actual bugs, as bytea and varchar have the same
      definition, but it could become a trap if one of these definitions
      changes for a new feature or a bug fix.
      
      Issue introduced by 050710b3.
      
      Author: Shenhao Wang
      Reviewed-by: Vignesh C, Michael Paquier
      Discussion: https://postgr.es/m/07ac7dee1efc44f99d7f53a074420177@G08CNEXMBPEKD06.g08.fujitsu.local
      Backpatch-through: 12
  5. 26 Jul, 2020 3 commits
    • Fix LookupTupleHashEntryHash() pipeline-stall issue. · 200f6100
      Jeff Davis authored
      Refactor hash lookups in nodeAgg.c to improve performance.
      
      Author: Andres Freund and Jeff Davis
      Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
      Backpatch-through: 13
    • Allocate consecutive blocks during parallel seqscans · 56788d21
      David Rowley authored
      Previously we would allocate blocks to parallel workers during a parallel
      sequential scan 1 block at a time.  Since other workers were likely to
      request a block before a worker returns for another block number to work
      on, this could lead to non-sequential I/O patterns in each worker which
      could cause the operating system's readahead to perform poorly or not at
      all.
      
      Here we change things so that we allocate consecutive "chunks" of blocks
      to workers and have them work on those until they're done, at which time
      we allocate another chunk for the worker.  The size of these chunks is
      based on the size of the relation.
      
      Initial patch here was by Thomas Munro which showed some good improvements
      just having a fixed chunk size of 64 blocks with a simple ramp-down near
      the end of the scan. The revisions of the patch to make the chunk size
      based on the relation size and the adjusted ramp-down in powers of two were
      done by me, along with quite extensive benchmarking to determine the
      optimal chunk sizes.
      
      For the most part, benchmarks have shown significant performance
      improvements for large parallel sequential scans on Linux, FreeBSD and
      Windows using SSDs.  It's less clear how this affects performance on
      cloud providers.  Tests done so far are unable to obtain stable enough
      performance to provide meaningful benchmark results.  It is possible that
      this could cause some performance regressions on more obscure filesystems,
      so we may need to later provide users with some ability to get something
      closer to the old behavior.  For now, let's leave that until we see that
      it's really required.
      
      Author: Thomas Munro, David Rowley
      Reviewed-by: Ranier Vilela, Soumyadeep Chakraborty, Robert Haas
      Reviewed-by: Amit Kapila, Kirk Jamison
      Discussion: https://postgr.es/m/CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
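The allocation scheme described in this commit can be sketched as a generator that hands out consecutive chunks and halves the chunk size near the end of the scan. The function name and the three tuning parameters below are illustrative assumptions, not values confirmed by the commit message.

```python
def allocate_chunks(nblocks, target_chunks=2048, max_chunk=8192,
                    rampdown_chunks=64):
    """Yield (start_block, length) chunks that cover blocks [0, nblocks).

    All parameters are hypothetical knobs for this sketch:
    the initial chunk size is a power of two derived from the relation
    size, and it is halved as the remaining work shrinks.
    """
    chunk = 1
    while chunk < max(nblocks // target_chunks, 1):
        chunk *= 2
    chunk = min(chunk, max_chunk)

    next_block = 0
    while next_block < nblocks:
        remaining = nblocks - next_block
        # Ramp down in powers of two so workers finish at similar times.
        while chunk > 1 and remaining < chunk * rampdown_chunks:
            chunk //= 2
        length = min(chunk, remaining)
        yield next_block, length
        next_block += length


chunks = list(allocate_chunks(1_000_000))
print(chunks[0], chunks[-1], len(chunks))
```

Each worker would take the next chunk from such a sequence and read its blocks in order, which keeps per-worker I/O sequential for the length of the chunk.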
    • Tweak behavior of pg_stat_activity.leader_pid · 11a68e4b
      Michael Paquier authored
      The initial implementation of leader_pid in pg_stat_activity added by
      b025f32e took the approach to strictly print what a PGPROC entry
      includes.  In short, if a backend has been involved in parallel query at
      least once, leader_pid would remain set as long as the backend is alive.
      For a parallel group leader, this means that the field would always be
      set after it participated at least once in parallel query, and further
      discussion concluded that this could be confusing, for example when
      using a connection pooler.
      
      This commit changes the data printed so that leader_pid is always
      NULL for a parallel group leader and shows a non-NULL value only for
      parallel workers, and then only while a parallel query is running,
      as workers are shut down once the query has completed.
      
      This does not change the definition of any catalog, so no catalog bump
      is needed.  Per discussion with Justin Pryzby, Álvaro Herrera, Julien
      Rouhaud and me.
      
      Discussion: https://postgr.es/m/20200721035145.GB17300@paquier.xyz
      Backpatch-through: 13
  6. 25 Jul, 2020 5 commits
    • Remove optimization for RAND_poll() failing. · 15e44197
      Noah Misch authored
      The loop to generate seed data will exit on RAND_status(), so we don't
      need to handle the case of RAND_poll() failing separately.  Failures
      here are rare, so this is a code cleanup, essentially.
      
      Daniel Gustafsson, reviewed by David Steele and Michael Paquier.
      
      Discussion: https://postgr.es/m/9B038FA5-23E8-40D0-B932-D515E1D8F66A@yesql.se
    • Use RAND_poll() for seeding randomness after fork(). · ce4939ff
      Noah Misch authored
      OpenSSL deprecated RAND_cleanup(), and OpenSSL 1.1.0 made it into a
      no-op.  Replace it with RAND_poll(), per an OpenSSL community
      recommendation.  While this has no user-visible consequences under
      OpenSSL defaults, it might help under non-default settings.
      
      Daniel Gustafsson, reviewed by David Steele and Michael Paquier.
      
      Discussion: https://postgr.es/m/9B038FA5-23E8-40D0-B932-D515E1D8F66A@yesql.se
    • Improve performance of binary COPY FROM through better buffering. · 0a0727cc
      Tom Lane authored
      At least on Linux and macOS, fread() turns out to have far higher
      per-call overhead than one could wish.  Reading 64KB of data at a time
      and then parceling it out with our own memcpy logic makes binary COPY
      from a file significantly faster --- around 30% in simple testing for
      cases with narrow text columns (on Linux ... even more on macOS).
      
      In binary COPY from frontend, there's no per-call fread(), and this
      patch introduces an extra layer of memcpy'ing, but it still manages
      to eke out a small win.  Apparently, the control-logic overhead in
      CopyGetData() is enough to be worth avoiding for small fetches.
      
      Bharath Rupireddy and Amit Langote, reviewed by Vignesh C,
      cosmetic tweaks by me
      
      Discussion: https://postgr.es/m/CALj2ACU5Bz06HWLwqSzNMN=Gupoj6Rcn_QVC+k070V4em9wu=A@mail.gmail.com
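The buffering idea can be sketched in Python: read a large block from the source once, then serve many small field reads from that block. `ChunkedReader` and `CountingSource` are hypothetical names for this illustration, and Python's own I/O layers differ from C's fread(), so only the structure carries over.

```python
import io


class CountingSource:
    """Wraps a byte stream and counts how often read() is called."""

    def __init__(self, data: bytes):
        self.stream = io.BytesIO(data)
        self.calls = 0

    def read(self, n: int) -> bytes:
        self.calls += 1
        return self.stream.read(n)


class ChunkedReader:
    """Serve small reads out of a buffer refilled 64KB at a time."""

    def __init__(self, source, bufsize: int = 65536):
        self.source = source
        self.bufsize = bufsize
        self.buf = b""
        self.pos = 0

    def read(self, n: int) -> bytes:
        parts = []
        while n > 0:
            if self.pos >= len(self.buf):
                # Buffer exhausted: one large read from the source.
                self.buf = self.source.read(self.bufsize)
                self.pos = 0
                if not self.buf:
                    break  # end of input
            take = self.buf[self.pos:self.pos + n]
            parts.append(take)
            self.pos += len(take)
            n -= len(take)
        return b"".join(parts)


source = CountingSource(bytes(200_000))
reader = ChunkedReader(source)
fields = [reader.read(4) for _ in range(50_000)]
print(len(fields), source.calls)  # 50000 small reads, 4 source reads
```

The per-field cost becomes a slice of an in-memory buffer, while the number of calls into the underlying source depends only on the total data size.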
    • Mark built-in coercion functions as leakproof where possible. · 8a37951e
      Tom Lane authored
      Making these leakproof seems helpful since (for example) if you have a
      function f(int8) that is leakproof, you don't want it to effectively
      become non-leakproof when you apply it to an int4 or int2 column.
      But that's what happens today, since the implicit up-coercion will
      not be leakproof.
      
      Most of the coercion functions that visibly can't throw errors are
      functions that convert numeric datatypes to other, wider ones.
      Notable is that float4_numeric and float8_numeric can be marked
      leakproof; before commit a57d312a they could not have been.
      I also marked the functions that coerce strings to "name" as leakproof;
      that's okay today because they truncate silently, but if we ever
      reconsidered that behavior then they could no longer be leakproof.
      
      I desisted from marking rtrim1() as leakproof; it appears so right now,
      but the code seems a little too complex and perhaps subject to change,
      since it's shared with other SQL functions.
      
      Discussion: https://postgr.es/m/459322.1595607431@sss.pgh.pa.us
    • Fix buffer usage stats for nodes above Gather Merge. · 2a249422
      Amit Kapila authored
      Commit 85c9d347 addressed a similar problem for Gather and Gather
      Merge nodes but forgot to account for nodes above parallel nodes.  This
      still works for nodes above Gather node because we shut down the workers
      for Gather node as soon as there are no more tuples.  We can do a similar
      thing for Gather Merge as well but it seems better to account for stats
      during node shutdown after completing the execution.
      
      Reported-by: Stéphane Lorek, Jehan-Guillaume de Rorthais
      Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
      Reviewed-by: Amit Kapila
      Backpatch-through: 10, where it was introduced
      Discussion: https://postgr.es/m/20200718160206.584532a2@firost
  7. 24 Jul, 2020 3 commits
    • Replace TS_execute's TS_EXEC_CALC_NOT flag with TS_EXEC_SKIP_NOT. · 79d6d1a2
      Tom Lane authored
      It's fairly silly that ignoring NOT subexpressions is TS_execute's
      default behavior.  It's wrong on its face and it encourages errors
      of omission.  Moreover, the only two remaining callers that aren't
      specifying CALC_NOT are in ts_headline calculations, and it's very
      arguable that those are bugs: if you've specified "!foo" in your
      query, why would you want to get a headline that includes "foo"?
      
      Hence, rip that out and change the default behavior to be to calculate
      NOT accurately.  As a concession to the slim chance that there is still
      somebody somewhere who needs the incorrect behavior, provide a new
      SKIP_NOT flag to explicitly request that.
      
      Back-patch into v13, mainly because it seems better to change this
      at the same time as the previous commit's rejiggering of TS_execute
      related APIs.  Any outside callers affected by this change are
      probably also affected by that one.
      
      Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
    • Fix assorted bugs by changing TS_execute's callback API to ternary logic. · 2f2007fb
      Tom Lane authored
      Text search sometimes failed to find valid matches, for instance
      '!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during
      an index search.  The root of the issue is that TS_execute's callback
      functions were not changed to use ternary (yes/no/maybe) reporting
      when we made the search logic itself do so.  It's somewhat annoying
      to break that API, but on the other hand we now see that any code
      using plain boolean logic is almost certainly broken since the
      addition of phrase search.  There seem to be very few outside callers
      of this code anyway, so we'll just break them intentionally to get
      them to adapt.
      
      This allows removal of tsginidx.c's private re-implementation of
      TS_execute, since that's now entirely duplicative.  It's also no
      longer necessary to avoid use of CALC_NOT in tsgistidx.c, since
      the underlying callbacks can now do something reasonable.
      
      Back-patch into v13.  We can't change this in stable branches,
      but it seems not quite too late to fix it in v13.
      
      Tom Lane and Pavel Borisov
      
      Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
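Why a yes/no/maybe callback matters for NOT can be sketched with a tiny three-valued evaluator. The enum, the tuple-based query tree and the `index_check` callback are hypothetical stand-ins for this illustration and do not reproduce the TS_execute API.

```python
from enum import Enum


class Ternary(Enum):
    NO = 0
    YES = 1
    MAYBE = 2


def t_not(a):
    if a is Ternary.MAYBE:
        return Ternary.MAYBE  # negating "unknown" is still "unknown"
    return Ternary.NO if a is Ternary.YES else Ternary.YES


def t_and(a, b):
    if Ternary.NO in (a, b):
        return Ternary.NO
    return Ternary.MAYBE if Ternary.MAYBE in (a, b) else Ternary.YES


def t_or(a, b):
    if Ternary.YES in (a, b):
        return Ternary.YES
    return Ternary.MAYBE if Ternary.MAYBE in (a, b) else Ternary.NO


def evaluate(node, check):
    """Evaluate ("val", term), ("not", x), ("and", x, y) or ("or", x, y)."""
    op = node[0]
    if op == "val":
        return check(node[1])
    if op == "not":
        return t_not(evaluate(node[1], check))
    left, right = evaluate(node[1], check), evaluate(node[2], check)
    return t_and(left, right) if op == "and" else t_or(left, right)


def index_check(term):
    # Hypothetical index callback: the lexeme "crew" is present, but the
    # index cannot tell whether it carries weight A, so it answers MAYBE.
    return Ternary.MAYBE if term == "crew:A" else Ternary.NO


query = ("not", ("val", "crew:A"))
print(evaluate(query, index_check))             # MAYBE: recheck the row
print(not (index_check("crew:A") is not Ternary.NO))  # boolean view: False
```

Collapsing the callback's answer to a plain boolean turns "maybe" into "yes", the NOT then yields "no", and a row such as 'crew:1B' is wrongly skipped instead of being rechecked.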
    • Rename configure.in to configure.ac · 25244b89
      Peter Eisentraut authored
      The new name has been preferred by Autoconf for a long time.  Future
      versions of Autoconf will warn about the old name.
      
      Discussion: https://www.postgresql.org/message-id/flat/e796c185-5ece-8569-248f-dd3799701be1%402ndquadrant.com
  8. 23 Jul, 2020 4 commits
    • Fix ancient violation of zlib's API spec. · b9b61057
      Tom Lane authored
      contrib/pgcrypto mishandled the case where deflate() does not consume
      all of the offered input on the first try.  It reset the next_in pointer
      to the start of the input instead of leaving it alone, causing the wrong
      data to be fed to the next deflate() call.
      
      This has been broken since pgcrypto was committed.  The reason for the
      lack of complaints seems to be that it's fairly hard to get stock zlib
      to not consume all the input, so long as the output buffer is big enough
      (which it normally would be in pgcrypto's usage; AFAICT the input is
      always going to be packetized into packets no larger than ZIP_OUT_BUF).
      However, IBM's zlibNX implementation for AIX evidently will do it
      in some cases.
      
      I did not add a test case for this, because I couldn't find one that
      would fail with stock zlib.  When we put back the test case for
      bug #16476, that will cover the zlibNX situation well enough.
      
      While here, write deflate()'s second argument as Z_NO_FLUSH per its
      API spec, instead of hard-wiring the value zero.
      
      Per buildfarm results and subsequent investigation.
      
      Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
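The pointer-reset mistake can be simulated without zlib. In the sketch below, `consume` is a hypothetical stand-in for a compressor call that may take only part of the offered input, and the `reset_on_retry` flag reproduces the bug of restarting from the beginning of the input while the remaining byte count keeps shrinking.

```python
def feed_all(data: bytes, consume, reset_on_retry: bool = False) -> bytes:
    """Feed data to consume(buf) -> bytes_consumed until all is taken.

    Returns the bytes the consumer actually saw, in order.
    """
    seen = []
    pos = 0
    remaining = len(data)
    while remaining > 0:
        # Buggy variant: offer input from the start again on every retry.
        start = 0 if reset_on_retry else pos
        buf = data[start:start + remaining]
        n = consume(buf)
        seen.append(buf[:n])
        pos += n
        remaining -= n
    return b"".join(seen)


def at_most_four(buf: bytes) -> int:
    # Hypothetical consumer that never takes more than 4 bytes per call.
    return min(4, len(buf))


print(feed_all(b"abcdefghij", at_most_four))                       # b'abcdefghij'
print(feed_all(b"abcdefghij", at_most_four, reset_on_retry=True))  # b'abcdabcdab'
```

A consumer that always takes everything on the first call hides the difference, which is consistent with the commit's observation that stock zlib rarely exposes the problem.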
    • doc: Document that ssl_ciphers does not affect TLS 1.3 · 5733fa0f
      Peter Eisentraut authored
      TLS 1.3 uses a different way of specifying ciphers and a different
      OpenSSL API.  PostgreSQL currently does not support setting those
      ciphers.  For now, just document this.  In the future, support for
      this might be added somehow.
      Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
      Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
    • Fix error message. · 42dee8b8
      Thomas Munro authored
      Remove extra space.  Back-patch to all releases, like commit 7897e3bb.
      
      Author: Lu, Chenyang <lucy.fnst@cn.fujitsu.com>
      Discussion: https://postgr.es/m/795d03c6129844d3803e7eea48f5af0d%40G08CNEXMBPEKD04.g08.fujitsu.local
    • WAL Log invalidations at command end with wal_level=logical. · c55040cc
      Amit Kapila authored
      When wal_level=logical, write invalidations at command end into WAL so
      that decoding can use this information.
      
      This patch is required to allow the streaming of in-progress transactions
      in logical decoding.  The actual work to allow streaming will be committed
      as a separate patch.
      
      We still add the invalidations to the cache and write them to WAL at
      commit time in RecordTransactionCommit(). This uses the existing
      XLOG_INVALIDATIONS xlog record type, from the RM_STANDBY_ID resource
      manager (see LogStandbyInvalidations for details).
      
      So existing code relying on those invalidations (e.g. redo) does not need
      to be changed.
      
      The invalidations written at command end use a new xlog record type
      XLOG_XACT_INVALIDATIONS, from RM_XACT_ID resource manager. See
      LogLogicalInvalidations for details.
      
      These new xlog records are ignored by existing redo procedures, which
      still rely on the invalidations written to commit records.
      
      The invalidations are decoded and accumulated in the top-level transaction, and then
      executed during replay.  This obviates the need to decode the
      invalidations as part of a commit record.
      
      Bump XLOG_PAGE_MAGIC, since this introduces XLOG_XACT_INVALIDATIONS.
      
      Author: Dilip Kumar, Tomas Vondra, Amit Kapila
      Reviewed-by: Amit Kapila
      Tested-by: Neha Sharma and Mahendra Singh Thalor
      Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
  9. 22 Jul, 2020 3 commits
    • Revert "Fix corner case with PGP decompression in pgcrypto" · 38f60f17
      Michael Paquier authored
      This reverts commit 9e108984, after finding out that buildfarm members
      running SLES 15 on z390 complain about the compression and decompression
      logic of the new test: pipistrelles, barbthroat and steamerduck.
      
      Those hosts are visibly using hardware-specific changes to improve zlib
      performance, requiring more investigation.
      
      Thanks to Tom Lane for the discussion.
      
      Discussion: https://postgr.es/m/20200722093749.GA2564@paquier.xyz
      Backpatch-through: 9.5
    • Support infinity and -infinity in the numeric data type. · a57d312a
      Tom Lane authored
      Add infinities that behave the same as they do in the floating-point
      data types.  Aside from any intrinsic usefulness these may have,
      this closes an important gap in our ability to convert floating
      values to numeric and/or replace float-based APIs with numeric.
      
      The new values are represented by bit patterns that were formerly
      not used (although old code probably would take them for NaNs).
      So there shouldn't be any pg_upgrade hazard.
      
      Patch by me, reviewed by Dean Rasheed and Andrew Gierth
      
      Discussion: https://postgr.es/m/606717.1591924582@sss.pgh.pa.us
    • Fix corner case with PGP decompression in pgcrypto · 9e108984
      Michael Paquier authored
      A compressed stream may end with an empty packet, and PGP decompression
      finished before reading this empty packet in the remaining stream.  This
      caused a failure in pgcrypto, handling this case as corrupted data.
      This commit makes sure to consume such extra data, avoiding a failure
      when decompressing the entire stream.  This corner case was reproducible
      with a data length of 16kB, and existed since its introduction in
      e94dd6ab.  A cheap regression test is added to cover this case.
      
      Thanks to Jeff Janes for the extra investigation.
      
      Reported-by: Frank Gagnepain
      Author: Kyotaro Horiguchi, Michael Paquier
      Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
      Backpatch-through: 9.5