1. 03 Aug, 2020 7 commits
  2. 02 Aug, 2020 2 commits
    • Fix minor issues in psql's new \dAc and related commands. · 533020d0
      Tom Lane authored
      The type-name pattern in \dAc and \dAf was matched only to the actual
      pg_type.typname string, which is fairly user-unfriendly in cases where
      that is not what's shown to the user by format_type (compare "_int4"
      and "integer[]").  Make this code match what \dT does, i.e. match the
      pattern against either typname or format_type() output.  Also fix its
      broken handling of schema-name restrictions.  (IOW, make these
      processSQLNamePattern calls match \dT's.)  While here, adjust
      whitespace to make the query a little prettier in -E output, too.
      
      Also improve some inaccuracies and shaky grammar in the related
      documentation.
      
      Noted while working on a patch for intarray's opclasses; I wondered
      why I couldn't get a match to "integer*" for the input type name.
    • Use int64 instead of long in incremental sort code · 6ee3b5fb
      David Rowley authored
      64-bit Windows has 4-byte long values, which are not suitable for
      tracking disk space usage in the incremental sort code.  Let's just
      make all these fields int64s.
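The portability issue can be sketched in a few lines of C (illustrative only, not the incremental sort code); here int32_t stands in for the 4-byte long of LLP64 Windows:

```c
/* Sketch of the bug class fixed here (not PostgreSQL source): on LLP64
 * platforms such as 64-bit Windows, "long" is 4 bytes, so a byte counter
 * held in one wraps past 2GB.  int32_t emulates that long; int64_t is
 * what the incremental sort fields were switched to. */
#include <stdint.h>

/* Add to a counter the way a 4-byte Windows "long" would. */
static int32_t add_llp64_long(int32_t total, int32_t bytes)
{
    /* unsigned arithmetic, then convert back: wraps silently past 2^31-1 */
    return (int32_t) ((uint32_t) total + (uint32_t) bytes);
}

/* Add to an int64 counter: 8 bytes on every supported platform. */
static int64_t add_int64(int64_t total, int64_t bytes)
{
    return total + bytes;
}
```

Accumulating three 1GiB increments through the 4-byte path wraps to -1073741824, while the int64 path correctly reports 3221225472.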
      
      Author: James Coleman
      Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
      Backpatch-through: 13, where the incremental sort code was added
  3. 01 Aug, 2020 5 commits
  4. 31 Jul, 2020 8 commits
    • Restore lost amcheck TOAST test coverage. · c79aed4f
      Peter Geoghegan authored
      Commit eba77534 fixed an amcheck false positive bug involving
      inconsistencies in TOAST input state between table and index.  A test
      case was added that verified that such an inconsistency didn't result in
      a spurious corruption related error.
      
      Test coverage from the test was accidentally lost by commit 501e41dd,
      which propagated ALTER TABLE ...  SET STORAGE attstorage state to
      indexes.  This broke the test because the test specifically relied on
      attstorage not being propagated.  This artificially forced there to be
      index tuples whose datums were equivalent to the datums in the heap
      without the datums actually being bitwise equal.
      
      Fix this by updating pg_attribute directly instead.  Commit 501e41dd
      made similar changes to a test_decoding TOAST-related test case which
      made the same assumption, but overlooked the amcheck test case.
      
      Backpatch: 11-, just like commit eba77534 (and commit 501e41dd).
    • Fix oversight in ALTER TYPE: typmodin/typmodout must propagate to arrays. · 3d2376d5
      Tom Lane authored
      If a base type supports typmods, its array type does too, with the
      same interpretation.  Hence changes in pg_type.typmodin/typmodout
      must be propagated to the array type.
      
      While here, improve AlterTypeRecurse to not recurse to domains if
      there is nothing we'd need to change.
      
      Oversight in fe30e7eb.  Back-patch to v13 where that came in.
    • Fix recently-introduced performance problem in ts_headline(). · 78e73e87
      Tom Lane authored
      The new hlCover() algorithm that I introduced in commit c9b0c678
      turns out to potentially take O(N^2) or worse time on long documents,
      if there are many occurrences of individual query words but few or no
      substrings that actually satisfy the query.  (One way to hit this
      behavior is with a "common_word & rare_word" type of query.)  This
      seems unavoidable given the original goal of checking every substring
      of the document, so we have to back off that idea.  Fortunately, it
      seems unlikely that anyone would really want headlines spanning all of
      a long document, so we can avoid the worse-than-linear behavior by
      imposing a maximum length of substring that we'll consider.
      
      For now, just hard-wire that maximum length as a multiple of max_words
      times max_fragments.  Perhaps at some point somebody will argue for
      exposing it as a ts_headline parameter, but I'm hesitant to make such
      a feature addition in a back-patched bug fix.
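The complexity argument can be sketched with a small window-counting model (hypothetical, not the hlCover() code): with no cap, a document of N words has O(N^2) candidate substrings; capping the cover length makes the count O(N * cap).

```c
/* Count the candidate (start, end) windows a cover search would examine
 * in an nwords-long document when covers are limited to max_cover words.
 * Passing max_cover >= nwords models the old uncapped behavior. */
static long windows_examined(int nwords, int max_cover)
{
    long examined = 0;
    for (int start = 0; start < nwords; start++)
    {
        int limit = start + max_cover;
        if (limit > nwords)
            limit = nwords;
        examined += limit - start;  /* windows starting at "start" */
    }
    return examined;
}
```

For a 1000-word document, the uncapped search examines 500500 windows; capping covers at 50 words cuts that to 48775.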
      
      I also noted that the hlFirstIndex() function I'd added in that
      commit was unnecessarily stupid: it really only needs to check whether
      a HeadlineWordEntry's item pointer is null or not.  This wouldn't make
      all that much difference in typical cases with queries having just
      a few terms, but a cycle shaved is a cycle earned.
      
      In addition, add a CHECK_FOR_INTERRUPTS call in TS_execute_recurse.
      This ensures that hlCover's loop is cancellable if it manages to take
      a long time, and it may protect some other TS_execute callers as well.
      
      Back-patch to 9.6 as the previous commit was.  I also chose to add the
      CHECK_FOR_INTERRUPTS call to 9.5.  The old hlCover() algorithm seems
      to avoid the O(N^2) behavior, at least on the test case I tried, but
      nonetheless it's not very quick on a long document.
      
      Per report from Stephen Frost.
      
      Discussion: https://postgr.es/m/20200724160535.GW12375@tamriel.snowman.net
    • Fix compiler warning from Clang. · 7be04496
      Thomas Munro authored
      Per build farm.
      
      Discussion: https://postgr.es/m/20200731062626.GD3317%40paquier.xyz
    • Preallocate some DSM space at startup. · 84b1c63a
      Thomas Munro authored
      Create an optional region in the main shared memory segment that can be
      used to acquire and release "fast" DSM segments, and can benefit from
      huge pages allocated at cluster startup time, if configured.  Fall back
      to the existing mechanisms when that space is full.  The size is
      controlled by a new GUC min_dynamic_shared_memory, defaulting to 0.
      
      Main region DSM segments initially contain whatever garbage the memory
      held last time they were used, rather than zeroes.  That change revealed
      that DSA areas failed to initialize themselves correctly in memory that
      wasn't zeroed first, so fix that problem.
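The pattern described above can be sketched as follows (illustrative; names like pool_alloc are hypothetical, not PostgreSQL's): serve requests from a region reserved at startup, and fall back to the regular allocator when the region is full.

```c
/* Preallocated pool with fallback, as a toy.  Pool memory may contain
 * garbage from a previous use, so it is cleared explicitly here -- the
 * analogue of the DSA initialization bug fixed by this commit. */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_SIZE 4096          /* cf. min_dynamic_shared_memory */

static char pool[POOL_SIZE];
static size_t pool_used = 0;

static void *pool_alloc(size_t size, bool *from_pool)
{
    if (pool_used + size <= POOL_SIZE)
    {
        void *p = pool + pool_used;
        pool_used += size;
        *from_pool = true;
        memset(p, 0, size);     /* never assume fresh zeroes here */
        return p;
    }
    *from_pool = false;         /* region full: fall back */
    return calloc(1, size);
}

/* Convenience wrapper reporting which path a request took. */
static bool alloc_came_from_pool(size_t size)
{
    bool from_pool;
    void *p = pool_alloc(size, &from_pool);
    if (!from_pool)
        free(p);
    return from_pool;
}
```

A first 1000-byte request is served from the pool; a subsequent 4096-byte request no longer fits and takes the fallback path.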
      
      Discussion: https://postgr.es/m/CA%2BhUKGLAE2QBv-WgGp%2BD9P_J-%3Dyne3zof9nfMaqq1h3EGHFXYQ%40mail.gmail.com
    • Fix comment in instrument.h · 7b1110d2
      Michael Paquier authored
      local_blks_dirtied tracks the number of local blocks dirtied, not shared
      ones.
      
      Author: Kirk Jamison
      Discussion: https://postgr.es/m/OSBPR01MB2341760686DC056DE89D2AB9EF710@OSBPR01MB2341.jpnprd01.prod.outlook.com
    • Cache smgrnblocks() results in recovery. · c5315f4f
      Thomas Munro authored
      Avoid repeatedly calling lseek(SEEK_END) during recovery by caching
      the size of each fork.  For now, we can't use the same technique in
      other processes, because we lack a shared invalidation mechanism.
      
      Do this by generalizing the pre-existing caching used by FSM and VM
      to support all forks.
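The caching idea can be sketched like this (illustrative, not the smgr code): remember the last-known length of each fork and answer repeat lookups from the cache instead of probing the kernel every time. This is safe in recovery because a single process applies all relation-size changes, so no shared invalidation is needed.

```c
/* Per-fork size cache, as a toy. */
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)
#define MAX_FORKNUM 3

static long lseek_probes = 0;   /* instrumentation for the sketch */

/* Stand-in for the real lseek(SEEK_END)-based size probe. */
static BlockNumber probe_fork_size(int forknum)
{
    (void) forknum;
    lseek_probes++;
    return 128;                 /* pretend every fork is 128 blocks */
}

typedef struct ForkSizeCache
{
    BlockNumber cached_nblocks[MAX_FORKNUM + 1];
} ForkSizeCache;

static BlockNumber cached_nblocks(ForkSizeCache *cache, int forknum)
{
    if (cache->cached_nblocks[forknum] == InvalidBlockNumber)
        cache->cached_nblocks[forknum] = probe_fork_size(forknum);
    return cache->cached_nblocks[forknum];
}

/* How many kernel probes do n lookups of the same fork cost? */
static long probes_for_lookups(int n)
{
    ForkSizeCache cache;
    for (int i = 0; i <= MAX_FORKNUM; i++)
        cache.cached_nblocks[i] = InvalidBlockNumber;
    long before = lseek_probes;
    for (int i = 0; i < n; i++)
        (void) cached_nblocks(&cache, 0);
    return lseek_probes - before;
}
```

A hundred lookups of the same fork cost a single probe instead of a hundred.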
      
      Discussion: https://postgr.es/m/CAEepm%3D3SSw-Ty1DFcK%3D1rU-K6GSzYzfdD4d%2BZwapdN7dTa6%3DnQ%40mail.gmail.com
    • Use multi-inserts for pg_attribute and pg_shdepend · e3931d01
      Michael Paquier authored
      For pg_attribute, this allows a full set of attributes for a relation
      to be inserted at once (roughly a 15% WAL reduction in extreme cases).
      For pg_shdepend, this reduces the work done when creating new shared
      dependencies from a database template.  For both, the number of slots
      used for the insertion is capped at 64kB of data inserted, depending
      on the number of items to insert and the length of the rows involved.
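The slot-count cap just described amounts to a simple formula (a sketch with illustrative names, not the catalog code): use as many slots as there are rows to insert, but never so many that one batch exceeds 64kB of data.

```c
/* Cap the number of tuple slots so one multi-insert batch stays
 * under 64kB of inserted data. */
#include <stddef.h>

#define MAX_BATCH_BYTES (64 * 1024)

static size_t batch_slots(size_t nitems, size_t row_len)
{
    size_t cap = MAX_BATCH_BYTES / row_len;  /* slots fitting in 64kB */
    if (cap == 0)
        cap = 1;                             /* oversized row: still insert */
    return nitems < cap ? nitems : cap;
}
```

With 100-byte rows, ten rows get ten slots, but ten thousand rows are capped at 655 slots per batch.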
      
      More can be done for other catalogs, like pg_depend.  This part requires
      a different approach as the number of slots to use depends also on the
      number of entries discarded as pinned dependencies.  This is also
      related to the rework of dependency handling for ALTER TABLE and CREATE
      TABLE, mainly.
      
      Author: Daniel Gustafsson
      Reviewed-by: Andres Freund, Michael Paquier
      Discussion: https://postgr.es/m/20190213182737.mxn6hkdxwrzgxk35@alap3.anarazel.de
  5. 30 Jul, 2020 7 commits
  6. 29 Jul, 2020 8 commits
    • Add hash_mem_multiplier GUC. · d6c08e29
      Peter Geoghegan authored
      Add a GUC that acts as a multiplier on work_mem.  It gets applied when
      sizing executor node hash tables that were previously size constrained
      using work_mem alone.
      
      The new GUC can be used to preferentially give hash-based nodes more
      memory than the generic work_mem limit.  It is intended to enable admin
      tuning of the executor's memory usage.  Overall system throughput and
      system responsiveness can be improved by giving hash-based executor
      nodes more memory (especially over sort-based alternatives, which are
      often much less sensitive to being memory constrained).
      
      The default value for hash_mem_multiplier is 1.0, which is also the
      minimum valid value.  This means that hash-based nodes continue to apply
      work_mem in the traditional way by default.
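The arithmetic is straightforward (an illustrative sketch, not the planner's code): the effective memory limit for hash-based nodes is work_mem, in kB, scaled by hash_mem_multiplier, and at the default of 1.0 this degenerates to plain work_mem.

```c
/* Effective hash memory limit under hash_mem_multiplier. */
static int hash_mem_limit_kb(int work_mem_kb, double hash_mem_multiplier)
{
    return (int) (work_mem_kb * hash_mem_multiplier);
}
```

With work_mem at 4MB, a multiplier of 2.0 gives hash-based nodes an 8MB budget while sorts and other consumers still see 4MB.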
      
      hash_mem_multiplier is generally useful.  However, it is being added now
      due to concerns about hash aggregate performance stability for users
      that upgrade to Postgres 13 (which added disk-based hash aggregation in
      commit 1f39bce0).  While the old hash aggregate behavior risked
      out-of-memory errors, it is nevertheless likely that many users actually
      benefited.  Hash agg's previous indifference to work_mem during query
      execution was not just faster; it also accidentally made aggregation
      resilient to grouping estimate problems (at least in cases where this
      didn't create destabilizing memory pressure).
      
      hash_mem_multiplier can provide a certain kind of continuity with the
      behavior of Postgres 12 hash aggregates in cases where the planner
      incorrectly estimates that all groups (plus related allocations) will
      fit in work_mem/hash_mem.  This seems necessary because hash-based
      aggregation is usually much slower when only a small fraction of all
      groups can fit.  Even when it isn't possible to totally avoid hash
      aggregates that spill, giving hash aggregation more memory will reliably
      improve performance (the same cannot be said for external sort
      operations, which appear to be almost unaffected by memory availability
      provided it's at least possible to get a single merge pass).
      
      The PostgreSQL 13 release notes should advise users that increasing
      hash_mem_multiplier can help with performance regressions associated
      with hash aggregation.  That can be taken care of by a later commit.
      
      Author: Peter Geoghegan
      Reviewed-By: Álvaro Herrera, Jeff Davis
      Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
      Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
      Backpatch: 13-, where disk-based hash aggregation was introduced.
    • pg_stat_statements: track number of rows processed by some utility commands. · 6023b7ea
      Fujii Masao authored
      This commit makes pg_stat_statements track the total number
      of rows retrieved or affected by CREATE TABLE AS, SELECT INTO,
      CREATE MATERIALIZED VIEW and FETCH commands.
      
      Suggested-by: Pascal Legrand
      Author: Fujii Masao
      Reviewed-by: Asif Rehman
      Discussion: https://postgr.es/m/1584293755198-0.post@n3.nabble.com
    • Remove non-fast promotion. · b5310e4f
      Fujii Masao authored
      When fast promotion was introduced in 9.3, non-fast promotion became
      an undocumented feature that is basically unavailable to ordinary
      users.  However, we decided not to remove it at that time, leaving it
      around for a release or two as a debugging aid or emergency fallback
      in case fast promotion turned out to have issues, with the intention
      of removing it later.  Several releases have now gone by since that
      decision, and there is no longer any reason to keep supporting
      non-fast promotion.  Therefore, this commit removes it.
      
      Author: Fujii Masao
      Reviewed-by: Hamid Akhtar, Kyotaro Horiguchi
      Discussion: https://postgr.es/m/76066434-648f-f567-437b-54853b43398f@oss.nttdata.com
    • HashAgg: use better cardinality estimate for recursive spilling. · 9878b643
      Jeff Davis authored
      Use HyperLogLog to estimate the group cardinality in a spilled
      partition. This estimate is used to choose the number of partitions if
      we recurse.
      
      The previous behavior was to use the number of tuples in a spilled
      partition as the estimate for the number of groups, which led to
      overpartitioning. That could cause the number of batches to be much
      higher than expected (with each batch being very small), which made it
      harder to interpret EXPLAIN ANALYZE results.
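The effect can be sketched with a hypothetical fan-out formula (not the HashAgg code): the partition count should scale with the number of distinct groups, so feeding in the raw tuple count makes the requested fan-out far too large whenever groups repeat heavily, which is the overpartitioning this commit fixes.

```c
/* Choose a power-of-two partition count from an estimated group
 * cardinality, a per-group memory cost, and a memory limit. */
static int choose_partitions(double est_groups, double mem_per_group,
                             double mem_limit)
{
    double needed = (est_groups * mem_per_group) / mem_limit;
    int npartitions = 1;
    while (npartitions < needed && npartitions < 1024)
        npartitions *= 2;       /* power of two, capped for sanity */
    return npartitions;
}
```

With 1000 estimated groups at 64 bytes each under a 64kB limit, one partition suffices; plug in a raw count of 1000000 tuples instead and the same formula asks for 1024 partitions.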
      
      Reviewed-by: Peter Geoghegan
      Discussion: https://postgr.es/m/a856635f9284bc36f7a77d02f47bbb6aaf7b59b3.camel@j-davis.com
      Backpatch-through: 13
    • Fix incorrect print format in json.c · f2130e77
      Michael Paquier authored
      Oid is unsigned, so %u needs to be used and not %d.  The code path
      involved here is not normally reachable, so no backpatch is done.
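The bug class is easy to demonstrate (this is a standalone illustration, not the json.c code): formatting an unsigned value such as an Oid with %d misrenders anything above INT_MAX, while %u prints it faithfully.

```c
/* Compare how %d and %u render an unsigned Oid-like value. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t Oid;           /* as in PostgreSQL: Oid is unsigned */

/* Does %d render this Oid differently from %u? */
static bool d_misprints(Oid oid)
{
    char with_d[32];
    char with_u[32];
    snprintf(with_d, sizeof(with_d), "%d", (int) oid);
    snprintf(with_u, sizeof(with_u), "%u", oid);
    return strcmp(with_d, with_u) != 0;
}
```

An Oid of 3000000000 prints as a negative number under %d but correctly under %u; values at or below INT_MAX happen to agree, which is why such bugs can lurk in rarely-reached paths.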
      
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20200728015523.GA27308@telsasoft.com
    • Move syncscan.c to src/backend/access/common. · cb04ad49
      Thomas Munro authored
      Since the tableam.c code needs to make use of the syncscan.c routines
      itself, and since other block-oriented AMs might also want to use it one
      day, it didn't make sense for it to live under src/backend/access/heap.
      
      Reviewed-by: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/CA%2BhUKGLCnG%3DNEAByg6bk%2BCT9JZD97Y%3DAxKhh27Su9FeGWOKvDg%40mail.gmail.com
    • Rename another "hash_mem" local variable. · c49c74d1
      Peter Geoghegan authored
      Missed by my commit 564ce621.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
    • Correct obsolete UNION hash aggs comment. · b1d79127
      Peter Geoghegan authored
      Oversight in commit 1f39bce0, which added disk-based hash aggregation.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
  7. 28 Jul, 2020 3 commits
    • Doc: Remove obsolete CREATE AGGREGATE note. · f36e8207
      Peter Geoghegan authored
      The planner is in fact willing to use hash aggregation when work_mem is
      not set high enough for everything to fit in memory.  This has been the
      case since commit 1f39bce0, which added disk-based hash aggregation.
      
      There are a few remaining cases in which hash aggregation is avoided as
      a matter of policy when the planner surmises that spilling will be
      necessary.  For example, callers of choose_hashed_setop() still
      conservatively avoid hash aggregation when spilling is anticipated.
      That doesn't seem like a good enough reason to mention hash aggregation
      in this context.
      
      Backpatch: 13-, where disk-based hash aggregation was introduced.
    • Make EXPLAIN ANALYZE of HashAgg more similar to Hash Join · 0e3e1c4e
      David Rowley authored
      There were various unnecessary differences between Hash Agg's EXPLAIN
      ANALYZE output and Hash Join's.  Here we modify the Hash Agg output so
      that it's better aligned to Hash Join's.
      
      The following changes have been made:
      1. Start batches counter at 1 instead of 0.
      2. Always display the "Batches" property, even when we didn't spill to
         disk.
      3. Use the text "Batches" instead of "HashAgg Batches" for text format.
      4. Use the text "Memory Usage" instead of "Peak Memory Usage" for text
         format.
      5. Include "Batches" before "Memory Usage" in both text and non-text
         formats.
      
      In passing, also modify the "Planned Partitions" property so that we
      show it regardless of whether the value is 0 for non-text EXPLAIN
      formats.  This was pointed out by Justin Pryzby and probably should
      have been part of 40efbf87.
      
      Reviewed-by: Justin Pryzby, Jeff Davis
      Discussion: https://postgr.es/m/CAApHDvrshRnA6C0VFnu7Fb9TVvgGo80PUMm5+2DiaS1gEkPvtw@mail.gmail.com
      Backpatch-through: 13, where HashAgg batching was introduced
    • Doc: Improve documentation for pg_jit_available() · d7c8576e
      David Rowley authored
      Per complaint from Scott Ribe. Based on wording suggestion from Tom Lane.
      
      Discussion: https://postgr.es/m/1956E806-1468-4417-9A9D-235AE1D5FE1A@elevated-dev.com
      Backpatch-through: 11, where pg_jit_available() was added