1. 18 Mar, 2021 5 commits
    • Tomas Vondra's avatar
      Implement GROUP BY DISTINCT · be45be9c
      Tomas Vondra authored
      With grouping sets, it's possible that some of the grouping sets are
      duplicate.  This is especially common with CUBE and ROLLUP clauses. For
      example GROUP BY CUBE (a,b), CUBE (b,c) is equivalent to
      
        GROUP BY GROUPING SETS (
          (a, b, c),
          (a, b, c),
          (a, b, c),
          (a, b),
          (a, b),
          (a, b),
          (a),
          (a),
          (a),
          (c, a),
          (c, a),
          (c, a),
          (c),
          (b, c),
          (b),
          ()
        )
      
      Some of the grouping sets are calculated multiple times, which is mostly
      unnecessary.  This commit implements a new GROUP BY DISTINCT feature, as
      defined in the SQL standard, which eliminates the duplicate sets.
      
      Author: Vik Fearing
      Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra
      Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
      be45be9c
    • Tomas Vondra's avatar
      Remove temporary files after backend crash · cd91de0d
      Tomas Vondra authored
      After a crash of a backend using temporary files, the files used to be
      left behind, on the basis that it might be useful for debugging. But we
      don't have any reports of anyone actually doing that, and it means the
      disk usage may grow over time due to repeated backend failures (possibly
      even hitting ENOSPC). So this behavior is a bit unfortunate, and fixing
      it required either manual cleanup (deleting files, which is error-prone)
      or restart of the instance (i.e. service disruption).
      
      This implements automatic cleanup of temporary files, controled by a new
      GUC remove_temp_files_after_crash. By default the files are removed, but
      it can be disabled to restore the old behavior if needed.
      
      Author: Euler Taveira
      Reviewed-by: Tomas Vondra, Michael Paquier, Anastasia Lubennikova, Thomas Munro
      Discussion: https://postgr.es/m/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
      cd91de0d
    • Magnus Hagander's avatar
      Fix function name in error hint · da18d829
      Magnus Hagander authored
      pg_read_file() is the function that's in core, pg_file_read() is in
      adminpack. But when using pg_file_read() in adminpack it calls the *C*
      level function pg_read_file() in core, which probably threw the original
      author off. But the error hint should be about the SQL function.
      
      Reported-By: Sergei Kornilov
      Backpatch-through: 11
      Discussion: https://postgr.es/m/373021616060475@mail.yandex.ru
      da18d829
    • Amit Kapila's avatar
      Doc: Update description for parallel insert reloption. · ed62d373
      Amit Kapila authored
      Commit c8f78b61 added a new reloption to enable inserts in parallel-mode
      but forgot to update at one of the places about the same in docs. In
      passing, fix a typo in the same commit.
      
      Reported-by: Justin Pryzby
      Author: Justin Pryzby
      Reviewed-by: "Hou, Zhijie", Amit Kapila
      Discussion: https://postgr.es/m/20210318025228.GE11765@telsasoft.com
      ed62d373
    • Amit Kapila's avatar
      Add a new GUC and a reloption to enable inserts in parallel-mode. · c8f78b61
      Amit Kapila authored
      Commit 05c8482f added the implementation of parallel SELECT for
      "INSERT INTO ... SELECT ..." which may incur non-negligible overhead in
      the additional parallel-safety checks that it performs, even when, in the
      end, those checks determine that parallelism can't be used. This is
      normally only ever a problem in the case of when the target table has a
      large number of partitions.
      
      A new GUC option "enable_parallel_insert" is added, to allow insert in
      parallel-mode. The default is on.
      
      In addition to the GUC option, the user may want a mechanism to allow
      inserts in parallel-mode with finer granularity at table level. The new
      table option "parallel_insert_enabled" allows this. The default is true.
      
      Author: "Hou, Zhijie"
      Reviewed-by: Greg Nancarrow, Amit Langote, Takayuki Tsunakawa, Amit Kapila
      Discussion: https://postgr.es/m/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com
      Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
      c8f78b61
  2. 17 Mar, 2021 15 commits
    • Andres Freund's avatar
      Fix memory lifetime issues of replication slot stats. · 5f79580a
      Andres Freund authored
      When accessing replication slot stats, introduced in 98681675,
      pgstat_read_statsfiles() reads the data into newly allocated
      memory. Unfortunately the current memory context at that point is the
      callers, leading to leaks and use-after-free dangers.
      
      The fix is trivial, explicitly use pgStatLocalContext. There's some
      potential for further improvements, but that's outside of the scope of
      this bugfix.
      
      No backpatch necessary, feature is only in HEAD.
      
      Author: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/20210317230447.c7uc4g3vbs4wi32i@alap3.anarazel.de
      5f79580a
    • Tom Lane's avatar
      Doc: remove duplicated step in RLS example. · 70945649
      Tom Lane authored
      Seems to have been a copy-and-paste mistake in 093129c9.
      Per report from max1@inbox.ru.
      
      Discussion: https://postgr.es/m/161591740692.24273.4202054598867879464@wrigleys.postgresql.org
      70945649
    • Tom Lane's avatar
      Code review for server's handling of "tablespace map" files. · 8620a7f6
      Tom Lane authored
      While looking at Robert Foggia's report, I noticed a passel of
      other issues in the same area:
      
      * The scheme for backslash-quoting newlines in pathnames is just
      wrong; it will misbehave if the last ordinary character in a pathname
      is a backslash.  I'm not sure why we're bothering to allow newlines
      in tablespace paths, but if we're going to do it we should do it
      without introducing other problems.  Hence, backslashes themselves
      have to be backslashed too.
      
      * The author hadn't read the sscanf man page very carefully, because
      this code would drop any leading whitespace from the path.  (I doubt
      that a tablespace path with leading whitespace could happen in
      practice; but if we're bothering to allow newlines in the path, it
      sure seems like leading whitespace is little less implausible.)  Using
      sscanf for the task of finding the first space is overkill anyway.
      
      * While I'm not 100% sure what the rationale for escaping both \r and
      \n is, if the idea is to allow Windows newlines in the file then this
      code failed, because it'd throw an error if it saw \r followed by \n.
      
      * There's no cross-check for an incomplete final line in the map file,
      which would be a likely apparent symptom of the improper-escaping
      bug.
      
      On the generation end, aside from the escaping issue we have:
      
      * If needtblspcmapfile is true then do_pg_start_backup will pass back
      escaped strings in tablespaceinfo->path values, which no caller wants
      or is prepared to deal with.  I'm not sure if there's a live bug from
      that, but it looks like there might be (given the dubious assumption
      that anyone actually has newlines in their tablespace paths).
      
      * It's not being very paranoid about the possibility of random stuff
      in the pg_tblspc directory.  IMO we should ignore anything without an
      OID-like name.
      
      The escaping rule change doesn't seem back-patchable: it'll require
      doubling of backslashes in the tablespace_map file, which is basically
      a basebackup format change.  The odds of that causing trouble are
      considerably more than the odds of the existing bug causing trouble.
      The rest of this seems somewhat unlikely to cause problems too,
      so no back-patch.
      8620a7f6
    • Tom Lane's avatar
      Prevent buffer overrun in read_tablespace_map(). · a50e4fd0
      Tom Lane authored
      Robert Foggia of Trustwave reported that read_tablespace_map()
      fails to prevent an overrun of its on-stack input buffer.
      Since the tablespace map file is presumed trustworthy, this does
      not seem like an interesting security vulnerability, but still
      we should fix it just in the name of robustness.
      
      While here, document that pg_basebackup's --tablespace-mapping option
      doesn't work with tar-format output, because it doesn't.  To make it
      work, we'd have to modify the tablespace_map file within the tarball
      sent by the server, which might be possible but I'm not volunteering.
      (Less-painful solutions would require changing the basebackup protocol
      so that the source server could adjust the map.  That's not very
      appetizing either.)
      a50e4fd0
    • Tom Lane's avatar
      Add end-to-end testing of pg_basebackup's tar-format output. · 081876d7
      Tom Lane authored
      The existing test script does run pg_basebackup with the -Ft option,
      but it makes no real attempt to verify the sanity of the results.
      We wouldn't know if the output is incompatible with standard "tar"
      programs, nor if the server fails to start from the restored output.
      Notably, this means that xlog.c's read_tablespace_map() is not being
      meaningfully tested, since that code is used only in the tar-format
      case.  (We do have reasonable coverage of restoring from plain-format
      output, though it's over in src/test/recovery not here.)
      
      Hence, attempt to untar the output and start a server from it,
      rather just hoping it's OK.
      
      This test assumes that the local "tar" has the "-C directory"
      switch.  Although that's not promised by POSIX, my research
      suggests that all non-extinct tar implementations have it.
      Should the buildfarm's opinion differ, we can complicate the
      test a bit to avoid requiring that.
      
      Possibly this should be back-patched, but I'm unsure about
      whether it could work on Windows before d66b23b0.
      081876d7
    • Tom Lane's avatar
      Doc: improve discussion of variable substitution in PL/pgSQL. · c783e656
      Tom Lane authored
      This was a bit disjointed, partly because of a not-well-considered
      decision to document SQL commands that don't return result rows as
      though they had nothing in common with commands that do.  Rearrange
      so that we have one discussion of variable substitution that clearly
      applies to all types of SQL commands, and then handle the question
      of processing command output separately.  Clarify that EXPLAIN,
      CREATE TABLE AS SELECT, and similar commands that incorporate an
      optimizable statement will act like optimizable statements for the
      purposes of variable substitution.  Do a bunch of minor wordsmithing
      in the same area.
      
      David Johnston and Tom Lane, reviewed by Pavel Stehule and David
      Steele
      
      Discussion: https://postgr.es/m/CAKFQuwYvMKucM5fnZvHSo-ah4S=_n9gmKeu6EAo=_fTrohunqQ@mail.gmail.com
      c783e656
    • Thomas Munro's avatar
    • Michael Paquier's avatar
      Fix comment in indexing.c · 9fd2952c
      Michael Paquier authored
      578b2297, that removed support for WITH OIDS, has changed
      CatalogTupleInsert() to not return an Oid, but one comment was still
      mentioning that.
      
      Author: Vik Fearing
      Discussion: https://postgr.es/m/fef01975-ed10-3601-7b9e-80ecef72d00b@postgresfriends.org
      9fd2952c
    • Peter Eisentraut's avatar
      Small error message improvement · e1ae40f3
      Peter Eisentraut authored
      e1ae40f3
    • Thomas Munro's avatar
      Update the names of Parallel Hash Join phases. · 378802e3
      Thomas Munro authored
      Commit 3048898e dropped -ING from some wait event names that correspond
      to barrier phases.  Update the phases' names to match.
      
      While we're here making cosmetic changes, also rename "DONE" to "FREE".
      That pairs better with "ALLOCATE", and describes the activity that
      actually happens in that phase (as we do for the other phases) rather
      than describing a state.  The distinction is clearer after bugfix commit
      3b8981b6 split the phase into two.  As for the growth barriers, rename
      their "ALLOCATE" phase to "REALLOCATE", which is probably a better
      description of what happens then.  Also improve the comments about
      the phases a bit.
      
      Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
      378802e3
    • Thomas Munro's avatar
      Fix race in Parallel Hash Join batch cleanup. · 3b8981b6
      Thomas Munro authored
      With very unlucky timing and parallel_leader_participation off, PHJ
      could attempt to access per-batch state just as it was being freed.
      There was code intended to prevent that by checking for a cleared
      pointer, but it was buggy.
      
      Fix, by introducing an extra barrier phase.  The new phase
      PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to
      find a batch to help with, and PHJ_BUILD_DONE means that it is too late.
      The last to detach will free the array of per-batch state as before, but
      now it will also atomically advance the phase at the same time, so that
      late attachers can avoid the hazard, without the data race.  This
      mirrors the way per-batch hash tables are freed (see phases
      PHJ_BATCH_PROBING and PHJ_BATCH_DONE).
      
      Revealed by a one-off build farm failure, where BarrierAttach() failed a
      sanity check assertion, because the memory had been clobbered by
      dsa_free().
      
      Back-patch to 11, where the code arrived.
      Reported-by: default avatarMichael Paquier <michael@paquier.xyz>
      Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
      3b8981b6
    • Thomas Munro's avatar
      Fix transaction.sql tests in higher isolation levels. · 37929599
      Thomas Munro authored
      It seems like a useful sanity check to be able to run "installcheck"
      against a cluster running with default_transaction_level set to
      serializable or repeatable read.  Only one thing currently fails in
      those configurations, so let's fix that.
      
      No back-patch for now, because it fails in many other places in some of
      the stable branches.  We'd have to go back and fix those too if we
      included this configuration in automated testing.
      Reviewed-by: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://postgr.es/m/CA%2BhUKGJUaHeK%3DHLATxF1JOKDjKJVrBKA-zmbPAebOM0Se2FQRg%40mail.gmail.com
      37929599
    • Amit Kapila's avatar
      Fix race condition in drop subscription's handling of tablesync slots. · 6b67d72b
      Amit Kapila authored
      Commit ce0fdbfe made tablesync slots permanent and allow Drop
      Subscription to drop such slots. However, it is possible that before
      tablesync worker could get the acknowledgment of slot creation, drop
      subscription stops it and that can lead to a dangling slot on the
      publisher. Prevent cancel/die interrupts while creating a slot in the
      tablesync worker.
      
      Reported-by: Thomas Munro as per buildfarm
      Author: Amit Kapila
      Reviewed-by: Vignesh C, Takamichi Osumi
      Discussion: https://postgr.es/m/CA+hUKGJG9dWpw1cOQ2nzWU8PHjm=PTraB+KgE5648K9nTfwvxg@mail.gmail.com
      6b67d72b
    • Amit Kapila's avatar
      Doc: Add a description of substream in pg_subscription. · 7efeb214
      Amit Kapila authored
      Commit 46482432 added a new column substream in pg_subscription but
      forgot to update the docs.
      
      Reported-by: Peter Smith
      Author: Amit Kapila
      Reviewed-by: Peter Smith
      Discussion: https://postgr.es/m/CAHut+PuPGGASnh2Dy37VYODKULVQo-5oE=Shc6gwtRizDt==cA@mail.gmail.com
      7efeb214
    • Thomas Munro's avatar
      Enable parallelism in REFRESH MATERIALIZED VIEW. · 9e7ccd9e
      Thomas Munro authored
      Pass CURSOR_OPT_PARALLEL_OK to pg_plan_query() so that parallel plans
      are considered when running the underlying SELECT query.  This wasn't
      done in commit e9baa5e9, which did this for CREATE MATERIALIZED VIEW,
      because it wasn't yet known to be safe.
      
      Since REFRESH always inserts into a freshly created table before later
      merging or swapping the data into place with separate operations, we can
      enable such plans here too.
      
      Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
      Reviewed-by: default avatarHou, Zhijie <houzj.fnst@cn.fujitsu.com>
      Reviewed-by: default avatarLuc Vlaming <luc@swarm64.com>
      Reviewed-by: default avatarThomas Munro <thomas.munro@gmail.com>
      Discussion: https://postgr.es/m/CALj2ACXg-4hNKJC6nFnepRHYT4t5jJVstYvri%2BtKQHy7ydcr8A%40mail.gmail.com
      9e7ccd9e
  3. 16 Mar, 2021 9 commits
  4. 15 Mar, 2021 6 commits
  5. 13 Mar, 2021 5 commits