1. 06 Apr, 2021 20 commits
    • Fix missing #include in nodeResultCache.h. · 789d81de
      Tom Lane authored
      Per cpluspluscheck.
      789d81de
    • psql: Show all query results by default · 3a513067
      Peter Eisentraut authored
      Previously, psql printed only the last result if a command string
      returned multiple result sets.  Now it prints all of them.  The
      previous behavior can be obtained by setting the psql variable
      SHOW_ALL_RESULTS to off.
      
      Author: Fabien COELHO <coelho@cri.ensmp.fr>
      Reviewed-by: "Iwata, Aya" <iwata.aya@jp.fujitsu.com>
      Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
      Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
      Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
      Reviewed-by: vignesh C <vignesh21@gmail.com>
      Discussion: https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904132231510.8961@lancre
      3a513067
    • Fix handling of clauses incompatible with extended statistics · 518442c7
      Tomas Vondra authored
      Handling of incompatible clauses while applying extended statistics was
      a bit confused - while handling a mix of compatible and incompatible
      clauses it sometimes incorrectly treated the incompatible clauses as
      compatible, resulting in a crash.
      
      Fixed by reworking the code applying the selected statistics object to
      make it easier to understand, and adding a proper compatibility check.
      
      Reported-by: David Rowley
      Discussion: https://postgr.es/m/CAApHDvpYT10-nkSp8xXe-nbO3jmoaRyRFHbzh-RWMfAJynqgpQ%40mail.gmail.com
      518442c7
    • Refactor lazy_scan_heap() loop. · 7ab96cf6
      Peter Geoghegan authored
      Add a lazy_scan_heap() subsidiary function that handles heap pruning and
      tuple freezing: lazy_scan_prune().  This is a great deal cleaner.  The
      code that remains in lazy_scan_heap()'s per-block loop can now be
      thought of as code that either comes before or after the call to
      lazy_scan_prune(), which is now the clear focal point.  This division is
      enforced by the way in which we now manage state.  lazy_scan_prune()
      outputs state (using its own struct) that describes what to do with the
      page following pruning and freezing (e.g., visibility map maintenance,
      recording free space in the FSM).  It doesn't get passed any special
      instructional state from the preamble code, though.
      
      Also cleanly separate the logic used by a VACUUM with INDEX_CLEANUP=off
      from the logic used by single-heap-pass VACUUMs.  The former case is now
      structured as the omission of index and heap vacuuming by a two pass
      VACUUM.  The latter case goes back to being used only when the table
      happens to have no indexes (just as it was before commit a96c41fe).
      This structure is much more natural, since the whole point of
      INDEX_CLEANUP=off is to skip the index and heap vacuuming that would
      otherwise take place.  The single-heap-pass case doesn't skip any useful
      work, though -- it just does heap pruning and heap vacuuming together
      when the table happens to have no indexes.
      
      Both of these changes are preparation for an upcoming patch that
      generalizes the mechanism used by INDEX_CLEANUP=off.  The later patch
      will allow VACUUM to give up on index and heap vacuuming dynamically, as
      problems emerge (e.g., with wraparound), so that an affected VACUUM
      operation can finish up as soon as possible.
      
      Also fix a very old bug in single-pass VACUUM VERBOSE output.  We were
      reporting the number of tuples deleted via pruning as a direct
      substitute for reporting the number of LP_DEAD items removed in a
      function that deals with the second pass over the heap.  But that
      doesn't work at all -- they're two different things.
      
      To fix, start tracking the total number of LP_DEAD items encountered
      during pruning, and use that in the report instead.  A single pass
      VACUUM will always vacuum away whatever LP_DEAD items a heap page has
      immediately after it is pruned, so the total number of LP_DEAD items
      encountered during pruning equals the total number vacuumed-away.
      (They are _not_ equal in the INDEX_CLEANUP=off case, but that's okay
      because skipping index vacuuming is now a totally orthogonal concept to
      one-pass VACUUM.)
      
      Also stop reporting the count of LP_UNUSED items in VACUUM VERBOSE
      output.  This makes the output of VACUUM VERBOSE more consistent with
      log_autovacuum's output (because it never showed information about
      LP_UNUSED items).  VACUUM VERBOSE reported LP_UNUSED items left behind
      by the last VACUUM, and LP_UNUSED items created via pruning HOT chains
      during the current VACUUM (it never included LP_UNUSED items left behind
      by the current VACUUM's second pass over the heap).  This makes it
      useless as an indicator of line pointer bloat, which must have been the
      original intention. (Like the first VACUUM VERBOSE issue, this issue was
      arguably an oversight in commit 282d2a03, which added the heap-only
      tuple optimization.)
      
      Finally, stop reporting empty_pages in VACUUM VERBOSE output, and start
      reporting pages_removed instead.  This also makes the output of VACUUM
      VERBOSE more consistent with log_autovacuum's output (which does not
      show empty_pages, but does show pages_removed).  An empty page isn't
      meaningfully different to a page that is almost empty, or a page that is
      empty but for only a small number of remaining LP_UNUSED items.
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reviewed-By: Robert Haas <robertmhaas@gmail.com>
      Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
      Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com
      7ab96cf6
    • Clean up treatment of missing default and CHECK-constraint records. · 091e22b2
      Tom Lane authored
      Andrew Gierth reported that it's possible to crash the backend if no
      pg_attrdef record is found to match an attribute that has atthasdef set.
      AttrDefaultFetch warns about this situation, but then leaves behind
      a relation tupdesc that has null "adbin" pointer(s), which most places
      don't guard against.
      
      We considered promoting the warning to an error, but throwing errors
      during relcache load is pretty drastic: it effectively locks one out
      of using the relation at all.  What seems better is to leave the
      load-time behavior as a warning, but then throw an error in any code
      path that wants to use a default and can't find it.  This confines
      the error to a subset of INSERT/UPDATE operations on the table, and
      in particular will at least allow a pg_dump to succeed.
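
      The "warn at load, error on use" approach described above can be
      illustrated with a small Python sketch.  It is a toy model under
      stated assumptions, not PostgreSQL code: load_relation(),
      default_for() and MissingDefault are hypothetical names, and plain
      dicts stand in for the catalog and the cached relation entry.

```python
import warnings


class MissingDefault(Exception):
    """Raised only when a missing default is actually needed."""


def load_relation(columns_with_default, stored_defaults):
    """Load defaults; warn about missing records instead of failing."""
    defaults = {}
    for column in columns_with_default:
        if column in stored_defaults:
            defaults[column] = stored_defaults[column]
        else:
            warnings.warn(f"no default record found for column {column!r}")
    return defaults


def default_for(defaults, column):
    """Look up a default at use time; this is where the error surfaces."""
    if column not in defaults:
        raise MissingDefault(f"default for column {column!r} is missing")
    return defaults[column]


# Column "b" claims to have a default, but no record exists for it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    relation = load_relation(["a", "b"], {"a": "0"})

print(len(caught))                 # number of warnings raised while loading
print(default_for(relation, "a"))  # unaffected columns keep working
```

      In this sketch, loading never fails, so the relation stays usable;
      only a lookup of the missing default for "b" raises MissingDefault.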
      
      Also, we should fix AttrDefaultFetch to not leave any null pointers
      in the tupdesc, because that just creates an untested bug hazard.
      
      While at it, apply the same philosophy of "warn at load, throw error
      only upon use of the known-missing info" to CHECK constraints.
      CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch,
      but for reasons lost in the mists of time, it was throwing ERROR for
      the same cases that AttrDefaultFetch treats as WARNING.  Make the two
      functions more nearly alike.
      
      In passing, get rid of potentially-O(N^2) loops in equalTupleDesc
      by making AttrDefaultFetch sort the entries after fetching them,
      so that equalTupleDesc can assume that entries in two equal tupdescs
      must be in matching order.  (CheckConstraintFetch already was sorting
      CHECK constraints, but equalTupleDesc hadn't been told about it.)
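
      The effect of sorting at fetch time can be sketched as follows.
      This is an illustration only: fetch_defaults() and equal_defaults()
      are hypothetical stand-ins, and each entry is modeled as a
      (column number, expression) pair.

```python
def fetch_defaults(rows):
    """Return the fetched rows sorted by column number."""
    return sorted(rows, key=lambda row: row[0])


def equal_defaults(a, b):
    """Compare two pre-sorted lists position by position, in O(N)."""
    if len(a) != len(b):
        return False
    return all(x == y for x, y in zip(a, b))


# The same defaults arriving in two different scan orders.
first = fetch_defaults([(3, "now()"), (1, "0"), (2, "'n/a'")])
second = fetch_defaults([(1, "0"), (2, "'n/a'"), (3, "now()")])
print(equal_defaults(first, second))
```

      Without the sort, the comparison would have to search one list for
      each entry of the other, which is where the O(N^2) behavior came
      from.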
      
      There's some argument for back-patching this, but with such a small
      number of field reports, I'm content to fix it in HEAD.
      
      Discussion: https://postgr.es/m/87pmzaq4gx.fsf@news-spur.riddles.org.uk
      091e22b2
    • Stop archive recovery if WAL generated with wal_level=minimal is found. · 9de9294b
      Fujii Masao authored
      Previously, if hot standby was enabled, archive recovery exited with
      an error when it found WAL generated with wal_level=minimal.
      But if hot standby was disabled, it just reported a warning and
      continued, which could lead to data loss or errors during normal
      operation.  Users could easily miss the warning and not notice this
      serious situation until they encountered the actual errors.
      
      To improve this situation, this commit changes archive recovery
      so that it exits with a FATAL error when it finds WAL generated with
      wal_level=minimal, regardless of the hot standby setting.  This lets
      users notice the serious situation promptly.
      
      The FATAL error is thrown if archive recovery starts from a base
      backup taken before wal_level is changed to minimal. When archive
      recovery exits with the error, if users have a base backup taken
      after setting wal_level to higher than minimal, they can recover
      the database by starting archive recovery from that newer backup.
      But note that if no such backup exists, there is no easy way to
      complete archive recovery, which may make the database server
      unstartable, and users may lose the whole database.  The commit adds
      a note about this risk to the documentation.
      
      Previously, even when the database server was unstartable, users
      could avoid the error during archive recovery just by disabling hot
      standby, then forcibly start up the server and salvage data from it.
      Note that this commit makes that procedure unavailable.
      
      Author: Takamichi Osumi
      Reviewed-by: Laurenz Albe, Kyotaro Horiguchi, David Steele, Fujii Masao
      Discussion: https://postgr.es/m/OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com
      9de9294b
    • Mark test_enc_conversion() as STRICT. · c4c393b3
      Heikki Linnakangas authored
      Reported-by: Jaime Casanova, using SQLsmith
      Discussion: https://www.postgresql.org/message-id/20210402235337.GA4082@ahch-to
      c4c393b3
    • pgbench: Function to generate random permutations. · 6b258e3d
      Dean Rasheed authored
      This adds a new function, permute(), that generates pseudorandom
      permutations of arbitrary sizes. This can be used to randomly shuffle
      a set of values to remove unwanted correlations. For example,
      permuting the output from a non-uniform random distribution so that
      all the most common values aren't collocated, allowing more realistic
      tests to be performed.
      
      Formerly, hash() was recommended for this purpose, but that suffers
      from collisions that might alter the distribution, so recommend
      permute() for this purpose instead.
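
      The difference between hashing and permuting can be shown with a toy
      example.  Neither function below is pgbench's algorithm: toy_hash()
      is a deliberately weak hash and toy_permute() is a simple affine
      map, both hypothetical and chosen only to make the collision
      behavior easy to see.

```python
from math import gcd

SIZE = 16  # number of distinct values to shuffle


def toy_hash(value, size=SIZE):
    """A deliberately weak hash: many inputs share an output."""
    return (value * value + 1) % size


def toy_permute(value, size=SIZE, multiplier=5, offset=3):
    """Affine map; a bijection on 0..size-1 when gcd(multiplier, size) == 1."""
    assert gcd(multiplier, size) == 1
    return (multiplier * value + offset) % size


hashed = [toy_hash(v) for v in range(SIZE)]
permuted = [toy_permute(v) for v in range(SIZE)]

print(len(set(hashed)), "distinct outputs from the hash")
print(len(set(permuted)), "distinct outputs from the permutation")
```

      Because the hash maps several inputs to the same output, some values
      become more frequent and others disappear, which alters the
      distribution.  A permutation maps every input to a distinct output,
      so the shape of the distribution is preserved.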
      
      Fabien Coelho and Hironobu Suzuki, with additional hacking by me.
      Reviewed by Thomas Munro, Alvaro Herrera and Muhammad Usama.
      
      Discussion: https://postgr.es/m/alpine.DEB.2.21.1807280944370.5142@lancre
      6b258e3d
    • Adjust input value to WaitEventSetWait() in ExecAppendAsyncEventWait(). · a8af856d
      Etsuro Fujita authored
      Adjust the number of events given to WaitEventSetWait() so that it
      doesn't exceed the maximum number of events in the WaitEventSet given
      to that function (set->nevents_space) in hopes of making the buildfarm
      green.
      
      Per valgrind failure report from Tom Lane and the buildfarm.
      
      Author: Etsuro Fujita
      Discussion: https://postgr.es/m/3411577.1617289776%40sss.pgh.pa.us
      a8af856d
    • ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION · 82ed7748
      Peter Eisentraut authored
      At present, if we want to update the publications in a subscription,
      we can use SET PUBLICATION.  However, it requires supplying all
      existing publications plus the new ones, which is inconvenient when
      we only want to add publications.  With the new syntax, only the new
      publications need to be supplied.  When refresh is true, it refreshes
      only the new publications.
      
      Author: Japin Li <japinli@hotmail.com>
      Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
      Discussion: https://www.postgresql.org/message-id/flat/MEYP282MB166939D0D6C480B7FBE7EFFBB6BC0@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
      82ed7748
    • Fix the tests added by commit ac4645c0. · 266b5673
      Amit Kapila authored
      In the tests, after disabling the subscription, we were not waiting for
      the replication connection to drop from the publisher. So when the test
      tried to use the same slot to fetch the messages via the SQL API, it
      sometimes got an error saying that the replication slot was active for
      another PID.
      
      Per buildfarm.
      266b5673
    • Fix compiler warning in fe-trace.c for MSVC · 9bc9b460
      David Rowley authored
      It seems that in MSVC timeval's tv_sec field is of type long.
      localtime() takes a time_t pointer.  Since long is 32-bit even on 64-bit
      builds in MSVC, passing a long pointer instead of the correct time_t
      pointer generated a compiler warning.  Fix that.
      
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/CAApHDvoRG25X_=ZCGSPb4KN_j2iu=G2uXsRSg8NBZeuhkOSETg@mail.gmail.com
      9bc9b460
    • Change return type of EXTRACT to numeric · a2da77cd
      Peter Eisentraut authored
      The previous implementation of EXTRACT mapped internally to
      date_part(), which returned type double precision (since it was
      implemented long before the numeric type existed).  This can lead to
      imprecise output in some cases, so returning numeric would be
      preferable.  Changing the return type of an existing function is a
      bit risky, so instead we do the following:  We implement a new set of
      functions, which are now called "extract", in parallel to the existing
      date_part functions.  They work the same way internally but use
      numeric instead of float8.  The EXTRACT construct is now mapped by the
      parser to these new extract functions.  That way, dumps of views
      etc. from old versions (which would use date_part) continue to work
      unchanged, but new uses will map to the new extract functions.
      
      Additionally, the reverse compilation of EXTRACT now reproduces the
      original syntax, using the new mechanism introduced in
      40c24bfe.
      
      The following minor changes of behavior result from the new
      implementation:
      
      - The column name from an isolated EXTRACT call is now "extract"
        instead of "date_part".
      
      - Extract from date now rejects inappropriate field names such as
        HOUR.  It was previously mapped internally to extract from
        timestamp, so it would silently accept everything appropriate for
        timestamp.
      
      - Return values when extracting fields with possibly fractional
        values, such as second and epoch, now have the full scale that the
        value has internally (so, for example, '1.000000' instead of just
        '1').
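
      The precision and scale differences can be illustrated outside the
      server.  In this sketch, Python's float stands in for double
      precision and decimal.Decimal stands in for numeric; the helper
      names and the microsecond-count input are assumptions made for the
      example, not a description of how the extract functions are coded.

```python
from decimal import Decimal


def epoch_as_float(microseconds):
    """Binary floating point, analogous to a double precision result."""
    return microseconds / 1_000_000


def epoch_as_decimal(microseconds):
    """Exact decimal with six fractional digits, analogous to numeric."""
    return Decimal(microseconds).scaleb(-6)


one_second = 1_000_000
print(epoch_as_float(one_second))    # float keeps no fixed scale
print(epoch_as_decimal(one_second))  # decimal keeps all six digits

# A large microsecond count is still represented exactly as a decimal.
stamp = 1_617_753_600_123_457
print(epoch_as_decimal(stamp))
```

      The decimal result carries its full scale, and multiplying it back
      by one million recovers the original integer exactly, which a
      binary float cannot guarantee for every input.
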
      Reported-by: Petr Fedorov <petr.fedorov@phystech.edu>
      Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
      a2da77cd
    • Fix typo in pgstat.c. · f5d94e40
      Fujii Masao authored
      Introduced by 98681675.
      
      Author: Vignesh C
      Discussion: https://postgr.es/m/CALDaNm1DqgaLBAJrtGznKk1sR1mH-augmp7LfGvxWwTUhah+rg@mail.gmail.com
      f5d94e40
    • Add function to log the memory contexts of specified backend process. · 43620e32
      Fujii Masao authored
      Commit 3e98c0ba added the pg_backend_memory_contexts view to display
      the memory contexts of the backend process. However, its target process
      is limited to the backend that is accessing the view, so it is
      not convenient for investigating local memory bloat in another
      backend process. To improve this situation, this commit adds the
      pg_log_backend_memory_contexts() function, which requests logging of
      the memory contexts of the specified backend process.
      
      This information can also be collected by calling
      MemoryContextStats(TopMemoryContext) via a debugger. But
      this technique cannot be used in some environments because no debugger
      is available there. So, pg_log_backend_memory_contexts() allows us to
      see the memory contexts of specified backend more easily.
      
      Only superusers are allowed to request to log the memory contexts
      because allowing any user to issue this request at an unbounded rate
      would cause lots of log messages, which could lead to denial of service.
      
      On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
      the target backend logs its memory contexts at LOG_SERVER_ONLY level,
      so that these memory contexts will appear in the server log but not
      be sent to the client. It logs one message per memory context,
      because buffering all memory contexts into a StringInfo to log them
      as one message could require enlarging the buffer considerably
      and lead to an OOM error, since there can be a large number of memory
      contexts in a backend.
      
      When a backend process is consuming huge memory, logging all its
      memory contexts might overrun available disk space. To prevent this,
      this commit limits the number of child contexts to log per parent
      to 100. As with MemoryContextStats(), it supposes that practical cases
      where the log gets long will typically be huge numbers of siblings
      under the same parent context; while the additional debugging value
      from seeing details about individual siblings beyond 100 will not be large.
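
      The per-parent cap can be sketched as a bounded tree walk.  This is
      a simplified model, not the backend's code: log_contexts() is a
      hypothetical helper, the tree is a nested dict, and the wording of
      the summary line is invented for the example.  Only the limit of
      100 comes from the text above.

```python
MAX_CHILDREN = 100  # cap from the commit message


def log_contexts(name, children, max_children=MAX_CHILDREN, level=0):
    """Return one log line per context, summarizing children beyond the cap.

    `children` maps a context name to a dict of its own children.
    """
    lines = [f"{'  ' * level}{name}"]
    for index, (child, grandchildren) in enumerate(children.items()):
        if index >= max_children:
            skipped = len(children) - max_children
            lines.append(f"{'  ' * (level + 1)}{skipped} more child contexts not shown")
            break
        lines.extend(log_contexts(child, grandchildren, max_children, level + 1))
    return lines


# A parent with 150 leaf children, the "many siblings" case described above.
tree = {f"ExprContext {i}": {} for i in range(150)}
lines = log_contexts("TopMemoryContext", tree)
print(len(lines))
```

      With the cap in place, the output grows with the number of distinct
      parents rather than with the number of siblings under one parent.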
      
      The discussion also included another proposed patch, which added a
      function that returns the memory contexts of the specified backend
      as a result set instead of logging them. However, that patch is
      not included in this commit because it had several issues to address.
      
      Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
      Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
      
      Bump catalog version.
      
      Author: Atsushi Torikoshi
      Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
      Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
      43620e32
    • Fix some issues with SSL and Kerberos tests · 5a71964a
      Michael Paquier authored
      The recent refactoring done in c50624cd accidentally broke a portion of
      the kerberos tests checking after a query, so add its functionality
      back.  Some inactive SSL tests had their arguments in an incorrect
      order, which would cause them to fail if they were to run.
      
      Author: Jacob Champion
      Discussion: https://postgr.es/m/4f5b0b3dc0b6fe9ae6a34886b4d4000f61eb567e.camel@vmware.com
      5a71964a
    • Allow pgoutput to send logical decoding messages. · ac4645c0
      Amit Kapila authored
      The output plugin accepts a new parameter (messages) that controls if
      logical decoding messages are written into the replication stream. It is
      useful for those clients that use pgoutput as an output plugin and need
      to process messages that were written by pg_logical_emit_message().
      
      Although logical streaming replication protocol supports logical
      decoding messages now, logical replication does not use this feature yet.
      
      Author: David Pirotte, Euler Taveira
      Reviewed-by: Euler Taveira, Andres Freund, Ashutosh Bapat, Amit Kapila
      Discussion: https://postgr.es/m/CADK3HHJ-+9SO7KuRLH=9Wa1rAo60Yreq1GFNkH_kd0=CdaWM+A@mail.gmail.com
      ac4645c0
    • Refactor function parse_output_parameters. · 531737dd
      Amit Kapila authored
      Instead of using multiple parameters in parse_output_parameters function
      signature, use the struct PGOutputData that encapsulates all pgoutput
      options. It will be useful for future work where we need to add other
      options in pgoutput.
      
      Author: Euler Taveira
      Reviewed-by: Amit Kapila
      Discussion: https://postgr.es/m/CADK3HHJ-+9SO7KuRLH=9Wa1rAo60Yreq1GFNkH_kd0=CdaWM+A@mail.gmail.com
      531737dd
    • Change PostgresNode::connect_fails() to never send down queries · 6d41dd04
      Michael Paquier authored
      This type of failure is similar to what has been fixed in c757a3da,
      where an authentication failure combined with psql pushing a command
      down its communication pipe causes a test failure.  This routine is
      designed to fail, so sending a query makes little sense anyway.
      
      Per buildfarm members gaur and hoverfly, based on an analysis and fix
      from Tom Lane.
      
      Discussion: https://postgr.es/m/513200.1617634642@sss.pgh.pa.us
      6d41dd04
    • Allocate access strategy in parallel VACUUM workers. · f6b8f19a
      Peter Geoghegan authored
      Commit 49f49def took entirely the wrong approach to fixing this issue.
      Just allocate a local buffer access strategy in each individual worker
      instead of trying to propagate state.  This state was never propagated
      by parallel VACUUM in the first place.
      
      It looks like the only reason that this worked following commit 40d964ec
      was that it involved static global variables, which are initialized to 0
      per the C standard.
      
      A more comprehensive fix may be necessary, even on HEAD.  This fix
      should at least get the buildfarm green once again.
      
      Thanks once again to Thomas Munro for continued off-list assistance with
      the issue.
      f6b8f19a
  2. 05 Apr, 2021 9 commits
    • Support INCLUDE'd columns in SP-GiST. · 09c1c6ab
      Tom Lane authored
      Not much to say here: does what it says on the tin.
      We steal a previously-always-zero bit from the nextOffset
      field of leaf index tuples in order to track whether there
      is a nulls bitmap.  Otherwise it works about like included
      columns in other index types.
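
      The bit-stealing idea can be sketched generically.  The layout below
      is hypothetical (a 16-bit field with a 15-bit offset and one flag
      bit) and is not the actual SP-GiST leaf tuple format; it only shows
      why a bit that was previously always zero can be reused without
      disturbing existing values.

```python
# Hypothetical layout: the low 15 bits hold the offset and the top bit
# is the borrowed "has nulls bitmap" flag.
OFFSET_MASK = 0x7FFF
HAS_NULLS_FLAG = 0x8000


def pack(next_offset, has_nulls):
    """Combine an offset and a flag into one 16-bit field."""
    if not 0 <= next_offset <= OFFSET_MASK:
        raise ValueError("offset does not fit in 15 bits")
    return next_offset | (HAS_NULLS_FLAG if has_nulls else 0)


def unpack(field):
    """Split the field back into (offset, has_nulls)."""
    return field & OFFSET_MASK, bool(field & HAS_NULLS_FLAG)


field = pack(42, True)
print(unpack(field))
```

      A field written before the flag existed has a zero top bit, so it
      decodes as the same offset with no nulls bitmap.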
      
      Pavel Borisov, reviewed by Andrey Borodin and Anastasia Lubennikova,
      and rather heavily editorialized on by me
      
      Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com
      09c1c6ab
    • Propagate parallel VACUUM's buffer access strategy. · 49f49def
      Peter Geoghegan authored
      Parallel VACUUM relied on global variable state from the leader process
      being propagated to workers on fork().  Commit b4af70cb removed most
      uses of global variables inside vacuumlazy.c, but did not account for
      the buffer access strategy state.
      
      To fix, propagate the state through shared memory instead.
      
      Per buildfarm failures on elver, curculio, and morepork.
      
      Many thanks to Thomas Munro for off-list assistance with this issue.
      49f49def
    • Simplify state managed by VACUUM. · b4af70cb
      Peter Geoghegan authored
      Reorganize the state struct used by VACUUM -- group related items
      together to make it easier to understand.  Also stop relying on stack
      variables inside lazy_scan_heap() -- move those into the state struct
      instead.  Doing things this way simplifies large groups of related
      functions whose function signatures had a lot of unnecessary redundancy.
      
      Switch over to using int64 for the struct fields used to count things
      that are reported to the user via log_autovacuum and VACUUM VERBOSE
      output.  We were using double, but that doesn't seem to have any
      advantages.  Using int64 makes it possible to add assertions that verify
      that the first pass over the heap (pruning) encounters precisely the
      same number of LP_DEAD items that get deleted from indexes later on, in
      the second pass over the heap.  These assertions will be added in later
      commits.
      
      Finally, adjust the signatures of functions with IndexBulkDeleteResult
      pointer arguments in cases where there was ambiguity about whether or
      not the argument relates to a single index or all indexes.  Functions
      now use the idiom that both ambulkdelete() and amvacuumcleanup() have
      always used (where appropriate): accept a mutable IndexBulkDeleteResult
      pointer argument, and return a result IndexBulkDeleteResult pointer to
      caller.
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
      Reviewed-By: Robert Haas <robertmhaas@gmail.com>
      Discussion: https://postgr.es/m/CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com
      b4af70cb
    • Add pg_read_all_data and pg_write_all_data roles · 6c3ffd69
      Stephen Frost authored
      A commonly requested use-case is to have a role who can run an
      unfettered pg_dump without having to explicitly GRANT that user access
      to all tables, schemas, et al, without that role being a superuser.
      This addresses that by adding a "pg_read_all_data" role which implicitly
      gives any member of this role SELECT rights on all tables, views and
      sequences, and USAGE rights on all schemas.
      
      As there may be cases where it's also useful to have a role who has
      write access to all objects, pg_write_all_data is also introduced and
      gives users implicit INSERT, UPDATE and DELETE rights on all tables,
      views and sequences.
      
      These roles cannot be logged into directly but instead should be
      GRANT'd to a role which is able to log in.  As noted in the
      documentation, if RLS is being used then an administrator may (or may
      not) wish to set BYPASSRLS on the login role which these predefined
      roles are GRANT'd to.
      
      Reviewed-by: Georgios Kokolatos
      Discussion: https://postgr.es/m/20200828003023.GU29590@tamriel.snowman.net
      6c3ffd69
    • Shut down transaction tracking at startup process exit. · ad8b6749
      Fujii Masao authored
      Maxim Orlov reported that the shutdown of a standby server could result
      in the following assertion failure. The cause of this issue was that,
      when the shutdown caused the startup process to exit, recovery-time
      transaction tracking was not shut down even if it was already
      initialized, so some locks held by the tracked transactions could not
      be released. In this situation, if another process was started and was
      assigned the PGPROC entry that the startup process had used, it found
      the unreleased locks during its initialization, causing the assertion
      failure.
      
          TRAP: FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))"
      
      This commit fixes this issue by making the startup process shut down
      transaction tracking and release all locks when it exits.
      
      Back-patch to all supported branches.
      
      Reported-by: Maxim Orlov
      Author: Fujii Masao
      Reviewed-by: Maxim Orlov
      Discussion: https://postgr.es/m/ad4ce692cc1d89a093b471ab1d969b0b@postgrespro.ru
      ad8b6749
    • Align some terms in arch-dev.sgml to glossary · 6734e806
      Alvaro Herrera authored
      This mostly adds links to the glossary to the existing text, instead of
      using <firstterm>.  Heikki left this out of 29ad6595 out of
      stylistic concerns; these have since been addressed.
      
      Author: Jürgen Purtz <juergen@purtz.de>
      Discussion: https://postgr.es/m/67d7240f-8596-83fc-5e15-af06c128a0f5@purtz.de
      6734e806
    • Renumber cursor option flags · a63dd8af
      Peter Eisentraut authored
      Move the planner-control flags up so that there is more room for parse
      options.  Some pending patches need some room there, so do this
      renumbering separately so that there is less potential for conflicts.
      a63dd8af
    • Fix typo in collationcmds.c · 9f6f1f9b
      Michael Paquier authored
      Introduced by 51e225da.
      
      Author: Anton Voloshin
      Discussion: https://postgr.es/m/05477da0-703c-7de7-998c-5879738e8f39@postgrespro.ru
      9f6f1f9b
    • Refactor all TAP test suites doing connection checks · c50624cd
      Michael Paquier authored
      This commit refactors more TAP tests to adapt to the recent
      introduction of connect_ok() and connect_fails() in PostgresNode,
      introduced by 0d1a3343.  This changes the following test suites to use
      the same code paths for connection checks:
      - Kerberos
      - LDAP
      - SSL
      - Authentication
      
      Those routines are extended to be able to handle optional parameters
      that are set depending on each suite's needs, namely:
      - custom SQL query.
      - expected stderr matching pattern.
      - expected stdout matching pattern.
      The new design is extensible with more parameters, and there are some
      plans for those routines in the future with checks based on the contents
      of the backend logs.
      
      Author: Jacob Champion, Michael Paquier
      Discussion: https://postgr.es/m/d17b919e27474abfa55d97786cb9cfadfe2b59e9.camel@vmware.com
      c50624cd
  3. 04 Apr, 2021 8 commits
    • Fix more confusion in SP-GiST. · dfc843d4
      Tom Lane authored
      spg_box_quad_leaf_consistent unconditionally returned the leaf
      datum as leafValue, even though in its usage for poly_ops that
      value is of completely the wrong type.
      
      In versions before 12, that was harmless because the core code did
      nothing with leafValue in non-index-only scans ... but since commit
      2a636834, if we were doing a KNN-style scan, spgNewHeapItem would
      unconditionally try to copy the value using the wrong datatype
      parameters.  Said copying is a waste of time and space if we're not
      going to return the data, but it accidentally failed to fail until
      I fixed the datatype confusion in ac9099fc.
      
      Hence, change spgNewHeapItem to not copy the datum unless we're
      actually going to return it later.  This saves cycles and dodges
      the question of whether lossy opclasses are returning the right
      type.  Also change spg_box_quad_leaf_consistent to not return
      data that might be of the wrong type, as insurance against
      somebody introducing a similar bug into the core code in future.
      
      It seems like a good idea to back-patch these two changes into
      v12 and v13, although I'm afraid to change spgNewHeapItem's
      mistaken idea of which datatype to use in those branches.
      
      Per buildfarm results from ac9099fc.
      
      Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
      dfc843d4
    • Fix confusion in SP-GiST between attribute type and leaf storage type. · ac9099fc
      Tom Lane authored
      According to the documentation, the attType passed to the opclass
      config function (and also relied on by the core code) is the type
      of the heap column or expression being indexed.  But what was
      actually being passed was the type stored for the index column.
      This made no difference for user-defined SP-GiST opclasses,
      because we weren't allowing the STORAGE clause of CREATE OPCLASS
      to be used, so the two types would be the same.  But it's silly
      not to allow that, seeing that the built-in poly_ops opclass
      has a different value for opckeytype than opcintype, and that if you
      want to do lossy storage then the types must really be different.
      (Thus, user-defined opclasses doing lossy storage had to lie about
      what type is in the index.)  Hence, remove the restriction, and make
      sure that we use the input column type not opckeytype where relevant.
      
      For reasons of backwards compatibility with existing user-defined
      opclasses, we can't quite insist that the specified leafType match
      the STORAGE clause; instead just add an amvalidate() warning if
      they don't match.
      
      Also fix some bugs that would only manifest when trying to return
      index entries when attType is different from attLeafType.  It's not
      too surprising that these have not been reported, because the only
      usual reason for such a difference is to store the leaf value
      lossily, rendering index-only scans impossible.
      
      Add a src/test/modules module to exercise cases where attType is
      different from attLeafType and yet index-only scan is supported.
      
      Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
      ac9099fc
    • Fix bug in brin_minmax_multi_union · d9c5b9a9
      Tomas Vondra authored
      When calling sort_expanded_ranges() we need to remember the return
      value, because the function sorts and also deduplicates the ranges. So
      the number of ranges may decrease. brin_minmax_multi_union failed to do
      that, which resulted in crashes due to bogus ranges (equal minval/maxval
      but not marked as compacted).
      
      Reported-by: Jaime Casanova
      Discussion: https://postgr.es/m/20210404052550.GA4376%40ahch-to
      d9c5b9a9
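The class of bug fixed above can be sketched in a few lines of C. This is a minimal illustration, not the BRIN code: `sort_and_dedup` is a hypothetical stand-in for sort_expanded_ranges(), and plain ints stand in for ranges. The point is that a routine which sorts and deduplicates in place returns a possibly smaller element count, and the caller has to adopt that count.

```c
#include <stdlib.h>

/* Comparator for qsort() over ints. */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Hypothetical stand-in for sort_expanded_ranges(): sorts the array in
 * place, drops duplicates, and returns the number of elements that remain.
 * Entries past the returned count are leftovers and must not be used.
 */
static int
sort_and_dedup(int *vals, int n)
{
    int kept;

    if (n <= 1)
        return n;

    qsort(vals, n, sizeof(int), cmp_int);

    kept = 1;
    for (int i = 1; i < n; i++)
    {
        /* keep a value only if it differs from the last one kept */
        if (vals[i] != vals[kept - 1])
            vals[kept++] = vals[i];
    }

    return kept;
}
```

A caller would write `n = sort_and_dedup(vals, n);` and use the updated `n` from then on. Calling the function and discarding the result leaves the caller iterating over stale trailing entries, which is the mistake described in the commit message.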
    • Add regression test for minmax-multi macaddr8 type · 4908684d
      Tomas Vondra authored
      The regression test for BRIN minmax-multi opclasses tested almost all
      supported data types, with the exception of macaddr8. So this adds it.
      4908684d
    • Fix order of parameters in BRIN minmax-multi calls · 1dad2a5e
      Tomas Vondra authored
      The BRIN minmax-multi consistent function incorrectly assumed it could
      look up an operator, and then swap the arguments to get the commutator.
      For example <(a,b) would be called as <(b,a) to get >(a,b). This works
      when the arguments are of the same type, but with cross-type opclasses
      this fails. We can't swap <(float4,float8) arguments, for example.
      
      Fixed by passing arguments in the right order.
      
      Discussion: https://postgr.es/m/CAJKUy5jLZFLCxyxfT%3DMfK5mtPfSzHA1rVLowR-j4RRsFVvKm7A%40mail.gmail.com
      1dad2a5e
    • Fix BRIN minmax-multi distance for inet type · e1fbe118
      Tomas Vondra authored
      The distance calculation ignored the mask, unlike the inet comparator,
      which resulted in negative distance in some cases. Fixed by applying the
      mask in brin_minmax_multi_distance_inet. I've considered simply calling
      inetmi() to calculate the delta, but that does not consider mask either.
      
      Reviewed-by: Zhihong Yu
      Discussion: https://postgr.es/m/1a0a7b9d-9bda-e3a2-7fa4-88f15042a051%40enterprisedb.com
      e1fbe118
    • Fix BRIN minmax-multi distance for timetz type · 7262f242
      Tomas Vondra authored
      The distance calculation ignored the time zone, so the result of (b-a)
      might have ended negative even if (b > a). Fixed by considering the time
      zone difference.
      
      Reported-by: Jaime Casanova
      Discussion: https://postgr.es/m/CAJKUy5jLZFLCxyxfT%3DMfK5mtPfSzHA1rVLowR-j4RRsFVvKm7A%40mail.gmail.com
      7262f242
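A rough sketch of why the time zone matters for the timetz distance. The struct and function names here are hypothetical, not the ones in the BRIN code; the sketch assumes a timetz value holds a local time of day in microseconds plus a zone offset in seconds west of UTC, and that values are ordered by local time plus zone offset.

```c
#include <stdint.h>

#define USECS_PER_SEC INT64_C(1000000)

/* Hypothetical simplified timetz: local time of day plus zone offset. */
typedef struct SketchTimeTz
{
    int64_t time;   /* microseconds since local midnight */
    int32_t zone;   /* seconds west of UTC (negative means east) */
} SketchTimeTz;

/* Value used for ordering: local time shifted by the zone offset. */
static int64_t
timetz_utc_usecs(SketchTimeTz t)
{
    return t.time + (int64_t) t.zone * USECS_PER_SEC;
}

/* Distance consistent with that ordering: non-negative whenever a <= b. */
static int64_t
timetz_distance(SketchTimeTz a, SketchTimeTz b)
{
    return timetz_utc_usecs(b) - timetz_utc_usecs(a);
}
```

For example, 10:00 at UTC+1 sorts before 09:30 at UTC, yet subtracting only the `time` fields gives minus 30 minutes. Including the zone difference gives plus 30 minutes.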
    • Fix BRIN minmax-multi distance for interval type · 2b10e0e3
      Tomas Vondra authored
      The distance calculation for interval type was treating months as having
      31 days, which is inconsistent with the interval comparator (using 30
      days). Due to this it was possible to get a negative distance (b-a) when
      (a<b), triggering an assert.
      
      Fixed by adopting the same logic as interval_cmp_value.
      
      Reported-by: Jaime Casanova
      Discussion: https://postgr.es/m/CAJKUy5jKH0Xhneau2mNftNPtTy-BVgQfXc8zQkEvRvBHfeUThQ%40mail.gmail.com
      2b10e0e3
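The inconsistency can be reproduced with a small C sketch. The type and function names are hypothetical, and the arithmetic uses 64-bit integers for brevity; it only illustrates what happens when the distance function and the comparator disagree on the number of days in a month.

```c
#include <stdint.h>

#define USECS_PER_HOUR INT64_C(3600000000)
#define USECS_PER_DAY  (INT64_C(24) * USECS_PER_HOUR)

/* Hypothetical simplified interval: months, days, and microseconds. */
typedef struct SketchInterval
{
    int32_t months;
    int32_t days;
    int64_t usecs;
} SketchInterval;

/* Flatten an interval to microseconds using the given month length. */
static int64_t
interval_value(SketchInterval iv, int days_per_month)
{
    int64_t days = (int64_t) iv.months * days_per_month + iv.days;

    return days * USECS_PER_DAY + iv.usecs;
}

/* Distance (b - a) computed with the given month length. */
static int64_t
interval_distance(SketchInterval a, SketchInterval b, int days_per_month)
{
    return interval_value(b, days_per_month) - interval_value(a, days_per_month);
}
```

Take a = 1 month and b = 30 days 1 hour. With 30-day months, as in the comparator, a < b and the distance is plus 1 hour. With 31-day months the distance comes out as minus 23 hours, even though a < b still holds for the comparator.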
  4. 03 Apr, 2021 3 commits
    • Improve psql's behavior when the editor is exited without saving. · 55873a00
      Tom Lane authored
      When editing the previous query buffer, if the editor is exited
      without modifying the temp file then clear the query buffer,
      rather than re-loading (and probably re-executing) the previous
      query buffer.  This reduces the probability of accidentally
      re-executing something you didn't intend to.
      
      Similarly, in "\e file", if the file isn't actually modified
      then don't load it into the query buffer.  And in "\ef" and
      "\ev", if no changes are made then clear the query buffer
      instead of loading the function or view definition into it.
      
      Cases where we fail to invoke the editor at all, or it returns
      a nonzero status, are treated like the no-file-modification case.
      
      Laurenz Albe, reviewed by Jacob Champion
      
      Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b.camel@cybertec.at
      55873a00
    • Improve efficiency of wait event reporting, remove proc.h dependency. · 225a22b1
      Andres Freund authored
      pgstat_report_wait_start() and pgstat_report_wait_end() required two
      conditional branches so far. One to check if MyProc is NULL, the other to
      check if pgstat_track_activities is set. As wait events are used around
      comparatively lightweight operations, and are inlined (reducing branch
      predictor effectiveness), that's not great.
      
      The dependency on MyProc has a second disadvantage: Low-level subsystems, like
      storage/file/fd.c, report wait events, but architecturally it is preferable
      for them to not depend on inter-process subsystems like proc.h (defining
      PGPROC).  After this change, including pgstat.h (or, obviously, its
      sub-components like backend_status.h, wait_event.h, ...) no longer pulls in
      IPC-related headers.
      
      These goals, efficiency and abstraction, are achieved by having
      pgstat_report_wait_start/end() not interact with MyProc, but instead a new
      my_wait_event_info variable. At backend startup it points to a local variable,
      removing the need to check for MyProc being NULL. During process
      initialization my_wait_event_info is redirected to MyProc->wait_event_info. At
      shutdown this is reversed. Because wait event reporting now does not need to
      know about where the wait event is stored, it does not need to know about
      PGPROC anymore.
      
      The removal of the branch for checking pgstat_track_activities is simpler:
      Don't check anymore. The cost of the branch is often higher than that of
      the store - and even if not, pgstat_track_activities is rarely disabled.
      
      The main motivator to commit this work now is that removing the (indirect)
      proc.h include from pgstat.h simplifies a patch to move statistics reporting
      to shared memory (which still has a chance to get into 14).
      
      Author: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/20210402194458.2vu324hkk2djq6ce@alap3.anarazel.de
      225a22b1
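The pointer-redirection scheme described in the commit above might look roughly like this. It is a sketch under assumptions: `SketchProc` stands in for PGPROC, and the function names are hypothetical. Only the `my_wait_event_info` variable name and the overall design come from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-backend shared-memory struct (PGPROC). */
typedef struct SketchProc
{
    uint32_t wait_event_info;
} SketchProc;

/* Process-local fallback, used before shared memory is attached. */
static uint32_t local_wait_event_info;

/* Always points at valid storage, so reporting never needs a NULL check. */
static uint32_t *my_wait_event_info = &local_wait_event_info;

/* Reporting is a single unconditional store through the pointer. */
static inline void
report_wait_start(uint32_t info)
{
    *(volatile uint32_t *) my_wait_event_info = info;
}

static inline void
report_wait_end(void)
{
    *(volatile uint32_t *) my_wait_event_info = 0;
}

/* During process initialization: redirect reporting to shared storage. */
static void
attach_wait_event_storage(SketchProc *proc)
{
    my_wait_event_info = &proc->wait_event_info;
}

/* At shutdown: fall back to the process-local variable again. */
static void
detach_wait_event_storage(void)
{
    my_wait_event_info = &local_wait_event_info;
}
```

Because the reporting functions only dereference a `uint32_t *`, code that calls them does not need the definition of the shared struct, which is what removes the header dependency.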
    • Split backend status and progress related functionality out of pgstat.c. · e1025044
      Andres Freund authored
      Backend status (supporting pg_stat_activity) and command
      progress (supporting pg_stat_progress*) related code is largely
      independent from the rest of pgstat.[ch] (supporting views like
      pg_stat_all_tables that accumulate data over time). See also
      a333476b.
      
      This commit doesn't rename the functions to make the distinction
      from the rest of pgstat_ clearer - that'd be more invasive and not
      clearly beneficial. If we were to decide to do such a rename at some
      point, it would be better done separately from moving the code anyway.
      
      Robert's review was of an earlier version.
      Reviewed-By: Robert Haas <robertmhaas@gmail.com>
      Discussion: https://postgr.es/m/20210316195440.twxmlov24rr2nxrg@alap3.anarazel.de
      e1025044