1. 13 Jan, 2021 10 commits
    • Enhance nbtree index tuple deletion. · d168b666
      Peter Geoghegan authored
      Teach nbtree and heapam to cooperate in order to eagerly remove
      duplicate tuples representing dead MVCC versions.  This is "bottom-up
      deletion".  Each bottom-up deletion pass is triggered lazily in response
      to a flood of versions on an nbtree leaf page.  This usually involves a
      "logically unchanged index" hint (these are produced by the executor
      mechanism added by commit 9dc718bd).
      
      The immediate goal of bottom-up index deletion is to avoid "unnecessary"
      page splits caused entirely by version duplicates.  It naturally has an
      even more useful effect, though: it acts as a backstop against
      accumulating an excessive number of index tuple versions for any given
      _logical row_.  Bottom-up index deletion complements what we might now
      call "top-down index deletion": index vacuuming performed by VACUUM.
      Bottom-up index deletion responds to the immediate local needs of
      queries, while leaving it up to autovacuum to perform infrequent clean
      sweeps of the index.  The overall effect is to avoid certain
      pathological performance issues related to "version churn" from UPDATEs.
      
      The previous tableam interface used by index AMs to perform tuple
      deletion (the table_compute_xid_horizon_for_tuples() function) has been
      replaced with a new interface that supports certain new requirements.
      Many (perhaps all) of the capabilities added to nbtree by this commit
      could also be extended to other index AMs.  That is left as work for a
      later commit.
      
      Extend deletion of LP_DEAD-marked index tuples in nbtree by adding logic
      to consider extra index tuples (that are not LP_DEAD-marked) for
      deletion in passing.  This increases the number of index tuples deleted
      significantly in many cases.  The LP_DEAD deletion process (which is now
      called "simple deletion" to clearly distinguish it from bottom-up
      deletion) won't usually need to visit any extra table blocks to check
      these extra tuples.  We have to visit the same table blocks anyway to
      generate a latestRemovedXid value (at least in the common case where the
      index deletion operation's WAL record needs such a value).
      
      Testing has shown that the "extra tuples" simple deletion enhancement
      increases the number of index tuples deleted with almost any workload
      that has LP_DEAD bits set in leaf pages.  That is, it almost never fails
      to delete at least a few extra index tuples.  It helps most of all in
      cases that happen to naturally have a lot of delete-safe tuples.  It's
      not uncommon for an individual deletion operation to end up deleting an
      order of magnitude more index tuples compared to the old naive approach
      (e.g., custom instrumentation of the patch shows that this happens
      fairly often when the regression tests are run).
      
      Add a further enhancement that augments simple deletion and bottom-up
      deletion in indexes that make use of deduplication: Teach nbtree's
      _bt_delitems_delete() function to support granular TID deletion in
      posting list tuples.  It is now possible to delete individual TIDs from
      posting list tuples provided the TIDs have a tableam block number of a
      table block that gets visited as part of the deletion process (visiting
      the table block can be triggered directly or indirectly).  Setting the
      LP_DEAD bit of a posting list tuple is still an all-or-nothing thing,
      but that matters much less now that deletion only needs to start out
      with the right _general_ idea about which index tuples are deletable.
      
      Bump XLOG_PAGE_MAGIC because xl_btree_delete changed.
      
      No bump in BTREE_VERSION, since there are no changes to the on-disk
      representation of nbtree indexes.  Indexes built on PostgreSQL 12 or
      PostgreSQL 13 will automatically benefit from bottom-up index deletion
      (i.e. no reindexing required) following a pg_upgrade.  The enhancement
      to simple deletion is available with all B-Tree indexes following a
      pg_upgrade, no matter what PostgreSQL version the user upgrades from.
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reviewed-By: Heikki Linnakangas <hlinnaka@iki.fi>
      Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
      Discussion: https://postgr.es/m/CAH2-Wzm+maE3apHB8NOtmM=p-DO65j2V5GzAWCOEEuy3JZgb2g@mail.gmail.com
    • Pass down "logically unchanged index" hint. · 9dc718bd
      Peter Geoghegan authored
      Add an executor aminsert() hint mechanism that informs index AMs that
      the incoming index tuple (the tuple that accompanies the hint) is not
      being inserted by execution of an SQL statement that logically modifies
      any of the index's key columns.
      
      The hint is received by indexes when an UPDATE takes place that does not
      apply an optimization like heapam's HOT (though only for indexes where
      all key columns are logically unchanged).  Any index tuple that receives
      the hint on insert is expected to be a duplicate of at least one
      existing older version that is needed for the same logical row.  Related
      versions will typically be stored on the same index page, at least
      within index AMs that apply the hint.
      
      Recognizing the difference between MVCC version churn duplicates and
      true logical row duplicates at the index AM level can help with cleanup
      of garbage index tuples.  Cleanup can intelligently target tuples that
      are likely to be garbage, without wasting too many cycles on less
      promising tuples/pages (index pages with little or no version churn).
      
      This is infrastructure for an upcoming commit that will teach nbtree to
      perform bottom-up index deletion.  No index AM actually applies the hint
      just yet.
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
      Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
    • Log long wait time on recovery conflict when it's resolved. · 39b03690
      Fujii Masao authored
      This is a follow-up of the work done in commit 0650ff23. This commit
      extends log_recovery_conflict_waits so that a log message is produced
      also when recovery conflict has already been resolved after deadlock_timeout
      passes, i.e., when the startup process finishes waiting for recovery
      conflict after deadlock_timeout. This is useful in investigating how long
      recovery conflicts prevented the recovery from applying WAL.
      
      Author: Fujii Masao
      Reviewed-by: Kyotaro Horiguchi, Bertrand Drouvot
      Discussion: https://postgr.es/m/9a60178c-a853-1440-2cdc-c3af916cff59@amazon.com
    • Fix portability issues in the new gist pageinspect test. · 6ecaaf81
      Heikki Linnakangas authored
      1. The raw bytea representation of the point-type keys used in the test
         depends on endianness. Remove the raw key_data column from the test.
      
      2. The items stored on a non-leftmost gist page depend on how many items
         fit on the other pages. This showed up as a failure on 32-bit i386
         systems. To fix, only test the gist_page_items() function on the
         leftmost leaf page.
      
      Per Andrey Borodin and the buildfarm.
      
      Discussion: https://www.postgresql.org/message-id/9FCEC1DC-86FB-4A57-88EF-DD13663B36AF%40yandex-team.ru
    • Remove incorrect markup · e6eeb8d7
      Magnus Hagander authored
      Seems 737d69ff made a copy/paste or automation error resulting in two
      extra right parentheses.
      
      Reported-By: Michael Vastola
      Backpatch-through: 13
      Discussion: https://postgr.es/m/161051035421.12224.1741822783166533529@wrigleys.postgresql.org
    • Don't use elog() in src/port/pwrite.c. · df10ac62
      Thomas Munro authored
      Nothing broke because of this oversight yet, but it would fail to link
      if we tried to use pg_pwrite() in frontend code on a system that lacks
      pwrite().  Use an assertion instead.  Also pgindent while here.
      
      Discussion: https://postgr.es/m/CA%2BhUKGL57RvoQsS35TVPnQoPYqbtBixsdRhynB8NpcUKpHTTtg%40mail.gmail.com
    • Fix memory leak in SnapBuildSerialize. · ee1b38f6
      Amit Kapila authored
      The memory for the snapshot was leaked while serializing it to disk during
      logical decoding. This memory is freed only once walsender stops
      streaming the changes. This can lead to a huge memory increase when the
      master logs Standby Snapshot too frequently, for example when the user is
      trying to create many replication slots.
      
      Reported-by: funnyxj.fxj@alibaba-inc.com
      Diagnosed-by: funnyxj.fxj@alibaba-inc.com
      Author: Amit Kapila
      Backpatch-through: 9.5
      Discussion: https://postgr.es/m/033ab54c-6393-42ee-8ec9-2b399b5d8cde.funnyxj.fxj@alibaba-inc.com
    • Optimize DropRelFileNodesAllBuffers() for recovery. · bea449c6
      Amit Kapila authored
      Similar to commit d6ad34f3, this patch optimizes
      DropRelFileNodesAllBuffers() by avoiding the complete buffer pool scan and
      instead finding the buffers to be invalidated by doing lookups in the
      BufMapping table.
      
      This optimization helps operations where the relation files need to be
      removed like Truncate, Drop, Abort of Create Table, etc.
      
      Author: Kirk Jamison
      Reviewed-by: Kyotaro Horiguchi, Takayuki Tsunakawa, and Amit Kapila
      Tested-By: Haiying Tang
      Discussion: https://postgr.es/m/OSBPR01MB3207DCA7EC725FDD661B3EDAEF660@OSBPR01MB3207.jpnprd01.prod.outlook.com
  2. 12 Jan, 2021 8 commits
    • Invent struct ReindexIndexInfo · c6c4b373
      Alvaro Herrera authored
      This struct is used by ReindexRelationConcurrently to keep track of the
      relations to process.  This saves having to obtain some data repeatedly,
      and has future uses as well.
      Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
      Reviewed-by: Hamid Akhtar <hamid.akhtar@gmail.com>
      Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
      Discussion: https://postgr.es/m/20201130195439.GA24598@alvherre.pgsql
    • pg_dump: label INDEX ATTACH ArchiveEntries with an owner. · 9eabfe30
      Tom Lane authored
      Although a partitioned index's attachment to its parent doesn't
      have separate ownership, the ArchiveEntry for it needs to be
      marked with an owner anyway, to ensure that the ALTER command
      is run by the appropriate role when restoring with
      --use-set-session-authorization.  Without this, the ALTER will
      be run by the role that started the restore session, which will
      usually work but it's formally the wrong thing.
      
      Back-patch to v11 where this type of ArchiveEntry was added.
      In HEAD, add equivalent commentary to the just-added TABLE ATTACH
      case, which I'd made do the right thing already.
      
      Discussion: https://postgr.es/m/1094034.1610418498@sss.pgh.pa.us
    • Doc: fix description of privileges needed for ALTER PUBLICATION. · cc865c0f
      Tom Lane authored
      Adding a table to a publication requires ownership of the table
      (in addition to ownership of the publication).  This was mentioned
      nowhere.
    • Fix thinko in comment · a3e51a36
      Alvaro Herrera authored
      This comment has been wrong since its introduction in commit
      2c03216d.
      
      Author: Masahiko Sawada <sawada.mshk@gmail.com>
      Discussion: https://postgr.es/m/CAD21AoAzz6qipFJBbGEaHmyWxvvNDp8httbwLR9tUQWaTjUs2Q@mail.gmail.com
    • Fix relation descriptor leak. · 044aa9e7
      Amit Kapila authored
      We missed closing the relation descriptor while sending changes via the
      root of partitioned relations during logical replication.
      
      Author: Amit Langote and Mark Zhao
      Reviewed-by: Amit Kapila and Ashutosh Bapat
      Backpatch-through: 13, where it was introduced
      Discussion: https://postgr.es/m/tencent_41FEA657C206F19AB4F406BE9252A0F69C06@qq.com
      Discussion: https://postgr.es/m/tencent_6E296D2F7D70AFC90D83353B69187C3AA507@qq.com
    • Optimize DropRelFileNodeBuffers() for recovery. · d6ad34f3
      Amit Kapila authored
      The recovery path of DropRelFileNodeBuffers() is optimized so that
      scanning of the whole buffer pool can be avoided when the number of
      blocks to be truncated in a relation is below a certain threshold. For
      such cases, we find the buffers by doing lookups in BufMapping table.
      This improves the performance by more than 100 times in many cases
      when several small tables (tested with 1000 relations) are truncated
      and where the server is configured with a large value of shared
      buffers (greater than or equal to 100GB).
      
      This optimization helps cases (a) when vacuum or autovacuum truncated off
      any of the empty pages at the end of a relation, or (b) when the relation is
      truncated in the same transaction in which it was created.
      
      This commit introduces a new API smgrnblocks_cached which returns a cached
      value for the number of blocks in a relation fork. This helps us to determine
      the exact size of relation which is required to apply this optimization. The
      exact size is required to ensure that we don't leave any buffer for the
      relation being dropped as otherwise the background writer or checkpointer
      can lead to a PANIC error while flushing buffers corresponding to files that
      don't exist.
      
      Author: Kirk Jamison based on ideas by Amit Kapila
      Reviewed-by: Kyotaro Horiguchi, Takayuki Tsunakawa, and Amit Kapila
      Tested-By: Haiying Tang
      Discussion: https://postgr.es/m/OSBPR01MB3207DCA7EC725FDD661B3EDAEF660@OSBPR01MB3207.jpnprd01.prod.outlook.com
    • Dump ALTER TABLE ... ATTACH PARTITION as a separate ArchiveEntry. · 9a4c0e36
      Tom Lane authored
      Previously, we emitted the ATTACH PARTITION command as part of
      the child table's ArchiveEntry.  This was a poor choice since it
      complicates restoring the partition as a standalone table; you have
      to ignore the error from the ATTACH, which isn't even an option when
      restoring direct-to-database with pg_restore.  (pg_restore will issue
      the whole ArchiveEntry as one PQexec, so that any error rolls back
      the table creation as well.)  Hence, separate it out as its own
      ArchiveEntry, as indeed we already did for index ATTACH PARTITION
      commands.
      
      Justin Pryzby
      
      Discussion: https://postgr.es/m/20201023052940.GE9241@telsasoft.com
    • Make pg_dump's table of object-type priorities more maintainable. · d5ab79d8
      Tom Lane authored
      Wedging a new object type into this table has historically required
      manually renumbering a lot of existing entries.  (Although it appears
      that some people got lazy and re-used the priority level of an
      existing object type, even if it wasn't particularly related.)
      We can let the compiler do the counting by inventing an enum type that
      lists the desired priority levels in order.  Now, if you want to add
      or remove a priority level, that's a one-liner.
      
      This patch is not purely cosmetic, because I split apart the priorities
      of DO_COLLATION and DO_TRANSFORM, as well as those of DO_ACCESS_METHOD
      and DO_OPERATOR, which look to me to have been merged out of expediency
      rather than because it was a good idea.  Shell types continue to be
      sorted interchangeably with full types, and opclasses interchangeably
      with opfamilies.
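
      The enum technique described above can be sketched roughly as follows.
      This is an illustration only, not pg_dump's actual table: the OBJ_* and
      PRIO_* names, the priority_of() helper, and the ordering shown are all
      hypothetical.

```c
#include <assert.h>

/* Hypothetical object types, standing in for pg_dump's DO_* values. */
typedef enum
{
    OBJ_COLLATION,
    OBJ_TRANSFORM,
    OBJ_ACCESS_METHOD,
    OBJ_OPERATOR,
    OBJ_COUNT                   /* number of object types, not a real type */
} ObjType;

/*
 * Hypothetical priority levels, listed in the desired order.  Only the first
 * value is written out; the compiler numbers the rest.
 */
enum dump_priority
{
    PRIO_COLLATION = 1,
    PRIO_TRANSFORM,
    PRIO_ACCESS_METHOD,
    PRIO_OPERATOR
};

/* Map each object type to its priority level. */
static const int obj_priority[OBJ_COUNT] = {
    [OBJ_COLLATION] = PRIO_COLLATION,
    [OBJ_TRANSFORM] = PRIO_TRANSFORM,
    [OBJ_ACCESS_METHOD] = PRIO_ACCESS_METHOD,
    [OBJ_OPERATOR] = PRIO_OPERATOR
};

/* Look up the priority of an object type. */
int
priority_of(ObjType type)
{
    assert((int) type >= 0 && (int) type < OBJ_COUNT);
    return obj_priority[type];
}
```

      With this layout, wedging a new level between two existing ones means
      adding one enumerator; every later level is renumbered by the compiler.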
  3. 11 Jan, 2021 8 commits
    • Fix function prototypes in dependency.h. · f315205f
      Thomas Munro authored
      Commit 257836a7 accidentally deleted a couple of
      redundant-but-conventional "extern" keywords on function prototypes.
      Put them back.
      Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
    • Rethink SQLSTATE code for ERRCODE_IDLE_SESSION_TIMEOUT. · 4edf9684
      Tom Lane authored
      Move it to class 57 (Operator Intervention), which seems like a
      better choice given that from the client's standpoint it behaves
      a heck of a lot like, e.g., ERRCODE_ADMIN_SHUTDOWN.
      
      In a green field I'd put ERRCODE_IDLE_IN_TRANSACTION_SESSION_TIMEOUT
      here as well.  But that's been around for a few years, so it's
      probably too late to change its SQLSTATE code.
      
      Discussion: https://postgr.es/m/763A0689-F189-459E-946F-F0EC4458980B@hotmail.com
    • Try next host after a "cannot connect now" failure. · c1d58957
      Tom Lane authored
      If a server returns ERRCODE_CANNOT_CONNECT_NOW, try the next host,
      if multiple host names have been provided.  This allows dealing
      gracefully with standby servers that might not be in hot standby mode
      yet.
      
      In the wake of the preceding commit, it might be plausible to retry
      many more error cases than we do now, but I (tgl) am hesitant to
      move too aggressively on that --- it's not clear it'd be desirable
      for cases such as bad-password, for example.  But this case seems
      safe enough.
      
      Hubert Zhang, reviewed by Takayuki Tsunakawa
      
      Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
    • Uniformly identify the target host in libpq connection failure reports. · 52a10224
      Tom Lane authored
      Prefix "could not connect to host-or-socket-path:" to all connection
      failure cases that occur after the socket() call, and remove the
      ad-hoc server identity data that was appended to a few of these
      messages.  This should produce much more intelligible error reports
      in multiple-target-host situations, especially for error cases that
      are off the beaten track to any degree (because none of those provided
      any server identity info).
      
      As an example of the change, formerly a connection attempt with a bad
      port number such as "psql -p 12345 -h localhost,/tmp" might produce
      
      psql: error: could not connect to server: Connection refused
              Is the server running on host "localhost" (::1) and accepting
              TCP/IP connections on port 12345?
      could not connect to server: Connection refused
              Is the server running on host "localhost" (127.0.0.1) and accepting
              TCP/IP connections on port 12345?
      could not connect to server: No such file or directory
              Is the server running locally and accepting
              connections on Unix domain socket "/tmp/.s.PGSQL.12345"?
      
      Now it looks like
      
      psql: error: could not connect to host "localhost" (::1), port 12345: Connection refused
              Is the server running on that host and accepting TCP/IP connections?
      could not connect to host "localhost" (127.0.0.1), port 12345: Connection refused
              Is the server running on that host and accepting TCP/IP connections?
      could not connect to socket "/tmp/.s.PGSQL.12345": No such file or directory
              Is the server running locally and accepting connections on that socket?
      
      This requires adjusting a couple of regression tests to allow for
      variation in the contents of a connection failure message.
      
      Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
    • Allow pg_regress.c wrappers to postprocess test result files. · 800d93f3
      Tom Lane authored
      Add an optional callback to regression_main() that, if provided,
      is invoked on each test output file before we try to compare it
      to the expected-result file.
      
      The main and isolation test programs don't need this (yet).
      In pg_regress_ecpg, add a filter that eliminates target-host
      details from "could not connect" error reports.  This filter
      doesn't do anything as of this commit, but it will be needed
      by the next one.
      
      In the long run we might want to provide some more general,
      perhaps pattern-based, filtering mechanism for test output.
      For now, this will solve the immediate problem.
      
      Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
    • In libpq, always append new error messages to conn->errorMessage. · ffa2e467
      Tom Lane authored
      Previously, we had an undisciplined mish-mash of printfPQExpBuffer and
      appendPQExpBuffer calls to report errors within libpq.  This commit
      establishes a uniform rule that appendPQExpBuffer[Str] should be used.
      conn->errorMessage is reset only at the start of an application request,
      and then accumulates messages till we're done.  We can remove no less
      than three different ad-hoc mechanisms that were used to get the effect
      of concatenation of error messages within a sequence of operations.
      
      Although this makes things quite a bit cleaner conceptually, the main
      reason to do it is to make the world safer for the multiple-target-host
      feature that was added awhile back.  Previously, there were many cases
      in which an error occurring during an individual host connection attempt
      would wipe out the record of what had happened during previous attempts.
      (The reporting is still inadequate, in that it can be hard to tell which
      host got the failure, but that seems like a matter for a separate commit.)
      
      Currently, lo_import and lo_export contain exceptions to the "never
      use printfPQExpBuffer" rule.  If we changed them, we'd risk reporting
      an incidental lo_close failure before the actual read or write
      failure, which would be confusing, not least because lo_close happened
      after the main failure.  We could improve this by inventing an
      internal version of lo_close that doesn't reset the errorMessage; but
      we'd also need a version of PQfn() that does that, and it didn't quite
      seem worth the trouble for now.
      
      Discussion: https://postgr.es/m/BN6PR05MB3492948E4FD76C156E747E8BC9160@BN6PR05MB3492.namprd05.prod.outlook.com
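
      The "reset once, then only append" rule can be illustrated with a small
      standalone buffer. This sketch does not use libpq's PQExpBuffer; ErrBuf,
      errbuf_reset() and errbuf_append() are hypothetical names invented for
      the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size error buffer for this sketch. */
typedef struct
{
    char        data[512];
    size_t      len;
} ErrBuf;

/* Called only at the start of an application request. */
void
errbuf_reset(ErrBuf *buf)
{
    assert(buf != NULL);
    buf->len = 0;
    buf->data[0] = '\0';
}

/* Append one message line; earlier messages are never overwritten. */
void
errbuf_append(ErrBuf *buf, const char *msg)
{
    size_t      room;
    int         n;

    assert(buf != NULL && msg != NULL);
    room = sizeof(buf->data) - buf->len;    /* always at least 1 */
    n = snprintf(buf->data + buf->len, room, "%s\n", msg);
    if (n < 0)
        return;
    /* On truncation snprintf stored room - 1 characters plus the NUL. */
    buf->len += ((size_t) n < room) ? (size_t) n : room - 1;
}
```

      A connection attempt over several hosts would call errbuf_reset() once
      and errbuf_append() after each failed host, so a later failure cannot
      wipe out the record of an earlier one.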
    • Use vectored I/O to fill new WAL segments. · ce6a71fa
      Thomas Munro authored
      Instead of making many block-sized write() calls to fill a new WAL file
      with zeroes, make a smaller number of pwritev() calls (or various
      emulations).  The actual number depends on the OS's IOV_MAX, which
      PG_IOV_MAX currently caps at 32.  That means we'll write 256kB per call
      on typical systems.  We may want to tune the number later with more
      experience.
      Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
      Reviewed-by: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
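
      The batching described in this commit can be sketched as follows. This
      is a minimal standalone illustration, not the PostgreSQL code: it calls
      pwritev() directly (assuming a platform that provides it), and
      BLOCK_SIZE, IOV_CAP, zero_fill() and demo_zero_fill_calls() are
      hypothetical names. With an assumed 8kB block and a cap of 32 iovecs,
      each call writes 256kB.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size for this sketch */
#define IOV_CAP    32           /* assumed cap on iovecs per call */

/*
 * Fill the first nblocks blocks of fd with zeroes.
 * Returns the number of pwritev() calls made, or -1 on error.
 * A short write is treated as an error here to keep the sketch small.
 */
int
zero_fill(int fd, int nblocks)
{
    static char zeroes[BLOCK_SIZE]; /* static storage is zero-initialized */
    struct iovec iov[IOV_CAP];
    int         done = 0;
    int         calls = 0;

    assert(nblocks >= 0);

    /* Every iovec points at the same zeroed buffer. */
    for (int i = 0; i < IOV_CAP; i++)
    {
        iov[i].iov_base = zeroes;
        iov[i].iov_len = BLOCK_SIZE;
    }

    while (done < nblocks)
    {
        int         cnt = (nblocks - done < IOV_CAP) ? nblocks - done : IOV_CAP;
        ssize_t     want = (ssize_t) cnt * BLOCK_SIZE;
        ssize_t     n = pwritev(fd, iov, cnt, (off_t) done * BLOCK_SIZE);

        if (n != want)
            return -1;
        done += cnt;
        calls++;
    }
    return calls;
}

/* Demo: zero-fill a 1MB temporary file (128 blocks); returns the call count. */
int
demo_zero_fill_calls(void)
{
    char        path[] = "/tmp/zerofillXXXXXX";
    int         fd = mkstemp(path);
    int         calls;

    if (fd < 0)
        return -1;
    calls = zero_fill(fd, 128);
    if (lseek(fd, 0, SEEK_END) != (off_t) 128 * BLOCK_SIZE)
        calls = -1;
    close(fd);
    unlink(path);
    return calls;
}
```

      In the demo, 128 blocks are written in batches of 32, so four calls
      replace 128 block-sized write() calls.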
    • Provide pg_preadv() and pg_pwritev(). · 13a021f3
      Thomas Munro authored
      Provide synchronous vectored file I/O routines.  These map to preadv()
      and pwritev(), with fallback implementations for systems that don't have
      them.  Also provide a wrapper pg_pwritev_with_retry() that automatically
      retries on short writes.
      Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
      Reviewed-by: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/CA%2BhUKGJA%2Bu-220VONeoREBXJ9P3S94Y7J%2BkqCnTYmahvZJwM%3Dg%40mail.gmail.com
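
      One way a retry-on-short-write wrapper could look is sketched below.
      This is not the pg_pwritev_with_retry() implementation: it assumes a
      platform with pwritev(), the names advance_iov() and pwritev_retry() are
      hypothetical, and treating a zero-byte result as ENOSPC is this sketch's
      own choice.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Advance an iovec array past "written" bytes that have already been
 * written.  Modifies the caller's array and count in place.
 */
void
advance_iov(struct iovec **iov, int *iovcnt, size_t written)
{
    /* Drop buffers that were written completely (and any empty ones). */
    while (*iovcnt > 0 && written >= (*iov)->iov_len)
    {
        written -= (*iov)->iov_len;
        (*iov)++;
        (*iovcnt)--;
    }
    /* Trim the front of a partially written buffer. */
    if (*iovcnt > 0 && written > 0)
    {
        (*iov)->iov_base = (char *) (*iov)->iov_base + written;
        (*iov)->iov_len -= written;
    }
}

/* Write all buffers at the given offset, retrying after short writes. */
ssize_t
pwritev_retry(int fd, struct iovec *iov, int iovcnt, off_t offset)
{
    ssize_t     total = 0;

    assert(iov != NULL && iovcnt >= 0);
    advance_iov(&iov, &iovcnt, 0);  /* skip leading empty buffers */

    while (iovcnt > 0)
    {
        ssize_t     n = pwritev(fd, iov, iovcnt, offset);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;
        }
        if (n == 0)
        {
            errno = ENOSPC;     /* no progress; sketch's choice of errno */
            return -1;
        }
        total += n;
        offset += n;
        advance_iov(&iov, &iovcnt, (size_t) n);
    }
    return total;
}
```

      Keeping the bookkeeping in advance_iov() lets the short-write arithmetic
      be checked without doing any I/O.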
  4. 09 Jan, 2021 2 commits
  5. 08 Jan, 2021 4 commits
    • Fix plpgsql tests for debug_invalidate_system_caches_always. · 39d4a153
      Tom Lane authored
      Commit c9d52984 resulted in having a couple more places where
      the error context stack for a failure varies depending on
      debug_invalidate_system_caches_always (nee CLOBBER_CACHE_ALWAYS).
      This is not very surprising, since we have to re-parse cached
      plans if the plan cache is clobbered.  Stabilize the expected
      test output by hiding the context stack in these places,
      as we've done elsewhere in this test script.
      
      (Another idea worth considering, now that we have
      debug_invalidate_system_caches_always, is to force it to zero for
      these test cases.  That seems like it'd risk reducing the coverage
      of cache-clobber testing, which might or might not be worth being
      able to verify that we get the expected error output in normal
      cases.  For the moment I just stuck with the existing technique.)
      
      In passing, update comments that referred to CLOBBER_CACHE_ALWAYS.
      
      Per buildfarm member hyrax.
    • Fix ancient bug in parsing of BRE-mode regular expressions. · afcc8772
      Tom Lane authored
      brenext(), when parsing a '*' quantifier, forgot to return any "value"
      for the token; per the equivalent case in next(), it should return
      value 1 to indicate that greedy rather than non-greedy behavior is
      wanted.  The result is that the compiled regexp could behave like 'x*?'
      rather than the intended 'x*', if we were unlucky enough to have
      a zero in v->nextvalue at this point.  That seems to happen with some
      reliability if we have '.*' at the beginning of a BRE-mode regexp,
      although that depends on the initial contents of a stack-allocated
      struct, so it's not guaranteed to fail.
      
      Found by Alexander Lakhin using valgrind testing.  This bug seems
      to be aboriginal in Spencer's code, so back-patch all the way.
      
      Discussion: https://postgr.es/m/16814-6c5e3edd2bdf0d50@postgresql.org
    • Fix and simplify some code related to cryptohashes · 15b824da
      Michael Paquier authored
      This commit addresses two issues:
      - In pgcrypto, MD5 computation called pg_cryptohash_{init,update,final}
      without checking for the result status.
      - Simplify pg_checksum_raw_context to use only one variable for all the
      SHA2 options available in checksum manifests.
      
      Reported-by: Heikki Linnakangas
      Discussion: https://postgr.es/m/f62f26bb-47a5-8411-46e5-4350823e06a5@iki.fi
    • Adjust createdb TAP tests to work on recent OpenBSD. · 9ffe2278
      Tom Lane authored
      We found last February that the error-case tests added by commit
      008cf040 failed on OpenBSD, because that platform doesn't really
      check locale names.  At the time it seemed that that was only an issue
      for LC_CTYPE, but testing on a more recent version of OpenBSD shows
      that it's now equally lax about LC_COLLATE.
      
      Rather than dropping the LC_COLLATE test too, put back LC_CTYPE
      (reverting c4b0edb0), and adjust these tests to accept the different
      error message that we get if setlocale() doesn't reject a bogus locale
      name.  The point of these tests is not really what the backend does
      with the locale name, but to show that createdb quotes funny locale
      names safely; so we're not losing test reliability this way.
      
      Back-patch as appropriate.
      
      Discussion: https://postgr.es/m/231373.1610058324@sss.pgh.pa.us
  6. 07 Jan, 2021 6 commits
  7. 06 Jan, 2021 2 commits
    • Add idle_session_timeout. · 9877374b
      Tom Lane authored
      This GUC variable works much like idle_in_transaction_session_timeout,
      in that it kills sessions that have waited too long for a new client
      query.  But it applies when we're not in a transaction, rather than
      when we are.
      
      Li Japin, reviewed by David Johnston and Hayato Kuroda, some
      fixes by me
      
      Discussion: https://postgr.es/m/763A0689-F189-459E-946F-F0EC4458980B@hotmail.com
    • Improve timeout.c's handling of repeated timeout set/cancel. · 09cf1d52
      Tom Lane authored
      A very common usage pattern is that we set a timeout that we don't
      expect to reach, cancel it after a little bit, and later repeat.
      With the original implementation of timeout.c, this results in one
      setitimer() call per timeout set or cancel.  We can do a lot better
      by being lazy about changing the timeout interrupt request, namely:
      (1) never cancel the outstanding interrupt, even when we have no
      active timeout events;
      (2) if we need to set an interrupt, but there already is one pending
      at or before the required time, leave it alone.  When the interrupt
      happens, the signal handler will reschedule it at whatever time is
      then needed.
      
      For example, with a one-second setting for statement_timeout, this
      method results in having to interact with the kernel only a little
      more than once a second, no matter how many statements we execute
      in between.  The mainline code might never call setitimer() at all
      after the first time, while each time the signal handler fires,
      it sees that the then-pending request is most of a second away,
      and that's when it sets the next interrupt request for.  Each
      mainline timeout-set request after that will observe that the time
      it wants is past the pending interrupt request time, and do nothing.
      
      This also works pretty well for cases where a few different timeout
      lengths are in use, as long as none of them are very short.  But
      that describes our usage well.
      
      Idea and original patch by Thomas Munro; I fixed a race condition
      and improved the comments.
      
      Discussion: https://postgr.es/m/CA+hUKG+o6pbuHBJSGnud=TadsuXySWA7CCcPgCt2QE9F6_4iHQ@mail.gmail.com