1. 01 Nov, 2020 4 commits
    • Tom Lane's avatar
      Fix two issues in TOAST decompression. · dfc79773
      Tom Lane authored
      pglz_maximum_compressed_size() potentially underestimated the amount
      of compressed data required to produce N bytes of decompressed data;
      this is a fault in commit 11a078cf.
      
      Separately from that, pglz_decompress() failed to protect itself
      against corrupt compressed data, particularly off == 0 in a match
      tag.  Commit c60e520f turned such a situation into an infinite loop,
      where before it'd just have resulted in garbage output.
      
      The combination of these two bugs seems like it may explain bug #16694
      from Tom Vijlbrief, though it's impossible to be quite sure without
      direct inspection of the failing session.  (One needs to assume that
      the pglz_maximum_compressed_size() bug caused us to fail to fetch the
      second byte of a match tag, and what happened to be there instead was
      a zero.  The reported infinite loop is hard to explain without off == 0,
      though.)
      
      Aside from fixing the bugs, rewrite associated comments for more
      clarity.
      
      Back-patch to v13 where both these commits landed.
      
      Discussion: https://postgr.es/m/16694-f107871e499ec114@postgresql.org
      dfc79773
    • Tom Lane's avatar
      Avoid null pointer dereference if error result lacks SQLSTATE. · 7f423503
      Tom Lane authored
      Although error results received from the backend should always have
      a SQLSTATE field, ones generated by libpq won't, making this code
      vulnerable to a crash after, say, untimely loss of connection.
      Noted by Coverity.
      
      Oversight in commit 403a3d91.  Back-patch to 9.5, as that was.
      7f423503
    • Michael Paquier's avatar
      Preserve index data in pg_statistic across REINDEX CONCURRENTLY · b17ff07a
      Michael Paquier authored
      Statistics associated to an index got lost after running REINDEX
      CONCURRENTLY, while the non-concurrent case preserves these correctly.
      The concurrent and non-concurrent operations need to be consistent for
      the end-user, and missing statistics would force to wait for a new
      analyze to happen, which could take some time depending on the activity
      of the existing autovacuum workers.  This issue is fixed by copying any
      existing entries in pg_statistic associated to the old index to the new
      one.  Note that this copy is already done with the data of the index in
      the stats collector.
      
      Reported-by: Fabrízio de Royes Mello
      Author: Michael Paquier, Fabrízio de Royes Mello
      Reviewed-by: Justin Pryzby
      Discussion: https://postgr.es/m/CAFcNs+qpFPmiHd1oTXvcPdvAHicJDA9qBUSujgAhUMJyUMb+SA@mail.gmail.com
      Backpatch-through: 12
      b17ff07a
    • Michael Paquier's avatar
      Add error code for encryption failure in pgcrypto · aecaa044
      Michael Paquier authored
      PXE_DECRYPT_FAILED exists already for decryption errors, and an
      equivalent for encryption did not exist.  There is one code path that
      deals with such failures for OpenSSL but it used PXE_ERR_GENERIC, which
      was inconsistent.  This switches this code path to use the new error
      PXE_ENCRYPT_FAILED instead of PXE_ERR_GENERIC, making the code used for
      encryption more consistent with the decryption.
      
      Author: Daniel Gustafsson
      Discussion: https://postgr.es/m/03049139-CB7A-436E-B71B-42696D3E2EF7@yesql.se
      aecaa044
  2. 31 Oct, 2020 2 commits
  3. 30 Oct, 2020 3 commits
  4. 29 Oct, 2020 6 commits
    • Tom Lane's avatar
      Stabilize timetz test across DST transitions. · 4a071afb
      Tom Lane authored
      The timetz test cases I added in commit a9632830 were unintentionally
      sensitive to whether or not DST is active in the PST8PDT time zone.
      Thus, they'll start failing this coming weekend, as reported by
      Bernhard M. Wiedemann in bug #16689.  Fortunately, DST-awareness is
      not significant to the purpose of these test cases, so we can just
      force them all to PDT (DST hours) to preserve stability of the
      results.
      
      Back-patch to v10, as the prior patch was.
      
      Discussion: https://postgr.es/m/16689-57701daa23b377bf@postgresql.org
      4a071afb
    • Tom Lane's avatar
      Don't use custom OID symbols in pg_type.dat, either. · f90149e6
      Tom Lane authored
      On the same reasoning as in commit 36b93121, forbid using custom
      oid_symbol macros in pg_type as well as pg_proc, so that we always
      rely on the predictable macro names generated by genbki.pl.
      
      We do continue to grant grandfather status to the names CASHOID and
      LSNOID, although those are now considered deprecated aliases for the
      preferred names MONEYOID and PG_LSNOID.  This is because there's
      likely to be client-side code using the old names, and this bout of
      neatnik-ism doesn't quite seem worth breaking client code.
      
      There might be a case for grandfathering EVTTRIGGEROID, too, since
      externally-maintained PLs may reference that symbol.  But renaming
      such references to EVENT_TRIGGEROID doesn't seem like a particularly
      heavy lift --- we make far more significant backend API changes in
      every major release.  For now I didn't add that, but we could
      reconsider if there's pushback.
      
      The other names changed here seem pretty unlikely to have any outside
      uses.  Again, we could add alias macros if there are complaints, but
      for now I didn't.
      
      As before, no need for a catversion bump.
      
      John Naylor
      
      Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
      f90149e6
    • Andres Freund's avatar
      Fix wrong data table horizon computation during backend startup. · 1c7675a7
      Andres Freund authored
      When ComputeXidHorizons() was called before MyDatabaseOid is set,
      e.g. because a dead row in a shared relation is encountered during
      InitPostgres(), the horizon for normal tables was computed too
      aggressively, ignoring all backends connected to a database.
      
      During subsequent pruning in a data table the too aggressive horizon
      could end up still being used, possibly leading to still needed tuples
      being removed. Not good.
      
      This is a bug in dc7420c2, which the test added in 94bc27b5 made
      visible, if run with force_parallel_mode set to regress. In that case
      the bug is reliably triggered, because "pruning_query" is run in a
      parallel worker and the start of that parallel worker is likely to
      encounter a dead row in pg_database.
      
      The fix is trivial: Compute a more pessimistic data table horizon if
      MyDatabaseId is not yet known.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20201029040030.p4osrmaywhqaesd4@alap3.anarazel.de
      1c7675a7
    • Amit Kapila's avatar
      Track statistics for streaming of changes from ReorderBuffer. · 8e90ec55
      Amit Kapila authored
      This adds the statistics about transactions streamed to the decoding
      output plugin from ReorderBuffer. Users can query the
      pg_stat_replication_slots view to check these stats and call
      pg_stat_reset_replication_slot to reset the stats of a particular slot.
      Users can pass NULL in pg_stat_reset_replication_slot to reset stats of
      all the slots.
      
      Commit 98681675 has added the basic infrastructure to capture the stats
      of slot and this commit extends the statistics collector to track
      additional information about slots.
      
      Bump the catversion as we have added new columns in the catalog entry.
      
      Author: Ajin Cherian and Amit Kapila
      Reviewed-by: Sawada Masahiko and Dilip Kumar
      Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com
      8e90ec55
    • Andres Freund's avatar
      Centralize horizon determination for temp tables, fixing bug due to skew. · 94bc27b5
      Andres Freund authored
      This fixes a bug in the edge case where, for a temp table, heap_page_prune()
      can end up with a different horizon than heap_vacuum_rel(). Which can trigger
      errors like "ERROR: cannot freeze committed xmax ...".
      
      The bug was introduced due to interaction of a7212be8 "Set cutoff xmin more
      aggressively when vacuuming a temporary table." with dc7420c2 "snapshot
      scalability: Don't compute global horizons while building snapshots.".
      
      The problem is caused by lazy_scan_heap() assuming that the only reason its
      HeapTupleSatisfiesVacuum() call would return HEAPTUPLE_DEAD is if the tuple is
      a HOT tuple, or if the tuple's inserting transaction has aborted since the
      heap_page_prune() call. But after a7212be8 that was also possible in other
      cases for temp tables, because heap_page_prune() uses a different visibility
      test after dc7420c2.
      
      The fix is fairly simple: Move the special case logic for temp tables from
      vacuum_set_xid_limits() to the infrastructure introduced in dc7420c2. That
      ensures that the horizon used for pruning is at least as aggressive as the one
      used by lazy_scan_heap(). The concrete horizon used for temp tables is
      slightly different than the logic in dc7420c2, but should always be as
      aggressive as before (see comments).
      
      A significant benefit to centralizing the logic procarray.c is that now the
      more aggressive horizons for temp tables does not just apply to VACUUM but
      also to e.g. HOT pruning and the nbtree killtuples logic.
      
      Because isTopLevel is not needed by vacuum_set_xid_limits() anymore, I
      undid the the related changes from a7212be8.
      
      This commit also adds an isolation test ensuring that the more aggressive
      vacuuming and pruning of temp tables keeps working.
      Debugged-By: default avatarAmit Kapila <amit.kapila16@gmail.com>
      Debugged-By: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Debugged-By: default avatarAshutosh Sharma <ashu.coek88@gmail.com>
      Author: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/20201014203103.72oke6hqywcyhx7s@alap3.anarazel.de
      Discussion: https://postgr.es/m/20201015083735.derdzysdtqdvxshp@alap3.anarazel.de
      94bc27b5
    • Michael Paquier's avatar
      Fix incorrect placement of pfree() in pg_relation_check_pages() · 60a51c6b
      Michael Paquier authored
      This would cause the function to crash when more than one page is
      considered as broken and reported in the SRF.
      
      Reported-by: Noriyoshi Shinoda
      Discussion: https://postgr.es/m/TU4PR8401MB11523D42C315AAF822E74275EE170@TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM
      60a51c6b
  5. 28 Oct, 2020 10 commits
    • Tom Lane's avatar
      Doc: clean up pg_relation_check_pages() documentation. · b787d4ce
      Tom Lane authored
      Commit f2b88396 did not get the memo about the new formatting
      style for tables documenting built-in functions.  I noticed because
      of a PDF build warning about an overwidth table.
      b787d4ce
    • Tom Lane's avatar
      Doc: clean up verify_heapam() documentation. · 4c49d8fc
      Tom Lane authored
      I started with the intention of just suppressing a PDF build warning
      by removing the example output, but ended up doing more: correcting
      factual errors in the function's signature, moving a bunch of
      generalized handwaving into the "Using amcheck Effectively" section
      which seemed a better place for it, and improving wording and markup
      a little bit.
      
      Discussion: https://postgr.es/m/732904.1603728748@sss.pgh.pa.us
      4c49d8fc
    • Tom Lane's avatar
      Use mode "r" for popen() in psql's evaluate_backtick(). · 66f8687a
      Tom Lane authored
      In almost all other places, we use plain "r" or "w" mode in popen()
      calls (the exceptions being for COPY data).  This one has been
      overlooked (possibly because it's buried in a ".l" flex file?),
      but it's using PG_BINARY_R.
      
      Kensuke Okamura complained in bug #16688 that we fail to strip \r
      when stripping the trailing newline from a backtick result string.
      That's true enough, but we'd also fail to convert embedded \r\n
      cleanly, which also seems undesirable.  Fixing the popen() mode
      seems like the best way to deal with this.
      
      It's been like this for a long time, so back-patch to all supported
      branches.
      
      Discussion: https://postgr.es/m/16688-c649c7b69cd7e6f8@postgresql.org
      66f8687a
    • Tom Lane's avatar
      Calculate extraUpdatedCols in query rewriter, not parser. · ad77039f
      Tom Lane authored
      It's unsafe to do this at parse time because addition of generated
      columns to a table would not invalidate stored rules containing
      UPDATEs on the table ... but there might now be dependent generated
      columns that were not there when the rule was made.  This also fixes
      an oversight that rewriteTargetView failed to update extraUpdatedCols
      when transforming an UPDATE on an updatable view.  (Since the new
      calculation is downstream of that, rewriteTargetView doesn't actually
      need to do anything; but before, there was a demonstrable bug there.)
      
      In v13 and HEAD, this leads to easily-visible bugs because (since
      commit c6679e4f) we won't recalculate generated columns that aren't
      listed in extraUpdatedCols.  In v12 this bitmap is mostly just used
      for trigger-firing decisions, so you'd only notice a problem if a
      trigger cared whether a generated column had been updated.
      
      I'd complained about this back in May, but then forgot about it
      until bug #16671 from Michael Paul Killian revived the issue.
      
      Back-patch to v12 where this field was introduced.  If existing
      stored rules contain any extraUpdatedCols values, they'll be
      ignored because the rewriter will overwrite them, so the bug will
      be fixed even for existing rules.  (But note that if someone were
      to update to 13.1 or 12.5, store some rules with UPDATEs on tables
      having generated columns, and then downgrade to a prior minor version,
      they might observe issues similar to what this patch fixes.  That
      seems unlikely enough to not be worth going to a lot of effort to fix.)
      
      Discussion: https://postgr.es/m/10206.1588964727@sss.pgh.pa.us
      Discussion: https://postgr.es/m/16671-2fa55851859fb166@postgresql.org
      ad77039f
    • Tom Lane's avatar
      Don't use custom OID symbols in pg_proc.dat. · 36b93121
      Tom Lane authored
      We have a perfectly good convention for OID macros for built-in functions
      already, so making custom symbols is just introducing unnecessary
      deviation from the convention.  Remove the one case that had snuck in,
      and add an error check in genbki.pl to discourage future instances.
      
      Although this touches pg_proc.dat, there's no need for a catversion
      bump since the actual catalog data isn't changed.
      
      John Naylor
      
      Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
      36b93121
    • Tom Lane's avatar
      Fix foreign-key selectivity estimation in the presence of constants. · ad1c36b0
      Tom Lane authored
      get_foreign_key_join_selectivity() looks for join clauses that equate
      the two sides of the FK constraint.  However, if we have a query like
      "WHERE fktab.a = pktab.a and fktab.a = 1", it won't find any such join
      clause, because equivclass.c replaces the given clauses with "fktab.a
      = 1 and pktab.a = 1", which can be enforced at the scan level, leaving
      nothing to be done for column "a" at the join level.
      
      We can fix that expectation without much trouble, but then a new problem
      arises: applying the foreign-key-based selectivity rule produces a
      rowcount underestimate, because we're effectively double-counting the
      selectivity of the "fktab.a = 1" clause.  So we have to cancel that
      selectivity out of the estimate.
      
      To fix, refactor process_implied_equality() so that it can pass back the
      new RestrictInfo to its callers in equivclass.c, allowing the generated
      "fktab.a = 1" clause to be saved in the EquivalenceClass's ec_derives
      list.  Then it's not much trouble to dig out the relevant RestrictInfo
      when we need to adjust an FK selectivity estimate.  (While at it, we
      can also remove the expensive use of initialize_mergeclause_eclasses()
      to set up the new RestrictInfo's left_ec and right_ec pointers.
      The equivclass.c code can set those basically for free.)
      
      This seems like clearly a bug fix, but I'm hesitant to back-patch it,
      first because there's some API/ABI risk for extensions and second because
      we're usually loath to destabilize plan choices in stable branches.
      
      Per report from Sigrid Ehrenreich.
      
      Discussion: https://postgr.es/m/1019549.1603770457@sss.pgh.pa.us
      Discussion: https://postgr.es/m/AM6PR02MB5287A0ADD936C1FA80973E72AB190@AM6PR02MB5287.eurprd02.prod.outlook.com
      ad1c36b0
    • Michael Paquier's avatar
      Use correct GetDatum() in pg_relation_check_pages() · ce7f772c
      Michael Paquier authored
      UInt32GetDatum() was getting used, while the result needs
      Int64GetDatum().  Oversight in f2b88396.
      
      Per buildfarm member florican.
      
      Discussion: https://postgr.es/m/1226629.1603859189@sss.pgh.pa.us
      ce7f772c
    • Michael Paquier's avatar
      Add pg_relation_check_pages() to check on-disk pages of a relation · f2b88396
      Michael Paquier authored
      This makes use of CheckBuffer() introduced in c780a7a9, adding a SQL
      wrapper able to do checks for all the pages of a relation.  By default,
      all the fork types of a relation are checked, and it is possible to
      check only a given relation fork.  Note that if the relation given in
      input has no physical storage or is temporary, then no errors are
      generated, allowing full-database checks when coupled with a simple scan
      of pg_class for example.  This is not limited to clusters with data
      checksums enabled, as clusters without data checksums can still apply
      checks on pages using the page headers or for the case of a page full of
      zeros.
      
      This function returns a set of tuples consisting of:
      - The physical file where a broken page has been detected (without the
      segment number as that can be AM-dependent, which can be guessed from
      the block number for heap).  A relative path from PGPATH is used.
      - The block number of the broken page.
      
      By default, only superusers have an access to this function but
      execution rights can be granted to other users.
      
      The feature introduced here is still minimal, and more improvements
      could be done, like:
      - Addition of a start and end block number to run checks on a range
      of blocks, which would apply only if one fork type is checked.
      - Addition of some progress reporting.
      - Throttling, with configuration parameters in function input or
      potentially some cost-based GUCs.
      
      Regression tests are added for positive cases in the main regression
      test suite, and TAP tests are added for cases involving the emulation of
      page corruptions.
      
      Bump catalog version.
      
      Author: Julien Rouhaud, Michael Paquier
      Reviewed-by: Masahiko Sawada, Justin Pryzby
      Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
      f2b88396
    • Michael Paquier's avatar
      Add CheckBuffer() to check on-disk pages without shared buffer loading · c780a7a9
      Michael Paquier authored
      CheckBuffer() is designed to be a concurrent-safe function able to run
      sanity checks on a relation page without loading it into the shared
      buffers.  The operation is done using a lock on the partition involved
      in the shared buffer mapping hashtable and an I/O lock for the buffer
      itself, preventing the risk of false positives due to any concurrent
      activity.
      
      The primary use of this function is the detection of on-disk corruptions
      for relation pages.  If a page is found in shared buffers, the on-disk
      page is checked if not dirty (a follow-up checkpoint would flush a valid
      version of the page if dirty anyway), as it could be possible that a
      page was present for a long time in shared buffers with its on-disk
      version corrupted.  Such a scenario could lead to a corrupted cluster if
      a host is plugged off for example.  If the page is not found in shared
      buffers, its on-disk state is checked.  PageIsVerifiedExtended() is used
      to apply the same sanity checks as when a page gets loaded into shared
      buffers.
      
      This function will be used by an upcoming patch able to check the state
      of on-disk relation pages using a SQL function.
      
      Author: Julien Rouhaud, Michael Paquier
      Reviewed-by:  Masahiko Sawada
      Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
      c780a7a9
    • Amit Kapila's avatar
  6. 27 Oct, 2020 11 commits
  7. 26 Oct, 2020 4 commits