1. 17 Feb, 2021 2 commits
  2. 16 Feb, 2021 4 commits
    • Tom Lane's avatar
      Convert tsginidx.c's GIN indexing logic to fully ternary operation. · 38bb3aef
      Tom Lane authored
      Commit 2f2007fb did this partially, but there were two remaining
      warts.  checkcondition_gin handled some uncertain cases by setting
      the out-of-band recheck flag, some by returning TS_MAYBE, and some
      by doing both.  Meanwhile, TS_execute arbitrarily converted a
      TS_MAYBE result to TS_YES.  Thus, if checkcondition_gin chose to
      only return TS_MAYBE, the outcome would be TS_YES with no recheck
      flag, potentially resulting in wrong query outputs.
      
      The case where this'd happen is if there were GIN_MAYBE entries
      in the indexscan results passed to gin_tsquery_[tri]consistent,
      which so far as I can see would only happen if the tidbitmap used
      to accumulate indexscan results grew large enough to become lossy.
      
      I initially thought of fixing this by ensuring we always set the
      recheck flag as well as returning TS_MAYBE in uncertain cases.
      But that errs in the other direction, potentially forcing rechecks
      of rows that provably match the query (since the recheck flag
      remains set even if TS_execute later finds that the answer must be
      TS_YES).  Instead, let's get rid of the out-of-band recheck flag
      altogether and rely on returning TS_MAYBE.  This requires exporting
      a version of TS_execute that will actually return the full ternary
      result of the evaluation ... but we likely should have done that
      to start with.
      
      Unfortunately it doesn't seem practical to add a regression test case
      that covers this: the amount of data needed to cause the GIN bitmap to
      become lossy results in a longer runtime than I think we want to have
      in the tests.  (I'm wondering about allowing smaller work_mem settings
      to ameliorate that, but it'd be a matter for a separate patch.)
      
      Per bug #16865 from Dimitri Nüscheler.  Back-patch to v13 where
      the faulty commit came in.
      
      Discussion: https://postgr.es/m/16865-4ffdc3e682e6d75b@postgresql.org
      38bb3aef
    • Amit Kapila's avatar
      Remove the unnecessary PrepareWrite in pgoutput. · f672df5f
      Amit Kapila authored
      This issue exists from the inception of this code (PG-10) but got exposed
      by the recent commit ce0fdbfe where we are using origins in tablesync
      workers. The problem was that we were sometimes sending the prepare_write
      ('w') message but then the actual message was not being sent and on the
      subscriber side, we always expect a message after prepare_write message
      which led to this bug.
      
      I refrained from backpatching this because there is no way in the core
      code to hit this prior to commit ce0fdbfe and we haven't received any
      complaints so far.
      
      Reported-by: Erik Rijkers
      Author: Amit Kapila and Vignesh C
      Tested-by: Erik Rijkers
      Discussion: https://postgr.es/m/1295168140.139428.1613133237154@webmailclassic.xs4all.nl
      f672df5f
    • Andres Freund's avatar
      Fix heap_page_prune() parameter order confusion introduced in dc7420c2. · 8001cb77
      Andres Freund authored
      Both luckily and unluckily the passed values meant the same for all
      types. Luckily because that meant my confusion caused no harm,
      unluckily because otherwise the compiler might have warned...
      
      In passing, synchronize parameter names between definition and
      declaration.
      Reported-By: default avatarPeter Geoghegan <pg@bowt.ie>
      Author: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/CAH2-Wz=L=nBoepQdH9b5Qd0nMvepFT2CnT6sjWvvpOXa=K8HVQ@mail.gmail.com
      8001cb77
    • Andres Freund's avatar
      Remove backwards compat ugliness in snapbuild.c. · a975ff49
      Andres Freund authored
      In 955a684e we fixed a bug in initial snapshot creation. In the
      course of which several members of struct SnapBuild were obsoleted. As
      SnapBuild is serialized to disk we couldn't change the memory layout.
      
      Unfortunately I subsequently forgot about removing the backward compat
      gunk, but luckily Heikki just reminded me.
      
      This commit bumps SNAPBUILD_VERSION, therefore breaking existing
      slots (which is fine in a major release).
      
      Author: Andres Freund
      Reminded-By: default avatarHeikki Linnakangas <hlinnaka@iki.fi>
      Discussion: https://postgr.es/m/c94be044-818f-15e3-1ad3-7a7ae2dfed0a@iki.fi
      a975ff49
  3. 15 Feb, 2021 13 commits
    • Tom Lane's avatar
      Simplify loop logic in nodeIncrementalSort.c. · 0e529031
      Tom Lane authored
      The inner loop in switchToPresortedPrefixMode() can be implemented
      as a conventional integer-counter for() loop, removing a couple of
      redundant boolean state variables.  The old logic here was a remnant
      of earlier development, but as things now stand there's no reason
      for extra complexity.
      
      Also, annotate the test case added by 82e0e293 to explain why it
      manages to hit the corner case fixed in that commit, and add an
      EXPLAIN to verify that it's creating an incremental-sort plan.
      
      Back-patch to v13, like the previous patch.
      
      James Coleman and Tom Lane
      
      Discussion: https://postgr.es/m/16846-ae49f51ac379a4cb@postgresql.org
      0e529031
    • Heikki Linnakangas's avatar
      Make ExecGetInsertedCols() and friends more robust and improve comments. · 54e51dcd
      Heikki Linnakangas authored
      If ExecGetInsertedCols(), ExecGetUpdatedCols() or ExecGetExtraUpdatedCols()
      were called with a ResultRelInfo that's not in the range table and isn't a
      partition routing target, the functions would dereference a NULL pointer,
      relinfo->ri_RootResultRelInfo. Such ResultRelInfos are created when firing
      RI triggers in tables that are not modified directly. None of the current
      callers of these functions pass such relations, so this isn't a live bug,
      but let's make them more robust.
      
      Also update comment in ResultRelInfo; after commit 6214e2b2,
      ri_RangeTableIndex is zero for ResultRelInfos created for partition tuple
      routing.
      
      Noted by Coverity. Backpatch down to v11, like commit 6214e2b2.
      
      Reviewed-by: Tom Lane, Amit Langote
      54e51dcd
    • Fujii Masao's avatar
      Display the time when the process started waiting for the lock, in pg_locks, take 2 · 46d6e5f5
      Fujii Masao authored
      This commit adds new column "waitstart" into pg_locks view. This column
      reports the time when the server process started waiting for the lock
      if the lock is not held. This information is useful, for example, when
      examining the amount of time to wait on a lock by subtracting
      "waitstart" in pg_locks from the current time, and identify the lock
      that the processes are waiting for very long.
      
      This feature uses the current time obtained for the deadlock timeout
      timer as "waitstart" (i.e., the time when this process started waiting
      for the lock). Since getting the current time newly can cause overhead,
      we reuse the already-obtained time to avoid that overhead.
      
      Note that "waitstart" is updated without holding the lock table's
      partition lock, to avoid the overhead by additional lock acquisition.
      This can cause "waitstart" in pg_locks to become NULL for a very short
      period of time after the wait started even though "granted" is false.
      This is OK in practice because we can assume that users are likely to
      look at "waitstart" when waiting for the lock for a long time.
      
      The first attempt of this patch (commit 3b733fcd) caused the buildfarm
      member "rorqual" (built with --disable-atomics --disable-spinlocks) to report
      the failure of the regression test. It was reverted by commit 890d2182.
      The cause of this failure was that the atomic variable for "waitstart"
      in the dummy process entry created at the end of prepare transaction was
      not initialized. This second attempt fixes that issue.
      
      Bump catalog version.
      
      Author: Atsushi Torikoshi
      Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
      Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com
      46d6e5f5
    • Peter Geoghegan's avatar
      Add "LP_DEAD item?" column to GiST pageinspect functions · 9e596b65
      Peter Geoghegan authored
      This brings gist_page_items() and gist_page_items_bytea() in line with
      nbtree's bt_page_items() function.
      
      Minor follow-up to commit 756ab291, which added the GiST functions.
      
      Author: Andrey Borodin <x4mmm@yandex-team.ru>
      Discussion: https://postgr.es/m/E0794687-7315-4C29-A9C7-EC54D448596D@yandex-team.ru
      9e596b65
    • Peter Geoghegan's avatar
      Avoid misinterpreting GiST pages in pageinspect. · fa41cf8f
      Peter Geoghegan authored
      GistPageSetDeleted() sets pd_lower when deleting a page, and sets the
      page contents to a GISTDeletedPageContents.  Avoid treating deleted GiST
      pages as regular slotted pages within pageinspect.
      
      Oversight in commit 756ab291.
      
      Author: Andrey Borodin <x4mmm@yandex-team.ru>
      fa41cf8f
    • Peter Geoghegan's avatar
      Adjust lazy_scan_heap() accounting comments. · 7cde6b13
      Peter Geoghegan authored
      Explain which particular LP_DEAD line pointers get accounted for by the
      tups_vacuumed variable.
      7cde6b13
    • Thomas Munro's avatar
      Default to wal_sync_method=fdatasync on FreeBSD. · f900a79e
      Thomas Munro authored
      FreeBSD 13 gained O_DSYNC, which would normally cause wal_sync_method to
      choose open_datasync as its default value.  That may not be a good
      choice for all systems, and performs worse than fdatasync in some
      scenarios.  Let's preserve the existing default behavior for now.
      
      Like commit 576477e7, which did the same for Linux, back-patch to all
      supported releases.
      
      Discussion: https://postgr.es/m/CA%2BhUKGLsAMXBQrCxCXoW-JsUYmdOL8ALYvaX%3DCrHqWxm-nWbGA%40mail.gmail.com
      f900a79e
    • Thomas Munro's avatar
      Use pg_pwrite() in pg_test_fsync. · 2c8b42b5
      Thomas Munro authored
      For consistency with the PostgreSQL behavior this test program is
      intended to simulate, use pwrite() instead of lseek() + write().
      
      Also fix the final "non-sync" test, which was opening and closing the
      file for every write.
      
      Discussion: https://postgr.es/m/CA%2BhUKGJjjid2BJsvjMALBTduo1ogdx2SPYaTQL3wAy8y2hc4nw%40mail.gmail.com
      2c8b42b5
    • Amit Kapila's avatar
      Fix the warnings introduced in commit ce0fdbfe. · d9b0767b
      Amit Kapila authored
      Author: Amit Kapila
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/1610789.1613170207@sss.pgh.pa.us
      d9b0767b
    • Thomas Munro's avatar
      Hold interrupts while running dsm_detach() callbacks. · 637668fb
      Thomas Munro authored
      While cleaning up after a parallel query or parallel index creation that
      created temporary files, we could be interrupted by a statement timeout.
      The error handling path would then fail to clean up the files when it
      ran dsm_detach() again, because the callback was already popped off the
      list.  Prevent this hazard by holding interrupts while the cleanup code
      runs.
      
      Thanks to Heikki Linnakangas for this suggestion, and also to Kyotaro
      Horiguchi, Masahiko Sawada, Justin Pryzby and Tom Lane for discussion of
      this and earlier ideas on how to fix the problem.
      
      Back-patch to all supported releases.
      Reported-by: default avatarJustin Pryzby <pryzby@telsasoft.com>
      Discussion: https://postgr.es/m/20191212180506.GR2082@telsasoft.com
      637668fb
    • Michael Paquier's avatar
      Add result size as argument of pg_cryptohash_final() for overflow checks · b83dcf79
      Michael Paquier authored
      With its current design, a careless use of pg_cryptohash_final() could
      would result in an out-of-bound write in memory as the size of the
      destination buffer to store the result digest is not known to the
      cryptohash internals, without the caller knowing about that.  This
      commit adds a new argument to pg_cryptohash_final() to allow such sanity
      checks, and implements such defenses.
      
      The internals of SCRAM for HMAC could be tightened a bit more, but as
      everything is based on SCRAM_KEY_LEN with uses particular to this code
      there is no need to complicate its interface more than necessary, and
      this comes back to the refactoring of HMAC in core.  Except that, this
      minimizes the uses of the existing DIGEST_LENGTH variables, relying
      instead on sizeof() for the result sizes.  In ossp-uuid, this also makes
      the code more defensive, as it already relied on dce_uuid_t being at
      least the size of a MD5 digest.
      
      This is in philosophy similar to cfc40d38 for base64.c and aef8948f for
      hex.c.
      
      Reported-by: Ranier Vilela
      Author: Michael Paquier, Ranier Vilela
      Reviewed-by: Kyotaro Horiguchi
      Discussion: https://postgr.es/m/CAEudQAoqEGmcff3J4sTSV-R_16Monuz-UpJFbf_dnVH=APr02Q@mail.gmail.com
      b83dcf79
    • Tom Lane's avatar
      Minor fixes to improve regex debugging code. · 2dd67331
      Tom Lane authored
      When REG_DEBUG is defined, ensure that an un-filled "struct cnfa"
      is all-zeroes, not just that it has nstates == 0.  This is mainly
      so that looking at "struct subre" structs in gdb doesn't distract
      one with a lot of garbage fields during regex compilation.
      
      Adjust some places that print debug output to have suitable fflush
      calls afterwards.
      
      In passing, correct an erroneous ancient comment: the concatenation
      subre-s created by parsebranch() have op == '.' not ','.
      
      Noted while fooling around with some regex performance improvements.
      2dd67331
    • Thomas Munro's avatar
      ReadNewTransactionId() -> ReadNextTransactionId(). · c7ecd6af
      Thomas Munro authored
      The new name conveys the effect better, is more consistent with similar
      functions ReadNextMultiXactId(), ReadNextFullTransactionId(), and
      matches the name of the variable that it reads.
      Reported-by: default avatarPeter Geoghegan <pg@bowt.ie>
      Discussion: https://postgr.es/m/CAH2-WzmVR4SakBXQUdhhPpMf1aYvZCnna5%3DHKa7DAgEmBAg%2B8g%40mail.gmail.com
      c7ecd6af
  4. 13 Feb, 2021 2 commits
  5. 12 Feb, 2021 6 commits
    • Tom Lane's avatar
      Tweak compiler version cutoff for no_sanitize("alignment") support. · ad2ad698
      Tom Lane authored
      Buildfarm results show that gcc up through 7.x produces annoying
      warnings for this construct (and, presumably, wouldn't do the right
      thing anyway).  clang seems okay with the cutoff we have, though.
      
      Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
      Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us
      ad2ad698
    • Tom Lane's avatar
      Avoid divide-by-zero in regex_selectivity() with long fixed prefix. · ae4867ec
      Tom Lane authored
      Given a regex pattern with a very long fixed prefix (approaching 500
      characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can
      underflow to zero.  Typically the preceding selectivity calculation
      would have underflowed as well, so that we compute 0/0 and get NaN.
      In released branches this leads to an assertion failure later on.
      That doesn't happen in HEAD, for reasons I've not explored yet,
      but it's surely still a bug.
      
      To fix, just skip the division when the pow() result is zero, so
      that we'll (most likely) return a zero selectivity estimate.  In
      the edge cases where "sel" didn't yet underflow, perhaps this
      isn't desirable, but I'm not sure that the case is worth spending
      a lot of effort on.  The results of regex_selectivity_sub() are
      barely worth the electrons they're written on anyway :-(
      
      Per report from Alexander Lakhin.  Back-patch to all supported versions.
      
      Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com
      ae4867ec
    • Alexander Korotkov's avatar
      pg_attribute_no_sanitize_alignment() macro · 993bdb9f
      Alexander Korotkov authored
      Modern gcc and clang compilers offer alignment sanitizers, which help to detect
      pointer misalignment.  However, our codebase already contains x86-specific
      crc32 computation code, which uses unalignment access.  Thankfully, those
      compilers also support the attribute, which disables alignment sanitizers at
      the function level.  This commit adds pg_attribute_no_sanitize_alignment(),
      which wraps this attribute, and applies it to pg_comp_crc32c_sse42() function.
      
      Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
      Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us
      Author: Alexander Korotkov, revised by Tom Lane
      Reviewed-by: Tom Lane
      993bdb9f
    • Amit Kapila's avatar
      Fix Subscription test added by commit ce0fdbfe. · c8b21b03
      Amit Kapila authored
      We want to test the variants of Alter Subscription that are not allowed in
      the transaction block but for that, we don't need to create a subscription
      that tries to connect to the publisher. As such, there is no problem with
      this test but it is good to allow such tests to run with
      wal_level = minimal and max_wal_senders = 0 so as to keep them consistent
      with other tests.
      
      Reported by buildfarm.
      
      Author: Amit Kapila
      Reviewed-by: Ajin Cherian
      Discussion: https://postgr.es/m/CAA4eK1Lw0V+e1JPGHDq=+hVACv=14H8sR+2eJ1k3PEgwKmU-jQ@mail.gmail.com
      c8b21b03
    • Amit Kapila's avatar
      Allow multiple xacts during table sync in logical replication. · ce0fdbfe
      Amit Kapila authored
      For the initial table data synchronization in logical replication, we use
      a single transaction to copy the entire table and then synchronize the
      position in the stream with the main apply worker.
      
      There are multiple downsides of this approach: (a) We have to perform the
      entire copy operation again if there is any error (network breakdown,
      error in the database operation, etc.) while we synchronize the WAL
      position between tablesync worker and apply worker; this will be onerous
      especially for large copies, (b) Using a single transaction in the
      synchronization-phase (where we can receive WAL from multiple
      transactions) will have the risk of exceeding the CID limit, (c) The slot
      will hold the WAL till the entire sync is complete because we never commit
      till the end.
      
      This patch solves all the above downsides by allowing multiple
      transactions during the tablesync phase. The initial copy is done in a
      single transaction and after that, we commit each transaction as we
      receive. To allow recovery after any error or crash, we use a permanent
      slot and origin to track the progress. The slot and origin will be removed
      once we finish the synchronization of the table. We also remove slot and
      origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or
      ALTER SUBSCRIPTION .. REFERESH and some of the table syncs are still not
      finished.
      
      The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and
      ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true
      cannot be executed inside a transaction block because they can now drop
      the slots for which we have no provision to rollback.
      
      This will also open up the path for logical replication of 2PC
      transactions on the subscriber side. Previously, we can't do that because
      of the requirement of maintaining a single transaction in tablesync
      workers.
      
      Bump catalog version due to change of state in the catalog
      (pg_subscription_rel).
      
      Author: Peter Smith, Amit Kapila, and Takamichi Osumi
      Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
      Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com
      ce0fdbfe
    • Peter Geoghegan's avatar
      Remove obsolete IndexBulkDeleteResult stats field. · 3063eb17
      Peter Geoghegan authored
      The pages_removed field is no longer used for anything.  It hasn't been
      possible for an index to physically shrink since old-style VACUUM FULL
      was removed by commit 0a469c87.
      3063eb17
  6. 11 Feb, 2021 5 commits
  7. 10 Feb, 2021 7 commits
  8. 09 Feb, 2021 1 commit