1. 15 May, 2021 3 commits
    • Alvaro Herrera's avatar
      Unbreak EXEC_BACKEND build · 354f32d0
      Alvaro Herrera authored
      Per buildfarm
      354f32d0
    • Alvaro Herrera's avatar
      Allow compute_query_id to be set to 'auto' and make it default · cafde58b
      Alvaro Herrera authored
      Allowing only on/off meant that all either all existing configuration
      guides would become obsolete if we disabled it by default, or that we
      would have to accept a performance loss in the default config if we
      enabled it by default.  By allowing 'auto' as a middle ground, the
      performance cost is only paid by those who enable pg_stat_statements and
      similar modules.
      
      I only edited the release notes to comment-out a paragraph that is now
      factually wrong; further edits are probably needed to describe the
      related change in more detail.
      
      Author: Julien Rouhaud <rjuju123@gmail.com>
      Reviewed-by: default avatarJustin Pryzby <pryzby@telsasoft.com>
      Discussion: https://postgr.es/m/20210513002623.eugftm4nk2lvvks3@nol
      cafde58b
    • Tom Lane's avatar
      Be more careful about barriers when releasing BackgroundWorkerSlots. · 30d8bad4
      Tom Lane authored
      ForgetBackgroundWorker lacked any memory barrier at all, while
      BackgroundWorkerStateChange had one but unaccountably did
      additional manipulation of the slot after the barrier.  AFAICS,
      the rule must be that the barrier is immediately before setting
      or clearing slot->in_use.
      
      It looks like back in 9.6 when ForgetBackgroundWorker was first
      written, there might have been some case for not needing a
      barrier there, but I'm not very convinced of that --- the fact
      that the load of bgw_notify_pid is in the caller doesn't seem
      to guarantee no memory ordering problem.  So patch 9.6 too.
      
      It's likely that this doesn't fix any observable bug on Intel
      hardware, but machines with weaker memory ordering rules could
      have problems here.
      
      Discussion: https://postgr.es/m/4046084.1620244003@sss.pgh.pa.us
      30d8bad4
  2. 14 May, 2021 9 commits
    • Peter Geoghegan's avatar
      Harden nbtree deduplication posting split code. · 8f72bbac
      Peter Geoghegan authored
      Add a defensive "can't happen" error to code that handles nbtree posting
      list splits (promote an existing assertion).  This avoids a segfault in
      the event of an insertion of a newitem that is somehow identical to an
      existing non-pivot tuple in the index.  An nbtree index should never
      have two index tuples with identical TIDs.
      
      This scenario is not particular unlikely in the event of any kind of
      corruption that leaves the index in an inconsistent state relative to
      the heap relation that is indexed.  There are two known reports of
      preventable hard crashes.  Doing nothing seems unacceptable given the
      general expectation that nbtree will cope reasonably well with corrupt
      data.
      
      Discussion: https://postgr.es/m/CAH2-Wz=Jr_d-dOYEEmwz0-ifojVNWho01eAqewfQXgKfoe114w@mail.gmail.com
      Backpatch: 13-, where nbtree deduplication was introduced.
      8f72bbac
    • Tom Lane's avatar
      Prevent infinite insertion loops in spgdoinsert(). · c3c35a73
      Tom Lane authored
      Formerly we just relied on operator classes that assert longValuesOK
      to eventually shorten the leaf value enough to fit on an index page.
      That fails since the introduction of INCLUDE-column support (commit
      09c1c6ab), because the INCLUDE columns might alone take up more
      than a page, meaning no amount of leaf-datum compaction will get
      the job done.  At least with spgtextproc.c, that leads to an infinite
      loop, since spgtextproc.c won't throw an error for not being able
      to shorten the leaf datum anymore.
      
      To fix without breaking cases that would otherwise work, add logic
      to spgdoinsert() to verify that the leaf tuple size is decreasing
      after each "choose" step.  Some opclasses might not decrease the
      size on every single cycle, and in any case, alignment roundoff
      of the tuple size could obscure small gains.  Therefore, allow
      up to 10 cycles without additional savings before throwing an
      error.  (Perhaps this number will need adjustment, but it seems
      quite generous right now.)
      
      As long as we've developed this logic, let's back-patch it.
      The back branches don't have INCLUDE columns to worry about, but
      this seems like a good defense against possible bugs in operator
      classes.  We already know that an infinite loop here is pretty
      unpleasant, so having a defense seems to outweigh the risk of
      breaking things.  (Note that spgtextproc.c is actually the only
      known opclass with longValuesOK support, so that this is all moot
      for known non-core opclasses anyway.)
      
      Per report from Dilip Kumar.
      
      Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
      c3c35a73
    • Tom Lane's avatar
      Fix query-cancel handling in spgdoinsert(). · eb7a6b92
      Tom Lane authored
      Knowing that a buggy opclass could cause an infinite insertion loop,
      spgdoinsert() intended to allow its loop to be interrupted by query
      cancel.  However, that never actually worked, because in iterations
      after the first, we'd be holding buffer lock(s) which would cause
      InterruptHoldoffCount to be positive, preventing servicing of the
      interrupt.
      
      To fix, check if an interrupt is pending, and if so fall out of
      the insertion loop and service the interrupt after we've released
      the buffers.  If it was indeed a query cancel, that's the end of
      the matter.  If it was a non-canceling interrupt reason, make use
      of the existing provision to retry the whole insertion.  (This isn't
      as wasteful as it might seem, since any upper-level index tuples we
      already created should be usable in the next attempt.)
      
      While there's no known instance of such a bug in existing release
      branches, it still seems like a good idea to back-patch this to
      all supported branches, since the behavior is fairly nasty if a
      loop does happen --- not only is it uncancelable, but it will
      quickly consume memory to the point of an OOM failure.  In any
      case, this code is certainly not working as intended.
      
      Per report from Dilip Kumar.
      
      Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
      eb7a6b92
    • Tom Lane's avatar
      Refactor CHECK_FOR_INTERRUPTS() to add flexibility. · e47f93f9
      Tom Lane authored
      Split up CHECK_FOR_INTERRUPTS() to provide an additional macro
      INTERRUPTS_PENDING_CONDITION(), which just tests whether an
      interrupt is pending without attempting to service it.  This is
      useful in situations where the caller knows that interrupts are
      blocked, and would like to find out if it's worth the trouble
      to unblock them.
      
      Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether
      CHECK_FOR_INTERRUPTS() can be relied on to clear the pending interrupt.
      
      This commit doesn't actually add any uses of the new macros,
      but a follow-on bug fix will do so.  Back-patch to all supported
      branches to provide infrastructure for that fix.
      
      Alvaro Herrera and Tom Lane
      
      Discussion: https://postgr.es/m/20210513155351.GA7848@alvherre.pgsql
      e47f93f9
    • Alvaro Herrera's avatar
      Describe (auto-)analyze behavior for partitioned tables · 1b5617eb
      Alvaro Herrera authored
      This explains the new behavior introduced by 0827e8af as well as
      preexisting.
      
      Author: Justin Pryzby <pryzby@telsasoft.com>
      Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
      Discussion: https://postgr.es/m/20210423180152.GA17270@telsasoft.com
      1b5617eb
    • Bruce Momjian's avatar
      5eb1b27d
    • Peter Eisentraut's avatar
      Message style improvements · 09ae3299
      Peter Eisentraut authored
      09ae3299
    • Bruce Momjian's avatar
      521d08a2
    • David Rowley's avatar
      Convert misleading while loop into an if condition · 6cb93bed
      David Rowley authored
      This seems to be leftover from ea15e186 and from when we used to evaluate
      SRFs at each node.
      
      Since there is an unconditional "return" at the end of the loop body, only
      1 loop is ever possible, so we can just change this into an if condition.
      
      There is no actual bug being fixed here so no back-patch. It seems fine to
      just fix this anomaly in master only.
      
      Author: Greg Nancarrow
      Discussion: https://postgr.es/m/CAJcOf-d7T1q0az-D8evWXnsuBZjigT04WkV5hCAOEJQZRWy28w@mail.gmail.com
      6cb93bed
  3. 13 May, 2021 8 commits
  4. 12 May, 2021 11 commits
  5. 11 May, 2021 8 commits
  6. 10 May, 2021 1 commit
    • Tom Lane's avatar
      Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists. · 049e1e2e
      Tom Lane authored
      It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE
      list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present.
      If it happens, the ON CONFLICT UPDATE code path would end up storing
      tuples that include the values of the extra resjunk columns.  That's
      fairly harmless in the short run, but if new columns are added to
      the table then the values would become accessible, possibly leading
      to malfunctions if they don't match the datatypes of the new columns.
      
      This had escaped notice through a confluence of missing sanity checks,
      including
      
      * There's no cross-check that a tuple presented to heap_insert or
      heap_update matches the table rowtype.  While it's difficult to
      check that fully at reasonable cost, we can easily add assertions
      that there aren't too many columns.
      
      * The output-column-assignment cases in execExprInterp.c lacked
      any sanity checks on the output column numbers, which seems like
      an oversight considering there are plenty of assertion checks on
      input column numbers.  Add assertions there too.
      
      * We failed to apply nodeModifyTable's ExecCheckPlanOutput() to
      the ON CONFLICT UPDATE tlist.  That wouldn't have caught this
      specific error, since that function is chartered to ignore resjunk
      columns; but it sure seems like a bad omission now that we've seen
      this bug.
      
      In HEAD, the right way to fix this is to make the processing of
      ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists
      now do, that is don't add "SET x = x" entries, and use
      ExecBuildUpdateProjection to evaluate the tlist and combine it with
      old values of the not-set columns.  This adds a little complication
      to ExecBuildUpdateProjection, but allows removal of a comparable
      amount of now-dead code from the planner.
      
      In the back branches, the most expedient solution seems to be to
      (a) use an output slot for the ON CONFLICT UPDATE projection that
      actually matches the target table, and then (b) invent a variant of
      ExecBuildProjectionInfo that can be told to not store values resulting
      from resjunk columns, so it doesn't try to store into nonexistent
      columns of the output slot.  (We can't simply ignore the resjunk columns
      altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.)
      This works back to v10.  In 9.6, projections work much differently and
      we can't cheaply give them such an option.  The 9.6 version of this
      patch works by inserting a JunkFilter when it's necessary to get rid
      of resjunk columns.
      
      In addition, v11 and up have the reverse problem when trying to
      perform ON CONFLICT UPDATE on a partitioned table.  Through a
      further oversight, adjust_partition_tlist() discarded resjunk columns
      when re-ordering the ON CONFLICT UPDATE tlist to match a partition.
      This accidentally prevented the storing-bogus-tuples problem, but
      at the cost that MULTIEXPR_SUBLINK cases didn't work, typically
      crashing if more than one row has to be updated.  Fix by preserving
      resjunk columns in that routine.  (I failed to resist the temptation
      to add more assertions there too, and to do some minor code
      beautification.)
      
      Per report from Andres Freund.  Back-patch to all supported branches.
      
      Security: CVE-2021-32028
      049e1e2e