1. 07 Mar, 2019 7 commits
    • Robert Haas's avatar
      Allow ATTACH PARTITION with only ShareUpdateExclusiveLock. · 898e5e32
      Robert Haas authored
      We still require AccessExclusiveLock on the partition itself, because
      otherwise an insert that violates the newly-imposed partition
      constraint could be in progress at the same time that we're changing
      that constraint; only the lock level on the parent relation is
      weakened.
      
      To make this safe, we have to cope with (at least) three separate
      problems. First, relevant DDL might commit while we're in the process
      of building a PartitionDesc.  If so, find_inheritance_children() might
      see a new partition while the RELOID system cache still has the old
      partition bound cached, and even before invalidation messages have
      been queued.  To fix that, if we see that the pg_class tuple seems to
      be missing or to have a null relpartbound, refetch the value directly
      from the table. We can't get the wrong value, because DETACH PARTITION
      still requires AccessExclusiveLock throughout; if we ever want to
      change that, this will need more thought. In testing, I found it quite
      difficult to hit even the null-relpartbound case; the race condition
      is extremely tight, but the theoretical risk is there.
      
      Second, successive calls to RelationGetPartitionDesc might not return
      the same answer.  The query planner will get confused if lookup up the
      PartitionDesc for a particular relation does not return a consistent
      answer for the entire duration of query planning.  Likewise, query
      execution will get confused if the same relation seems to have a
      different PartitionDesc at different times.  Invent a new
      PartitionDirectory concept and use it to ensure consistency.  This
      ensures that a single invocation of either the planner or the executor
      sees the same view of the PartitionDesc from beginning to end, but it
      does not guarantee that the planner and the executor see the same
      view.  Since this allows pointers to old PartitionDesc entries to
      survive even after a relcache rebuild, also postpone removing the old
      PartitionDesc entry until we're certain no one is using it.
      
      For the most part, it seems to be OK for the planner and executor to
      have different views of the PartitionDesc, because the executor will
      just ignore any concurrently added partitions which were unknown at
      plan time; those partitions won't be part of the inheritance
      expansion, but invalidation messages will trigger replanning at some
      point.  Normally, this happens by the time the very next command is
      executed, but if the next command acquires no locks and executes a
      prepared query, it can manage not to notice until a new transaction is
      started.  We might want to tighten that up, but it's material for a
      separate patch.  There would still be a small window where a query
      that started just after an ATTACH PARTITION command committed might
      fail to notice its results -- but only if the command starts before
      the commit has been acknowledged to the user. All in all, the warts
      here around serializability seem small enough to be worth accepting
      for the considerable advantage of being able to add partitions without
      a full table lock.
      
      Although in general the consequences of new partitions showing up
      between planning and execution are limited to the query not noticing
      the new partitions, run-time partition pruning will get confused in
      that case, so that's the third problem that this patch fixes.
      Run-time partition pruning assumes that indexes into the PartitionDesc
      are stable between planning and execution.  So, add code so that if
      new partitions are added between plan time and execution time, the
      indexes stored in the subplan_map[] and subpart_map[] arrays within
      the plan's PartitionedRelPruneInfo get adjusted accordingly.  There
      does not seem to be a simple way to generalize this scheme to cope
      with partitions that are removed, mostly because they could then get
      added back again with different bounds, but it works OK for added
      partitions.
      
      This code does not try to ensure that every backend participating in
      a parallel query sees the same view of the PartitionDesc.  That
      currently doesn't matter, because we never pass PartitionDesc
      indexes between backends.  Each backend will ignore the concurrently
      added partitions which it notices, and it doesn't matter if different
      backends are ignoring different sets of concurrently added partitions.
      If in the future that matters, for example because we allow writes in
      parallel query and want all participants to do tuple routing to the same
      set of partitions, the PartitionDirectory concept could be improved to
      share PartitionDescs across backends.  There is a draft patch to
      serialize and restore PartitionDescs on the thread where this patch
      was discussed, which may be a useful place to start.
      
      Patch by me.  Thanks to Alvaro Herrera, David Rowley, Simon Riggs,
      Amit Langote, and Michael Paquier for discussion, and to Alvaro
      Herrera for some review.
      
      Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com
      Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com
      Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com
      898e5e32
    • Alvaro Herrera's avatar
      Fix broken markup · ec51727f
      Alvaro Herrera authored
      ec51727f
    • Alvaro Herrera's avatar
      Fix the BY {REF,VALUE} clause of XMLEXISTS/XMLTABLE · eaaa5986
      Alvaro Herrera authored
      This clause is used to indicate the passing mode of a XML document, but
      we were doing it wrong: we accepted BY REF and ignored it, and rejected
      BY VALUE as a syntax error.  The reality, however, is that documents are
      always passed BY VALUE, so rejecting that clause was silly.  Change
      things so that we accept BY VALUE.
      
      BY REF continues to be accepted, and continues to be ignored.
      
      Author: Chapman Flack
      Reviewed-by: Pavel Stehule
      Discussion: https://postgr.es/m/5C297BB7.9070509@anastigmatix.net
      eaaa5986
    • Alvaro Herrera's avatar
      Add missing <limits.h> · cb706ec4
      Alvaro Herrera authored
      Per buildfarm
      cb706ec4
    • Alvaro Herrera's avatar
      pg_dump: allow multiple rows per insert · 7e413a0f
      Alvaro Herrera authored
      This is useful to speed up loading data in a different database engine.
      
      Authors: Surafel Temesgen and David Rowley.  Lightly edited by Álvaro.
      Reviewed-by: Fabien Coelho
      Discussion: https://postgr.es/m/CALAY4q9kumSdnRBzvRJvSRf2+BH20YmSvzqOkvwpEmodD-xv6g@mail.gmail.com
      7e413a0f
    • Thomas Munro's avatar
      Remove useless header inclusion. · 42210524
      Thomas Munro authored
      42210524
    • Thomas Munro's avatar
      Drop the vestigial "smgr" type. · 91595f9d
      Thomas Munro authored
      Before commit 3fa2bb31 this type appeared in the catalogs to
      select which of several block storage mechanisms each relation
      used.
      
      New features under development propose to revive the concept of
      different block storage managers for new kinds of data accessed
      via bufmgr.c, but don't need to put references to them in the
      catalogs.  So, avoid useless maintenance work on this type by
      dropping it.  Update some regression tests that were referencing
      it where any type would do.
      
      Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
      91595f9d
  2. 06 Mar, 2019 12 commits
  3. 05 Mar, 2019 4 commits
  4. 04 Mar, 2019 9 commits
  5. 03 Mar, 2019 4 commits
    • Andrew Dunstan's avatar
      Don't do pg_ctl logrotate test on Windows · d611175e
      Andrew Dunstan authored
      The test crashes and burns quite badly, for some reason, but even if it
      didn't it wouldn't work, since Windows doesn't let you rename a file
      held by a running process.
      d611175e
    • Tom Lane's avatar
      Improve performance of index-only scans with many index columns. · 80b9e9c4
      Tom Lane authored
      StoreIndexTuple was a loop over index_getattr, which is O(N^2)
      if the index columns are variable-width, and the performance
      impact is already quite visible at ten columns.  The obvious
      move is to replace that with a call to index_deform_tuple ...
      but that's *also* a loop over index_getattr.  Improve it to
      be essentially a clone of heap_deform_tuple.
      
      (There are a few other places that loop over all index columns
      with index_getattr, and perhaps should be changed likewise,
      but most of them don't seem performance-critical.  Anyway, the
      rest would mostly only be interested in the index key columns,
      which there aren't likely to be so many of.  Wide index tuples
      are a new thing with INCLUDE.)
      
      Konstantin Knizhnik
      
      Discussion: https://postgr.es/m/e06b2d27-04fc-5c0e-bb8c-ecd72aa24959@postgrespro.ru
      80b9e9c4
    • Andrew Dunstan's avatar
      Avoid accidental wildcard expansion in msys shell · 78b408a2
      Andrew Dunstan authored
      Commit f092de05 added a test for pg_dumpall --exclude-database including
      the wildcard pattern '*dump*' which matches some files in the source
      directory. The test library on msys uses the shell which expands this
      and thus the program gets incorrect arguments. This doesn't happen if
      the pattern doesn't match any files, so here the pattern is set to
      '*dump_test*' which is such a pattern.
      
      Per buildfarm animal jacana.
      78b408a2
    • Dean Rasheed's avatar
      Further fixing for multi-row VALUES lists for updatable views. · ed4653db
      Dean Rasheed authored
      Previously, rewriteTargetListIU() generated a list of attribute
      numbers from the targetlist, which were passed to rewriteValuesRTE(),
      which expected them to contain the same number of entries as there are
      columns in the VALUES RTE, and to be in the same order. That was fine
      when the target relation was a table, but for an updatable view it
      could be broken in at least three different ways ---
      rewriteTargetListIU() could insert additional targetlist entries for
      view columns with defaults, the view columns could be in a different
      order from the columns of the underlying base relation, and targetlist
      entries could be merged together when assigning to elements of an
      array or composite type. As a result, when recursing to the base
      relation, the list of attribute numbers generated from the rewritten
      targetlist could no longer be relied upon to match the columns of the
      VALUES RTE. We got away with that prior to 41531e42 because it used
      to always be the case that rewriteValuesRTE() did nothing for the
      underlying base relation, since all DEFAULTS had already been replaced
      when it was initially invoked for the view, but that was incorrect
      because it failed to apply defaults from the base relation.
      
      Fix this by examining the targetlist entries more carefully and
      picking out just those that are simple Vars referencing the VALUES
      RTE. That's sufficient for the purposes of rewriteValuesRTE(), which
      is only responsible for dealing with DEFAULT items in the VALUES
      RTE. Any DEFAULT item in the VALUES RTE that doesn't have a matching
      simple-Var-assignment in the targetlist is an error which we complain
      about, but in theory that ought to be impossible.
      
      Additionally, move this code into rewriteValuesRTE() to give a clearer
      separation of concerns between the 2 functions. There is no need for
      rewriteTargetListIU() to know about the details of the VALUES RTE.
      
      While at it, fix the comment for rewriteValuesRTE() which claimed that
      it doesn't support array element and field assignments --- that hasn't
      been true since a3c7a993 (9.6 and later).
      
      Back-patch to all supported versions, with minor differences for the
      pre-9.6 branches, which don't support array element and field
      assignments to the same column in multi-row VALUES lists.
      
      Reviewed by Amit Langote.
      
      Discussion: https://postgr.es/m/15623-5d67a46788ec8b7f@postgresql.org
      ed4653db
  6. 02 Mar, 2019 2 commits
  7. 01 Mar, 2019 2 commits
    • Tom Lane's avatar
      Check we don't misoptimize a NOT IN where the subquery returns no rows. · 3396138a
      Tom Lane authored
      Future-proofing against a common mistake in attempts to optimize NOT IN.
      We don't have such an optimization right now, but attempts to do so
      are in the works, and some of 'em are buggy.  Add a regression test case
      covering the point.
      
      David Rowley
      
      Discussion: https://postgr.es/m/CAKJS1f90E9agVZryVyUpbHQbjTt5ExqS2Fsodmt5_A7E_cEyVA@mail.gmail.com
      3396138a
    • Tom Lane's avatar
      Teach optimizer's predtest.c more things about ScalarArrayOpExpr. · 65ce07e0
      Tom Lane authored
      In particular, make it possible to prove/refute "x IS NULL" and
      "x IS NOT NULL" predicates from a clause involving a ScalarArrayOpExpr
      even when we are unable or unwilling to deconstruct the expression
      into an AND/OR tree.  This avoids a former unexpected degradation of
      plan quality when the size of an ARRAY[] expression or array constant
      exceeded the arbitrary MAX_SAOP_ARRAY_SIZE limit.  For IS-NULL proofs,
      we don't really care about the values of the individual array elements;
      at most, we care whether there are any, and for some common cases we
      needn't even know that.
      
      The main user-visible effect of this is to let the optimizer recognize
      applicability of partial indexes with "x IS NOT NULL" predicates to
      queries with "x IN (array)" clauses in some cases where it previously
      failed to recognize that.  The structure of predtest.c is such that a
      bunch of related proofs will now also succeed, but they're probably
      much less useful in the wild.
      
      James Coleman, reviewed by David Rowley
      
      Discussion: https://postgr.es/m/CAAaqYe8yKSvzbyu8w-dThRs9aTFMwrFxn_BkTYeXgjqe3CbNjg@mail.gmail.com
      65ce07e0