1. 08 Oct, 2018 4 commits
  2. 07 Oct, 2018 2 commits
    • Tom Lane's avatar
      Remove some unnecessary fields from Plan trees. · 52ed730d
      Tom Lane authored
      In the wake of commit f2343653, we no longer need some fields that
      were used before to control executor lock acquisitions:
      
      * PlannedStmt.nonleafResultRelations can go away entirely.
      
      * partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
      However, ModifyTable still needs to know the RT index of the partition
      root table if any, which was formerly kept in the first entry of that
      list.  Add a new field "rootRelation" to remember that.  rootRelation is
      partly redundant with nominalRelation, in that if it's set it will have
      the same value as nominalRelation.  However, the latter field has a
      different purpose so it seems best to keep them distinct.
      
      Amit Langote, reviewed by David Rowley and Jesper Pedersen,
      and whacked around a bit more by me
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
      52ed730d
    • Alvaro Herrera's avatar
      Fix catalog insertion order for ATTACH PARTITION · 39808e88
      Alvaro Herrera authored
      Commit 2fbdf1b3 changed the order in which we inserted catalog rows
      when creating partitions, so that we could remove an unsightly hack
      required for untimely relcache invalidations.  However, that commit only
      changed the ordering for CREATE TABLE PARTITION OF, and left ALTER TABLE
      ATTACH PARTITION unchanged, so the latter can be affected when catalog
      invalidations occur, for instance when the partition key involves an SQL
      function.
      
      Reported-by: Rajkumar Raghuwanshi
      Author: Amit Langote
      Reviewed-by: Michaël Paquier
      Discussion: https://postgr.es/m/CAKcux6=nTz9KSfTr_6Z2mpzLJ_09JN-rK6=dWic6gGyTSWueyQ@mail.gmail.com
      39808e88
  3. 06 Oct, 2018 7 commits
    • Alvaro Herrera's avatar
      Fix event triggers for partitioned tables · ad08006b
      Alvaro Herrera authored
      Index DDL cascading on partitioned tables introduced a way for ALTER
      TABLE to be called reentrantly.  This caused an important deficiency
      in event trigger support to be exposed: on exiting the reentrant call,
      the alter table state object was clobbered, causing a crash when the
      outer alter table tries to finalize its processing.  Fix the crash by
      creating a stack of event trigger state objects.  There are still ways
      to cause things to misbehave (and probably other crashers) with more
      elaborate tricks, but at least it now doesn't crash in the obvious
      scenario.
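
The idea of keeping a stack of state objects can be illustrated with a small model. This is only a sketch, not the committed C code: the class and method names (AlterTableState, EventTriggerContext, alter_table_start, and so on) are hypothetical. Each reentrant ALTER TABLE pushes a fresh state that links back to its parent, so finishing the inner call restores the outer state instead of clobbering it.

```python
class AlterTableState:
    """Hypothetical stand-in for the per-command ALTER TABLE state object."""

    def __init__(self, table, parent=None):
        self.table = table
        self.subcommands = []
        self.parent = parent  # state of the enclosing (outer) ALTER TABLE, if any


class EventTriggerContext:
    """Hypothetical collector that tolerates reentrant ALTER TABLE calls."""

    def __init__(self):
        self.current = None
        self.finished = []

    def alter_table_start(self, table):
        # Push: remember the outer state rather than overwriting it.
        self.current = AlterTableState(table, parent=self.current)

    def collect(self, subcommand):
        self.current.subcommands.append(subcommand)

    def alter_table_end(self):
        done = self.current
        self.finished.append((done.table, done.subcommands))
        # Pop: the outer call gets its own state back.
        self.current = done.parent


def run_nested_alter():
    ctx = EventTriggerContext()
    ctx.alter_table_start("parent_tbl")
    ctx.collect("ADD COLUMN on parent_tbl")
    ctx.alter_table_start("parent_tbl_idx")  # reentrant call from index cascading
    ctx.collect("cascaded index change")
    ctx.alter_table_end()
    ctx.collect("finalize parent_tbl")  # outer state is still intact here
    ctx.alter_table_end()
    return ctx
```

With a single shared state object instead of the parent link, the last collect() call would have nothing valid to append to, which is the kind of failure described above.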
      
      Backpatch to 9.5, where DDL deparsing of event triggers was introduced.
      
      Reported-by: Marco Slot
      Authors: Michaël Paquier, Álvaro Herrera
      Discussion: https://postgr.es/m/CANNhMLCpi+HQ7M36uPfGbJZEQLyTy7XvX=5EFkpR-b1bo0uJew@mail.gmail.com
      ad08006b
    • Tom Lane's avatar
      Restore sane locking behavior during parallel query. · 29ef2b31
      Tom Lane authored
      Commit 9a3cebea changed things so that parallel workers didn't obtain
      any lock of their own on tables they access.  That was clearly a bad
      idea, but I'd mistakenly supposed that it was the intended end result
      of the series of patches for simplifying the executor's lock management.
      Undo that change in relation_open(), and adjust ExecOpenScanRelation()
      so that it gets the correct lock if inside a parallel worker.
      
      In passing, clean up some more obsolete comments about when locks
      are acquired.
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
      29ef2b31
    • Tom Lane's avatar
      Remove more redundant relation locking during executor startup. · f2343653
      Tom Lane authored
      We already have appropriate locks on every relation listed in the
      query's rangetable before we reach the executor.  Take the next step
      in exploiting that knowledge by removing code that worries about
      taking locks on non-leaf result relations in a partitioned table.
      
      In particular, get rid of ExecLockNonLeafAppendTables and a stanza in
      InitPlan that asserts we already have locks on certain such tables.
      
      In passing, clean up some now-obsolete comments in InitPlan.
      
      Amit Langote, reviewed by David Rowley and Jesper Pedersen,
      and whacked around a bit more by me
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
      f2343653
    • Tom Lane's avatar
      Don't use is_infinite() where isinf() will do. · 0209f028
      Tom Lane authored
      Places that aren't testing for sign should not use the more expensive
      function; it's just wasteful, not to mention being a cognitive load
      for readers who may know what isinf() is but not is_infinite().
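
For readers unfamiliar with the distinction, the following sketch models it in Python. It assumes that is_infinite() reports the sign of an infinity (+1, -1, or 0 for finite values), which is what "testing for sign" suggests; that return convention is an assumption here, not quoted from the source.

```python
import math


def is_infinite(val: float) -> int:
    """Assumed model of is_infinite(): 0 if finite or NaN, else the sign of the infinity."""
    if not math.isinf(val):
        return 0
    return 1 if val > 0 else -1


def overflowed(result: float) -> bool:
    """A caller that only cares whether the value is infinite needs just isinf()."""
    return math.isinf(result)
```

Only code that must tell +Infinity from -Infinity, such as an output routine, needs the sign-reporting variant.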
      
      As things stand, we actually don't need is_infinite() anyplace except
      float4out/float8out, which means it could potentially go away altogether
      after the changes I proposed in <13178.1538794717@sss.pgh.pa.us>.
      0209f028
    • Tom Lane's avatar
      Propagate xactStartTimestamp and stmtStartTimestamp to parallel workers. · 07ee62ce
      Tom Lane authored
      Previously, a worker process would establish values for these based on
      its own start time.  In v10 and up, this can trivially be shown to cause
      misbehavior of transaction_timestamp(), timestamp_in(), and related
      functions which are (perhaps unwisely?) marked parallel-safe.  It seems
      likely that other behaviors might diverge from what happens in the parent
      as well.
      
      It's not as trivial to demonstrate problems in 9.6 or 9.5, but I'm sure
      it's still possible, so back-patch to all branches containing parallel
      worker infrastructure.
      
      In HEAD only, mark now() and statement_timestamp() as parallel-safe
      (other affected functions already were).  While in theory we could
      still squeeze that change into v11, it doesn't seem important enough
      to force a last-minute catversion bump.
      
      Konstantin Knizhnik, whacked around a bit by me
      
      Discussion: https://postgr.es/m/6406dbd2-5d37-4cb6-6eb2-9c44172c7e7c@postgrespro.ru
      07ee62ce
    • Dean Rasheed's avatar
      Improve the accuracy of floating point statistical aggregates. · e954a727
      Dean Rasheed authored
      When computing statistical aggregates like variance, the common
      schoolbook algorithm which computes the sum of the squares of the
      values and subtracts the square of the mean can lead to a large loss
      of precision when using floating point arithmetic, because the
      difference between the two terms is often very small relative to the
      terms themselves.
      
      To avoid this, re-work these aggregates to use the Youngs-Cramer
      algorithm, which is a proven, numerically stable algorithm that
      directly aggregates the sum of the squares of the differences of the
      values from the mean in a single pass over the data.
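
A minimal Python sketch of the two approaches follows. It is not the committed C code; the function names and the (N, Sx, Sxx) tuple layout are illustrative assumptions. The update and combine steps use the Youngs-Cramer recurrences, where Sxx accumulates the sum of squared differences from the mean directly.

```python
def naive_variance(values):
    """Schoolbook method: sum of squares minus square of the sum (cancellation-prone)."""
    n = len(values)
    sum_x = sum(values)
    sum_sq = sum(v * v for v in values)
    return (sum_sq - sum_x * sum_x / n) / (n - 1)


def yc_accum(state, x):
    """Add one value to a Youngs-Cramer state (N, Sx, Sxx)."""
    n, sx, sxx = state
    n += 1.0
    sx += x
    if n > 1.0:
        tmp = x * n - sx
        sxx += tmp * tmp / (n * (n - 1.0))
    return (n, sx, sxx)


def yc_combine(a, b):
    """Merge two partial states, as a parallel aggregate would."""
    n1, sx1, sxx1 = a
    n2, sx2, sxx2 = b
    if n1 == 0.0:
        return b
    if n2 == 0.0:
        return a
    n = n1 + n2
    tmp = sx1 / n1 - sx2 / n2
    return (n, sx1 + sx2, sxx1 + sxx2 + n1 * n2 * tmp * tmp / n)


def yc_variance(values):
    """Sample variance from a single pass over the data."""
    state = (0.0, 0.0, 0.0)
    for v in values:
        state = yc_accum(state, v)
    n, _, sxx = state
    return sxx / (n - 1.0)
```

For inputs such as 1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16 the squared terms in the schoolbook method exceed what a double can hold exactly, so its result drifts, while the Youngs-Cramer state only ever holds small differences.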
      
      While at it, improve the test coverage to test the aggregate combine
      functions used during parallel aggregation.
      
      Per report and suggested algorithm from Erich Schubert.
      
      Patch by me, reviewed by Madeleine Thompson.
      
      Discussion: https://postgr.es/m/153313051300.1397.9594490737341194671@wrigleys.postgresql.org
      e954a727
    • Michael Paquier's avatar
      Assign constraint name when cloning FK definition for partitions · 38921d14
      Michael Paquier authored
      This is for example used when attaching a partition to a partitioned
      table which includes foreign keys, and in this case the constraint name
      has been missing in the data cloned.  This could lead to hard crashes,
      as when validating the foreign key constraint, the constraint name is
      always expected.  Particularly, when using log_min_messages >= DEBUG1, a
      log message would be generated with this unassigned constraint name,
      leading to an assertion failure on HEAD.
      
      While on it, rename a variable in ATExecAttachPartition which was
      declared twice with the same name.
      
      Author: Michael Paquier
      Reviewed-by: Álvaro Herrera
      Discussion: https://postgr.es/m/20181005042236.GG1629@paquier.xyz
      Backpatch-through: 11
      38921d14
  4. 05 Oct, 2018 5 commits
    • Bruce Momjian's avatar
      doc: update PG 11 release notes · 6eb612fe
      Bruce Momjian authored
      Discussion: https://postgr.es/m/1f5b2e66-7ba8-98ec-c06a-aee9ff33f050@postgresql.org
      
      Author: Jonathan S. Katz
      
      Backpatch-through: 11
      6eb612fe
    • Tom Lane's avatar
      Allow btree comparison functions to return INT_MIN. · c87cb5f7
      Tom Lane authored
      Historically we forbade datatype-specific comparison functions from
      returning INT_MIN, so that it would be safe to invert the sort order
      just by negating the comparison result.  However, this was never
      really safe for comparison functions that directly return the result
      of memcmp(), strcmp(), etc, as POSIX doesn't place any such restriction
      on those library functions.  Buildfarm results show that at least on
      recent Linux on s390x, memcmp() actually does return INT_MIN sometimes,
      causing sort failures.
      
      The agreed-on answer is to remove this restriction and fix relevant
      call sites to not make such an assumption; code such as "res = -res"
      should be replaced by "INVERT_COMPARE_RESULT(res)".  The same is needed
      in a few places that just directly negated the result of memcmp or
      strcmp.
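
The hazard can be modelled in Python by simulating 32-bit two's-complement arithmetic. This is a sketch under assumptions: signed overflow is undefined behavior in C, and wrapping is merely the outcome commonly seen in practice; the inversion function shows one safe way to flip a comparison result and is not quoted from the INVERT_COMPARE_RESULT macro itself.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def negate_int32(value):
    """Model `-value` on a wrapping 32-bit int (typical, but not guaranteed, C behavior)."""
    wrapped = (-value) & 0xFFFFFFFF
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped


def invert_compare_result(res):
    """Flip the sign of a comparison result without ever negating INT_MIN."""
    return 1 if res < 0 else -res
```

Negating INT_MIN under this model yields INT_MIN again, so a descending sort would still see "less than"; mapping every negative result to 1 avoids the problem because callers only look at the sign.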
      
      To help find places having this problem, I've also added a compile option
      to nbtcompare.c that causes some of the commonly used comparators to
      return INT_MIN/INT_MAX instead of their usual -1/+1.  It'd likely be
      a good idea to have at least one buildfarm member running with
      "-DSTRESS_SORT_INT_MIN".  That's far from a complete test of course,
      but it should help to prevent fresh introductions of such bugs.
      
      This is a longstanding portability hazard, so back-patch to all supported
      branches.
      
      Discussion: https://postgr.es/m/20180928185215.ffoq2xrq5d3pafna@alap3.anarazel.de
      c87cb5f7
    • Tom Lane's avatar
      Ensure that PLPGSQL_DTYPE_ROW variables have valid refname fields. · 113a6599
      Tom Lane authored
      Without this, the syntax-tree-dumping functions in pl_funcs.c crash,
      and there are other places that might be at risk too.  Per report
      from Pavel Stehule.
      
      Looks like I broke this in commit f9263006, so back-patch to v11.
      
      Discussion: https://postgr.es/m/CAFj8pRA+3f5n4642q2g8BXCKjbTd7yU9JMYAgDyHgozk6cQ-VA@mail.gmail.com
      113a6599
    • Peter Eisentraut's avatar
      Remove redundant allocation · b5f03dc7
      Peter Eisentraut authored
      Author: Nikita Glukhov <n.gluhov@postgrespro.ru>
      b5f03dc7
    • Michael Paquier's avatar
      Add pg_ls_tmpdir function · 9cd92d1a
      Michael Paquier authored
      This lists the contents of a temporary directory associated with a given
      tablespace, useful to get information about on-disk consumption caused
      by temporary files used by a session query.  By default, pg_default is
      scanned, and a tablespace can be specified as argument.
      
      This function is intended to be used by monitoring tools, and, unlike
      pg_ls_dir(), access to it can be granted to non-superusers so that
      those monitoring tools can observe the principle of least privilege.
      Access is also given by default to members of pg_monitor.
      
      Author: Nathan Bossart
      Reviewed-by: Laurenz Albe
      Discussion: https://postgr.es/m/92F458A2-6459-44B8-A7F2-2ADD3225046A@amazon.com
      9cd92d1a
  5. 04 Oct, 2018 5 commits
  6. 03 Oct, 2018 8 commits
    • Andres Freund's avatar
      Ensure that snprintf.c's fmtint() doesn't overflow when printing INT64_MIN. · 4868e446
      Andres Freund authored
      This isn't actually a live bug, as the output happens to be the
      same.  But it upsets tools like UBSan, which makes it worthwhile to
      fix.
      
      As it's an issue without practical consequences, don't backpatch.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20180928001121.hhx5n6dsygqxr5wu@alap3.anarazel.de
      4868e446
    • Tom Lane's avatar
      Change executor to just Assert that table locks were already obtained. · 9a3cebea
      Tom Lane authored
      Instead of locking tables during executor startup, just Assert that
      suitable locks were obtained already during the parse/plan pipeline
      (or re-obtained by the plan cache).  This must be so, else we have a
      hazard that concurrent DDL has invalidated the plan.
      
      This is pretty inefficient as well as undercommented, but it's all going
      to go away shortly, so I didn't try hard.  This commit is just another
      attempt to use the buildfarm to see if we've missed anything in the plan
      to simplify the executor's table management.
      
      Note that the change needed here in relation_open() exposes that
      parallel workers now really are accessing tables without holding any
      lock of their own, whereas they were not doing that before this commit.
      This does not give me a warm fuzzy feeling about that aspect of parallel
      query; it does not seem like a good design, and we now know that it's
      had exactly no actual testing.  I think that we should modify parallel
      query so that that change can be reverted.
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
      9a3cebea
    • Andres Freund's avatar
      Fix issues around EXPLAIN with JIT. · c03c1449
      Andres Freund authored
      I (Andres) was more than a bit hasty in committing 33001fd7
      after last minute changes, leading to a number of problems (jit output
      was only shown for JIT in parallel workers, and just EXPLAIN without
      ANALYZE didn't work).  Lukas luckily found these issues quickly.
      
      Instead of combining instrumentation in standard_ExecutorEnd(), do
      so on demand in the new ExplainPrintJITSummary().
      
      Also update a documentation example of the JIT output, changed in
      52050ad8.
      
      Author: Lukas Fittl, with minor changes by me
      Discussion: https://postgr.es/m/CAP53PkxmgJht69pabxBXJBM+0oc6kf3KHMborLP7H2ouJ0CCtQ@mail.gmail.com
      Backpatch: 11, where JIT compilation was introduced
      c03c1449
    • Tom Lane's avatar
      Rationalize snprintf.c's handling of "ll" formats. · 595a0eab
      Tom Lane authored
      Although all known platforms define "long long" as 64 bits, it still feels
      a bit shaky to be using "va_arg(args, int64)" to pull out an argument that
      the caller thought was declared "long long".  The reason it was coded like
      this, way back in commit 3311c766, was to work around the possibility that
      the compiler had no type named "long long" --- and, at the time, that it
      maybe didn't have 64-bit ints at all.  Now that we're requiring compilers
      to support C99, those concerns are moot.  Let's make the code clearer and
      more bulletproof by writing "long long" where we mean "long long".
      
      This does introduce a hazard that we'd inefficiently use 128-bit arithmetic
      to convert plain old integers.  The way to tackle that would be to provide
      two versions of fmtint(), one for "long long" and one for narrower types.
      Since, as of today, no platforms require that, we won't bother with the
      extra code for now.
      
      Discussion: https://postgr.es/m/1680.1538587115@sss.pgh.pa.us
      595a0eab
    • Tom Lane's avatar
      Provide fast path in snprintf.c for conversion specs that are just "%s". · 6d842be6
      Tom Lane authored
      This case occurs often enough (around 45% of conversion specs executed
      in our regression tests are just "%s") that it's worth an extra test
      per conversion spec to allow skipping all the logic associated with
      field widths and padding when it happens.
      
      Discussion: https://postgr.es/m/26193.1538582367@sss.pgh.pa.us
      6d842be6
    • Tom Lane's avatar
      Make assorted performance improvements in snprintf.c. · abd9ca37
      Tom Lane authored
      In combination, these changes make our version of snprintf as fast
      or faster than most platforms' native snprintf, except for cases
      involving floating-point conversion (which we still delegate to
      the native sprintf).  The speed penalty for a float conversion
      is down to around 10% though, much better than before.
      
      Notable changes:
      
      * Rather than always parsing the format twice to see if it contains
      instances of %n$, do the extra scan only if we actually find a $.
      This obviously wins for non-localized formats, and even when there
      is use of %n$, we can avoid scanning text before the first % twice.
      
      * Use strchrnul() if available to find the next %, and emit the
      literal text between % escapes as strings rather than char-by-char.
      
      * Create a bespoke function (dopr_outchmulti) for the common case
      of emitting N copies of the same character, in place of writing
      loops around dopr_outch.
      
      * Simplify construction of the format string for invocations of sprintf
      for floats.
      
      * Const-ify some internal functions, and avoid unnecessary use of
      pass-by-reference arguments.
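
The second item in the list, emitting literal text in runs rather than one character at a time, can be sketched as follows. This is an illustrative Python model, not the C implementation: split_format and its simplified set of conversion characters are assumptions, and str.find() plus a fallback to the end of the string stands in for strchrnul().

```python
CONVERSION_CHARS = "diouxXeEfgGcsp%"  # simplified set, assumed for this sketch


def split_format(fmt):
    """Split a printf-style format into literal runs and conversion specs."""
    pieces = []
    pos = 0
    while pos < len(fmt):
        nxt = fmt.find("%", pos)
        if nxt == -1:
            nxt = len(fmt)  # like strchrnul(): point at the terminator
        if nxt > pos:
            # Emit the whole literal run at once, not char-by-char.
            pieces.append(("literal", fmt[pos:nxt]))
        if nxt == len(fmt):
            break
        end = nxt + 1
        while end < len(fmt) and fmt[end] not in CONVERSION_CHARS:
            end += 1  # skip flags, width, precision, length modifiers
        pieces.append(("conv", fmt[nxt:end + 1]))
        pos = end + 1
    return pieces
```

For example, split_format("rows: %d, name: %s!") produces three literal runs and two conversion specs, so the output routine is called five times rather than once per character.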
      
      Patch by me, reviewed by Andres Freund
      
      Discussion: https://postgr.es/m/11787.1534530779@sss.pgh.pa.us
      abd9ca37
    • Amit Kapila's avatar
      MAXALIGN the target address where we store flattened value. · 9bc9f72b
      Amit Kapila authored
      The API (EOH_flatten_into) that flattens the expanded value representation
      expects the target address to be maxaligned.  All its usages adhere to that
      principle except when serializing datums for parallel query.  Fix that
      usage.
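
As a rough illustration of what maxaligning a target offset means, the sketch below rounds an offset up to the next multiple of the maximum alignment. The value 8 for MAXIMUM_ALIGNOF and the helper place_flattened are assumptions for this example; the real value is platform-dependent and the real fix lives in the datum serialization code.

```python
MAXIMUM_ALIGNOF = 8  # assumed; platform-dependent in practice


def maxalign(offset, alignment=MAXIMUM_ALIGNOF):
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def place_flattened(cursor, size):
    """Return (start, end) for a flattened value written at an aligned start."""
    start = maxalign(cursor)
    return start, start + size
```

A writer that has already emitted 13 bytes of header would therefore skip to offset 16 before handing the address to the flattening routine.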
      
      Diagnosed-by: Tom Lane
      Author: Tom Lane and Amit Kapila
      Backpatch-through: 9.6
      Discussion: https://postgr.es/m/11629.1536550032@sss.pgh.pa.us
      9bc9f72b
  7. 02 Oct, 2018 7 commits
    • Andrew Dunstan's avatar
      Don't build static libraries on Cygwin · a33245a8
      Andrew Dunstan authored
      Cygwin has been building and linking against static libraries. Although
      this is a bug, it has been relatively harmless until now, when it has caused
      errors due to changes in the way we build certain libraries. So this
      patch makes things work the way we always intended, namely that we would
      link against the dynamic libraries (cygpq.dll etc.) and just not build
      the static libraries. The downstream packagers have been doing this for
      some time, so this just aligns with their practice.
      
      Extracted from a patch by Marco Atzeri, with a suggestion from Tom Lane.
      
      Discussion: https://postgr.es/m/1056.1538235347@sss.pgh.pa.us
      a33245a8
    • Tom Lane's avatar
      Change rewriter/planner/executor/plancache to depend on RTE rellockmode. · 6e35939f
      Tom Lane authored
      Instead of recomputing the required lock levels in all these places,
      just use what commit fdba460a made the parser store in the RTE fields.
      This already simplifies the code measurably in these places, and
      follow-on changes will remove a bunch of no-longer-needed infrastructure.
      
      In a few cases, this change causes us to acquire a higher lock level
      than we did before.  This is OK primarily because said higher lock level
      should've been acquired already at query parse time; thus, we're saving
      a useless extra trip through the shared lock manager to acquire a lesser
      lock alongside the original lock.  The only known exception to this is
      that re-execution of a previously planned SELECT FOR UPDATE/SHARE query,
      for a table that uses ROW_MARK_REFERENCE or ROW_MARK_COPY methods, might
      have gotten only AccessShareLock before.  Now it will get RowShareLock
      like the first execution did, which seems fine.
      
      While there's more to do, push it in this state anyway, to let the
      buildfarm help verify that nothing bad happened.
      
      Amit Langote, reviewed by David Rowley and Jesper Pedersen,
      and whacked around a bit more by me
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
      6e35939f
    • Andres Freund's avatar
      Use slots more widely in tuple mapping code and make naming more consistent. · cc2905e9
      Andres Freund authored
      It's inefficient to use a single slot for mapping between tuple
      descriptors for multiple tuples, as previously done when using
      ConvertPartitionTupleSlot(), as that means the slot's tuple descriptors
      change for every tuple.
      
      Previously we also, via ConvertPartitionTupleSlot(), built new tuples
      after the mapping even in cases where we, immediately afterwards,
      access individual columns again.
      
      Refactor the code so one slot, on demand, is used for each
      partition. That avoids having to change the descriptor (and allows use of
      the more efficient "fixed" tuple slots). Then use slot->slot
      mapping, to avoid unnecessarily forming a tuple.
      
      As the naming between the tuple and slot mapping functions wasn't
      consistent, rename them to execute_attr_map_{tuple,slot}.  It's likely
      that we'll also rename convert_tuples_by_* to denote that these
      functions "only" build a map, but that's left for later.
      
      Author: Amit Khandekar and Amit Langote, editorialized by me
      Reviewed-By: Amit Langote, Amit Khandekar, Andres Freund
      Discussion:
          https://postgr.es/m/CAJ3gD9fR0wRNeAE8VqffNTyONS_UfFPRpqxhnD9Q42vZB+Jvpg@mail.gmail.com
          https://postgr.es/m/e4f9d743-cd4b-efb0-7574-da21d86a7f36%40lab.ntt.co.jp
      Backpatch: -
      cc2905e9
    • Tom Lane's avatar
      Set snprintf.c's maximum number of NL arguments to be 31. · 625b38ea
      Tom Lane authored
      Previously, we used the platform's NL_ARGMAX if any, otherwise 16.
      The trouble with this is that the platform value is hugely variable,
      ranging from the POSIX-minimum 9 to as much as 64K on recent FreeBSD.
      Values of more than a dozen or two have no practical use and slow down
      the initialization of the argtypes array.  Worse, they cause snprintf.c
      to consume far more stack space than was the design intention, possibly
      resulting in stack-overflow crashes.
      
      Standardize on 31, which is comfortably more than we need (it looks like
      no existing translatable message has more than about 10 parameters).
      I chose that, not 32, to make the array sizes powers of 2, for some
      possible small gain in speed of the memset.
      
      The lack of reported crashes suggests that the set of platforms we
      use snprintf.c on (in released branches) may have no overlap with
      the set where NL_ARGMAX has unreasonably large values.  But that's
      not entirely clear, so back-patch to all supported branches.
      
      Per report from Mateusz Guzik (via Thomas Munro).
      
      Discussion: https://postgr.es/m/CAEepm=3VF=PUp2f8gU8fgZB22yPE_KBS0+e1AHAtQ=09schTHg@mail.gmail.com
      625b38ea
    • Tom Lane's avatar
      Fix corner-case failures in has_foo_privilege() family of functions. · 3d0f68dd
      Tom Lane authored
      The variants of these functions that take numeric inputs (OIDs or
      column numbers) are supposed to return NULL rather than failing
      on bad input; this rule reduces problems with snapshot skew when
      queries apply the functions to all rows of a catalog.
      
      has_column_privilege() had careless handling of the case where the
      table OID didn't exist.  You might get something like this:
      	select has_column_privilege(9999,'nosuchcol','select');
      	ERROR:  column "nosuchcol" of relation "(null)" does not exist
      or you might get a crash, depending on the platform's printf's response
      to a null string pointer.
      
      In addition, while applying the column-number variant to a dropped
      column returned NULL as desired, applying the column-name variant
      did not:
      	select has_column_privilege('mytable','........pg.dropped.2........','select');
      	ERROR:  column "........pg.dropped.2........" of relation "mytable" does not exist
      It seems better to make this case return NULL as well.
      
      Also, the OID-accepting variants of has_foreign_data_wrapper_privilege,
      has_server_privilege, and has_tablespace_privilege didn't follow the
      principle of returning NULL for nonexistent OIDs.  Superusers got TRUE,
      everybody else got an error.
      
      Per investigation of Jaime Casanova's report of a new crash in HEAD.
      These behaviors have been like this for a long time, so back-patch to
      all supported branches.
      
      Patch by me; thanks to Stephen Frost for discussion and review
      
      Discussion: https://postgr.es/m/CAJGNTeP=-6Gyqq5TN9OvYEydi7Fv1oGyYj650LGTnW44oAzYCg@mail.gmail.com
      3d0f68dd
    • Michael Paquier's avatar
      Fix documentation of pgrowlocks using "lock_type" instead of "modes" · 80810ca6
      Michael Paquier authored
      The example used in the documentation is outdated as well.  This is an
      oversight from 0ac5ad51, which bumped up pgrowlocks but forgot some bits
      of the documentation.
      
      Reported-by: Chris Wilson
      Discussion: https://postgr.es/m/153838692816.2950.12001142346234155699@wrigleys.postgresql.org
      Backpatch-through: 9.3
      80810ca6
    • Amit Kapila's avatar
      Test passing expanded-value representations to workers. · 0fd6a8a7
      Amit Kapila authored
      Currently, we don't have an explicit test to pass expanded-value
      representations to workers, so we don't know whether it works on all kinds
      of platforms.  We suspect that the current code won't work on
      alignment-sensitive hardware.  This commit will test that aspect and can
      lead to failure on some of the buildfarm machines which we will fix in a
      later commit.
      
      Author: Tom Lane and Amit Kapila
      Discussion: https://postgr.es/m/11629.1536550032@sss.pgh.pa.us
      0fd6a8a7
  8. 01 Oct, 2018 2 commits