1. 29 Mar, 2019 11 commits
  2. 28 Mar, 2019 7 commits
    • Fix typo. · 7e69323b
      Thomas Munro authored
      Author: Masahiko Sawada
      7e69323b
    • Fix a few comment copy & pastos. · 46bcd2af
      Andres Freund authored
      46bcd2af
    • Fix deserialization of pg_mcv_list values · 62bf0fb3
      Tomas Vondra authored
      There were multiple issues in deserialization of pg_mcv_list values.
      
      Firstly, the data is loaded from syscache, but the deserialization was
      performed after ReleaseSysCache(), at which point the data might have
      already disappeared.  Fixed by moving the calls in statext_mcv_load,
      and using the same NULL-handling code as existing stats.
      
      Secondly, the deserialized representation used pointers into the
      serialized representation.  But that is also unsafe, because the data
      may disappear at any time.  Fixed by reworking and simplifying the
      deserialization code to always copy all the data.
      
      And thirdly, when deserializing values for types passed by value, the
      code simply did memcpy(d,s,typlen) which however does not work on
      bigendian machines.  Fixed by using fetch_att/store_att_byval.
      62bf0fb3
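The third issue above can be illustrated with a minimal sketch. It does not reproduce PostgreSQL's fetch_att/store_att_byval; the `Datum` typedef and both function names below are stand-ins chosen for illustration. The naive variant copies typlen bytes to the lowest address of the Datum, which on a big-endian machine lands in the most significant bytes; the width-aware variant reads a value of the right width and widens it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a pass-by-value Datum: a pointer-width integer. */
typedef uintptr_t Datum;

/* Naive variant: copy typlen raw bytes to the lowest address of the Datum.
 * On a little-endian machine this happens to give the right value; on a
 * big-endian machine the bytes end up in the high-order part of the Datum. */
Datum
store_memcpy(const void *src, int typlen)
{
    Datum d = 0;

    memcpy(&d, src, (size_t) typlen);
    return d;
}

/* Width-aware variant: read a value of exactly typlen bytes, then widen it.
 * The result is the same numeric value regardless of byte order. */
Datum
store_byval(const void *src, int typlen)
{
    Datum result = 0;

    switch (typlen)
    {
        case 1: { uint8_t v;  memcpy(&v, src, sizeof v); result = v; break; }
        case 2: { uint16_t v; memcpy(&v, src, sizeof v); result = v; break; }
        case 4: { uint32_t v; memcpy(&v, src, sizeof v); result = v; break; }
        default:
        {
            /* Only a full-width value is left as a valid by-value length. */
            assert(typlen == (int) sizeof(Datum));
            memcpy(&result, src, sizeof(Datum));
            break;
        }
    }
    return result;
}
```

With an 8-byte Datum on a big-endian machine, store_memcpy() applied to a 4-byte 42 would yield 42 shifted left by 32 bits, while store_byval() yields 42 on either byte order.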
    • doc: Fix typo · f3afbbda
      Peter Eisentraut authored
      f3afbbda
    • Use FullTransactionId for the transaction stack. · ad308058
      Thomas Munro authored
      Provide GetTopFullTransactionId() and GetCurrentFullTransactionId().
      The intended users of these interfaces are access methods that use
      xids for visibility checks but don't want to have to go back and
      "freeze" existing references some time later before the 32 bit xid
      counter wraps around.
      
      Use a new struct to serialize the transaction state for parallel
      query, because FullTransactionId doesn't fit into the previous
      serialization scheme very well.
      
      Author: Thomas Munro
      Reviewed-by: Heikki Linnakangas
      Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
      ad308058
    • Add basic infrastructure for 64 bit transaction IDs. · 2fc7af5e
      Thomas Munro authored
      Instead of inferring epoch progress from xids and checkpoints,
      introduce a 64 bit FullTransactionId type and use it to track xid
      generation.  This fixes an unlikely bug where the epoch is reported
      incorrectly if the range of active xids wraps around more than once
      between checkpoints.
      
      The only user-visible effect of this commit is to correct the epoch
      used by txid_current() and txid_status(), also visible with
      pg_controldata, in those rare circumstances.  It also creates some
      basic infrastructure so that later patches can use 64 bit
      transaction IDs in more places.
      
      The new type is a struct that we pass by value, as a form of strong
      typedef.  This prevents the sort of accidental confusion between
      TransactionId and FullTransactionId that would be possible if we
      were to use a plain old uint64.
      
      Author: Thomas Munro
      Reported-by: Amit Kapila
      Reviewed-by: Andres Freund, Tom Lane, Heikki Linnakangas
      Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
      2fc7af5e
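The "strong typedef" idea described above can be sketched as follows. This is an illustration only, not the definition added by the commit: the type and function names (`Xid32`, `FullXid`, `full_xid_*`) are hypothetical, and the sketch ignores any reserved xid values that a real counter would skip when wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32-bit transaction ID. */
typedef uint32_t Xid32;

/* Wrapping the 64-bit value in a struct makes it a distinct type: passing a
 * plain integer (or a Xid32) where a FullXid is expected fails to compile. */
typedef struct FullXid
{
    uint64_t value;
} FullXid;

/* Build a 64-bit ID with the epoch in the high 32 bits and the xid below. */
FullXid
full_xid_make(uint32_t epoch, Xid32 xid)
{
    FullXid f;

    f.value = ((uint64_t) epoch << 32) | xid;
    return f;
}

/* Extract the epoch (number of 32-bit wraparounds). */
uint32_t
full_xid_epoch(FullXid f)
{
    return (uint32_t) (f.value >> 32);
}

/* Extract the 32-bit xid (the low half). */
Xid32
full_xid_xid(FullXid f)
{
    return (Xid32) f.value;
}

/* Advance by one; the epoch increments by itself when the low half wraps. */
FullXid
full_xid_advance(FullXid f)
{
    assert(f.value != UINT64_MAX);
    f.value++;
    return f;
}
```

Because the epoch lives in the same 64-bit counter as the xid, it no longer has to be inferred from checkpoints, which is the property the commit message relies on.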
    • tableam: Support for an index build's initial table scan(s). · 2a96909a
      Andres Freund authored
      To support building indexes over tables of different AMs, the scans to
      do so need to be routed through the table AM.  While moving a fair
      amount of code, nearly all the changes are just moving code to below a
      callback.
      
      Currently the range-based interface wouldn't make much sense for
      non-block-based table AMs. But that seems acceptable for now.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
      2a96909a
  3. 27 Mar, 2019 14 commits
    • Fix vpath build · 12bb35fc
      Peter Eisentraut authored
      Skip doc/src/sgml/images/Makefile since the directory is not created.
      12bb35fc
    • doc: Add some images · ea55aec0
      Peter Eisentraut authored
      Add infrastructure for having images in the documentation, in SVG
      format.  Add two images to start with.  See the included README file
      for instructions.
      
      Author: Jürgen Purtz <juergen@purtz.de>
      Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
      Discussion: https://www.postgresql.org/message-id/flat/aaa54502-05c0-4ea5-9af8-770411a6bf4b@purtz.de
      ea55aec0
    • doc: Move htmlhelp output to subdirectory · 477422c9
      Peter Eisentraut authored
      This makes it behave more like the html output.  That will make some
      subsequent changes across all output formats easier.
      477422c9
    • Use Pandoc also for plain-text documentation output · 2488ea7a
      Peter Eisentraut authored
      The makefile rule for the (rarely used) plain-text output postgres.txt
      was still written to use lynx, but in 96b8b8b6, where the INSTALL file
      was switched to pandoc, the rest of the makefile support for lynx was
      removed, so this was broken.  Rewrite the rule to also use pandoc for
      postgres.txt.
      2488ea7a
    • Minor improvements for the multivariate MCV lists · a63b29a1
      Tomas Vondra authored
      The MCV build should always call get_mincount_for_mcv_list(), as
      there is no other logic to decide whether the MCV list represents all
      the data. So just remove the (ngroups > nitems) condition.
      
      Also, when building MCV lists, the number of items was limited by the
      statistics target (i.e. up to 10000). But when deserializing the MCV
      list, a different value (8192) was used to check the input, causing
      an error.  Simply ensure that the same value is used in both places.
      
      This should have been included in 7300a699, but I forgot to include it
      in that commit.
      a63b29a1
    • Add support for multivariate MCV lists · 7300a699
      Tomas Vondra authored
      Introduce a third extended statistic type, supported by the CREATE
      STATISTICS command - MCV lists, a generalization of the statistic
      already built and used for individual columns.
      
      Compared to the already supported types (n-distinct coefficients and
      functional dependencies), MCV lists are more complex, include column
      values and allow estimation of a much wider range of common clauses
      (equality and inequality conditions, IS NULL, IS NOT NULL etc.).
      Similarly to the other types, a new pseudo-type (pg_mcv_list) is used.
      
      Author: Tomas Vondra
      Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera
      Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
      7300a699
    • Avoid passing query tlist around separately from root->processed_tlist. · 333ed246
      Tom Lane authored
      In the dim past, the planner kept the fully-processed version of the query
      targetlist (the result of preprocess_targetlist) in grouping_planner's
      local variable "tlist", and only grudgingly passed it to individual other
      routines as needed.  Later we discovered a need to still have it available
      after grouping_planner finishes, and invented the root->processed_tlist
      field for that purpose, but it wasn't used internally to grouping_planner;
      the tlist was still being passed around separately in the same places as
      before.
      
      Now comes a proposed patch to allow appendrel expansion to add entries
      to the processed tlist, well after preprocess_targetlist has finished
      its work.  To avoid having to pass around the tlist explicitly, it's
      proposed to allow appendrel expansion to modify root->processed_tlist.
      That makes aliasing the tlist with assorted parameters and local
      variables really scary.  It would accidentally work as long as the
      tlist is initially nonempty, because then the List header won't move
      around, but it's not exactly hard to think of ways for that to break.
      Aliased values are poor programming practice anyway.
      
      Hence, get rid of local variables and parameters that can be identified
      with root->processed_tlist, in favor of just using that field directly.
      And adjust comments to match.  (Some of the new comments speak as though
      it's already possible for appendrel expansion to modify the tlist; that's
      not true yet, but will happen in a later patch.)
      
      Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
      333ed246
    • pgbench: doExecuteCommand -> executeMetaCommand · 9938d116
      Alvaro Herrera authored
      The new function is only in charge of meta commands, not SQL commands.
      This change makes the code a little clearer: now all the state changes
      are effected by advanceConnectionState.  It also removes one indent
      level, which makes the diff look bulkier than it really is.
      
      Author: Fabien Coelho
      Reviewed-by: Kirk Jamison
      Discussion: https://postgr.es/m/alpine.DEB.2.21.1811240904500.12627@lancre
      9938d116
    • Suppress uninitialized-variable warning. · a51cc7e9
      Tom Lane authored
      Apparently Andres' compiler is smart enough to see that hpage
      must be initialized before use ... but mine isn't.
      a51cc7e9
    • Improve error handling of column references in expression transformation · ecfed4a1
      Michael Paquier authored
      Column references are not allowed in default expressions and partition
      bound expressions, and are restricted as such once the transformation of
      their expressions is done.  However, trying to use more complex column
      references can lead to confusing error messages.  For example, trying to
      use a two-field column reference name for default expressions and
      partition bounds leads to "missing FROM-clause entry for table", which
      makes no sense in their respective context.
      
      In order to make the errors generated more useful, this commit adds more
      verbose messages when transforming column references depending on the
      context.  This has a little consequence though: for example an
      expression using an aggregate with a column reference as argument would
      cause an error to be generated for the column reference, while the
      aggregate was the problem reported before this commit because column
      references get transformed first.
      
      The confusion has existed for default expressions for a long time, and
      the problem is new as of v12 for partition bounds.  Still, given the
      lack of complaints on the matter, no backpatch is done.
      
      The patch has been written by Amit Langote and me, and Tom Lane has
      provided the improvement of the documentation for default expressions on
      the CREATE TABLE page.
      
      Author: Amit Langote, Michael Paquier
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/20190326020853.GM2558@paquier.xyz
      ecfed4a1
    • Fix off-by-one error in txid_status(). · d2fd7f74
      Thomas Munro authored
      The transaction ID returned by GetNextXidAndEpoch() is in the future,
      so we can't attempt to access its status or we might try to read a
      CLOG page that doesn't exist.  The > vs >= confusion probably stemmed
      from the choice of a variable name containing the word "last" instead
      of "next", so fix that too.
      
      Back-patch to 10 where the function arrived.
      
      Author: Thomas Munro
      Discussion: https://postgr.es/m/CA%2BhUKG%2Buua_BV5cyfsioKVN2d61Lukg28ECsWTXKvh%3DBtN2DPA%40mail.gmail.com
      d2fd7f74
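The > versus >= distinction described above can be sketched in a few lines. This is a simplified illustration, not the patched function: `xid_is_in_future` is a hypothetical name, and the sketch compares plain 64-bit counters without any wraparound handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* next_xid is the ID that will be assigned next, so it has not been used yet.
 * Any xid at or beyond it is "in the future" and has no status to look up.
 * Writing "xid > next_xid" instead would wrongly accept xid == next_xid. */
bool
xid_is_in_future(uint64_t xid, uint64_t next_xid)
{
    return xid >= next_xid;
}
```

Naming the bound "next" rather than "last" makes the inclusive comparison the obvious one, which is the renaming the commit message mentions.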
    • Switch some palloc/memset calls to palloc0 · 1983af8e
      Michael Paquier authored
      Some code paths have been doing some allocations followed by an
      immediate memset() to initialize the allocated area with zeros.  This
      is a bit overkill, as there are already interfaces to do both things
      in one call.
      
      Author: Daniel Gustafsson
      Reviewed-by: Michael Paquier
      Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se
      1983af8e
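The pattern being replaced can be sketched outside PostgreSQL with the C standard library, using malloc()/memset() and calloc() as stand-ins for palloc()/memset() and palloc0(). The `Counter` struct and both function names are hypothetical, and the standard-library calls can return NULL, which the sketch checks explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure that must start out zeroed. */
typedef struct Counter
{
    int hits;
    int misses;
} Counter;

/* Before: allocate, then zero the area in a second step. */
Counter *
counter_new_two_step(void)
{
    Counter *c = malloc(sizeof(Counter));

    if (c != NULL)
        memset(c, 0, sizeof(Counter));
    return c;
}

/* After: a single call that returns zero-initialized memory. */
Counter *
counter_new_zeroed(void)
{
    return calloc(1, sizeof(Counter));
}
```

Both functions return the same result; the second simply cannot forget the zeroing step or pass a mismatched size to it.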
    • Switch function current_schema[s]() to be parallel-unsafe · 5bde1651
      Michael Paquier authored
      When invoked for the first time in a session, current_schema() and
      current_schemas() can finish by creating a temporary schema.  Currently
      those functions are parallel-safe; however, if for one reason or another
      they get launched across multiple parallel workers, they would fail when
      attempting to create a temporary schema, as temporary contexts are not
      supported in this case.
      
      The original issue has been spotted by buildfarm members crake and
      lapwing, after commit c5660e0a has introduced the first regression tests
      based on current_schema() in the tree.  After that, 396676b0 has
      introduced a workaround to avoid parallel plans but that was not
      completely right either.
      
      Catversion is bumped.
      
      Author: Michael Paquier
      Reviewed-by: Daniel Gustafsson
      Discussion: https://postgr.es/m/20190118024618.GF1883@paquier.xyz
      5bde1651
    • Track unowned relations in doubly-linked list · 6ca015f9
      Tomas Vondra authored
      Relations dropped in a single transaction are tracked in a list of
      unowned relations.  With a large number of dropped relations this
      resulted in poor performance at the end of a transaction, when the
      relations are removed from the singly linked list one by one.
      
      Commit b4166911 attempted to address this issue (particularly when it
      happens during recovery) by removing the relations in a reverse order,
      resulting in O(1) lookups in the list of unowned relations.  This did
      not work reliably, though, and it was possible to trigger the O(N^2)
      behavior in various ways.
      
      Instead of trying to remove the relations in a specific order with
      respect to the linked list, which seems rather fragile, switch to a
      regular doubly linked list.  That allows us to remove relations cheaply
      no matter where in the list they are.
      
      As b4166911 was a bugfix, backpatched to all supported versions, do the
      same thing here.
      
      Reviewed-by: Alvaro Herrera
      Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com
      Backpatch-through: 9.4
      6ca015f9
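Why a doubly linked list removes the ordering dependency can be shown with a minimal sketch. It is a generic circular list with a sentinel head, not the list implementation used by the commit; the `Node` and `List` types and the `list_*` functions are hypothetical names. Deleting a node only touches its two neighbours, so the cost is constant wherever the node sits, whereas a singly linked list must first walk from the head to find the predecessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A list node embedded in (or pointing at) the tracked object. */
typedef struct Node
{
    struct Node *prev;
    struct Node *next;
} Node;

/* Circular list with a sentinel head: an empty list points at itself. */
typedef struct List
{
    Node head;
} List;

void
list_init(List *l)
{
    l->head.prev = &l->head;
    l->head.next = &l->head;
}

/* Append a node at the tail in constant time. */
void
list_push_tail(List *l, Node *n)
{
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
}

/* Unlink a node in constant time, wherever it is in the list. */
void
list_delete(Node *n)
{
    assert(n->prev != NULL && n->next != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = NULL;
    n->next = NULL;
}

bool
list_is_empty(const List *l)
{
    return l->head.next == &l->head;
}
```

Removing N nodes in any order is therefore O(N) overall, instead of depending on a removal order that keeps each lookup at the front of the list.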
  4. 26 Mar, 2019 8 commits
    • Compute XID horizon for page level index vacuum on primary. · 558a9165
      Andres Freund authored
      Previously the xid horizon was only computed during WAL replay. That
      had two major problems:
      1) It relied on knowing what the table pointed to looks like. That was
         easy enough before the introduction of tableam (we knew it had to be
         heap, although some trickery around logging the heap relfilenodes
         was required). But to properly handle table AMs we need
         per-database catalog access to look up the AM handler, which
         recovery doesn't allow.
      2) Not knowing the xid horizon also makes it hard to support logical
         decoding on standbys. When on a catalog table, we need to be able
         to conflict with slots that have an xid horizon that's too old.
         However, computing the horizon by visiting the heap only works once
         consistency is reached, whereas we always need to be able to detect
         conflicts.
      
      There's also a secondary problem, in that the current method performs
      redundant work on every standby. But that's counterbalanced by
      potentially computing the value when not necessary (either because
      there's no standby, or because there are no connected backends).
      
      Solve 1) and 2) by moving computation of the xid horizon to the
      primary and by involving tableam in the computation of the horizon.
      
      To address the potentially increased overhead, increase the efficiency
      of the xid horizon computation for heap by sorting the tids, and
      eliminating redundant buffer accesses. When prefetching is available,
      additionally perform prefetching of buffers.  As this is more of a
      maintenance task, rather than something routinely done in every read
      only query, we add an arbitrary 10 to the effective concurrency -
      thereby using IO concurrency, when not globally enabled.  That's
      possibly not the perfect formula, but seems good enough for now.
      
      Bumps WAL format, as latestRemovedXid is now part of the records, and
      the heap's relfilenode isn't anymore.
      
      Author: Andres Freund, Amit Khandekar, Robert Haas
      Reviewed-By: Robert Haas
      Discussion:
          https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
          https://postgr.es/m/20181214014235.dal5ogljs3bmlq44@alap3.anarazel.de
          https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
      558a9165
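The "sort the tids and eliminate redundant buffer accesses" step can be sketched as below. This is an assumption-laden illustration rather than the heap code from the commit: the `Tid` struct, the comparator, and `count_block_visits` are hypothetical, and a "visit" here is just a counter standing in for reading a buffer. Once the tids are sorted, all entries for one block are adjacent, so each block needs to be read only once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tuple identifier: block number plus offset within the block. */
typedef struct Tid
{
    uint32_t block;
    uint16_t offset;
} Tid;

/* Order by block first, then by offset within the block. */
static int
tid_cmp(const void *a, const void *b)
{
    const Tid *x = a;
    const Tid *y = b;

    if (x->block != y->block)
        return (x->block < y->block) ? -1 : 1;
    if (x->offset != y->offset)
        return (x->offset < y->offset) ? -1 : 1;
    return 0;
}

/* Sort the tids in place and return how many distinct blocks must be read.
 * Without sorting, a caller might re-read a block once per tid. */
size_t
count_block_visits(Tid *tids, size_t n)
{
    size_t visits = 0;

    qsort(tids, n, sizeof(Tid), tid_cmp);
    for (size_t i = 0; i < n; i++)
    {
        if (i == 0 || tids[i].block != tids[i - 1].block)
            visits++;           /* first tid of a new block: one buffer access */
    }
    return visits;
}
```

The sorted order also yields a predictable sequence of upcoming blocks, which is what makes the prefetching mentioned in the commit message possible.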
    • Fix partitioned index creation bug with dropped columns · 126d6312
      Alvaro Herrera authored
      ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the
      index is defined contains more dropped columns than its partition, with
      this message:
        ERROR:  incorrect attribute map
      The cause was that one caller of CompareIndexInfo was passing the number
      of attributes of the partition rather than the parent, which confused
      the length check.  Repair.
      
      This can cause pg_upgrade to fail when used on such a database.  Leave
      some more objects around after regression tests, so that the case is
      detected by pg_upgrade test suite.
      
      Remove some spurious empty lines noticed while looking for other cases
      of the same problem.
      
      Discussion: https://postgr.es/m/20190326213924.GA2322@alvherre.pgsql
      126d6312
    • Build "other rels" of appendrel baserels in a separate step. · 53bcf5e3
      Tom Lane authored
      Up to now, otherrel RelOptInfos were built at the same time as baserel
      RelOptInfos, thanks to recursion in build_simple_rel().  However,
      nothing in query_planner's preprocessing cares at all about otherrels,
      only baserels, so we don't really need to build them until just before
      we enter make_one_rel.  This has two benefits:
      
      * create_lateral_join_info did a lot of extra work to propagate
      lateral-reference information from parents to the correct children.
      But if we delay creation of the children till after that, it's
      trivial (and much harder to break, too).
      
      * Since we have all the restriction quals correctly assigned to
      parent appendrels by this point, it'll be possible to do plan-time
      pruning and never make child RelOptInfos at all for partitions that
      can be pruned away.  That's not done here, but will be later on.
      
      Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen,
      Yoshikazu Imai, and David Rowley
      
      Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
      53bcf5e3
    • Add ORDER BY to more ICU regression test cases. · 8994cc6f
      Tom Lane authored
      Commit c77e1220 didn't fully fix the problem.  Per buildfarm
      and local testing.
      8994cc6f
    • Fix oversight in data-type change for autovacuum_vacuum_cost_delay. · 7c366ac9
      Tom Lane authored
      Commit caf626b2 missed that the relevant reloptions entry needs
      to be moved from the intRelOpts[] array to realRelOpts[].
      Somewhat surprisingly, it seems to work anyway, perhaps because
      the desired default and limit values are all integers.  We ought
      to have either a simpler data structure or better cross-checking
      here, but that's for another patch.
      
      Nikolay Shaplov
      
      Discussion: https://postgr.es/m/4861742.12LTaSB3sv@x200m
      7c366ac9
    • psql: Schema-qualify typecast in one \d query · 1d21ba8a
      Alvaro Herrera authored
      Bug introduced in my commit bc87f22e
      1d21ba8a
    • Get rid of duplicate child RTE for a partitioned table. · e8d5dd6b
      Tom Lane authored
      We've been creating duplicate RTEs for partitioned tables just
      because we do so for regular inheritance parent tables.  But unlike
      regular-inheritance parents which are themselves regular tables
      and thus need to be scanned, partitioned tables don't need the
      extra RTE.
      
      This makes the conditions for building a child RTE the same as those
      for building an AppendRelInfo, allowing minor simplification in
      expand_single_inheritance_child.  Since the planner's actual processing
      is driven off the AppendRelInfo list, nothing much changes beyond that;
      we just have one fewer useless RTE entry.
      
      Amit Langote, reviewed and hacked a bit by me
      
      Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
      e8d5dd6b
    • Improve psql's \d display of foreign key constraints · 1af25ca0
      Alvaro Herrera authored
      When used on a partition containing foreign keys coming from one of its
      ancestors, \d would (rather unhelpfully) print the details about the
      pg_constraint row in the partition.  This becomes a bit frustrating when
      the user tries things like dropping the FK in the partition; instead,
      show the details for the foreign key on the table where it is defined.
      
      Also, when a table is referenced by a foreign key on a partitioned
      table, we would show multiple "Referenced by" lines, one for each
      partition, which gets unwieldy pretty fast.  Modify that so that it
      shows only one line for the ancestor partitioned table where the FK is
      defined.
      
      Discussion: https://postgr.es/m/20181204143834.ym6euxxxi5aeqdpn@alvherre.pgsql
      Reviewed-by: Tom Lane, Amit Langote, Peter Eisentraut
      1af25ca0