1. 31 Jul, 2021 1 commit
  2. 18 Jun, 2021 1 commit
    • Centralize the logic for protective copying of utility statements. · 7c337b6b
      Tom Lane authored
      In the "simple Query" code path, it's fine for parse analysis or
      execution of a utility statement to scribble on the statement's node
      tree, since that'll just be thrown away afterwards.  However it's
      not fine if the node tree is in the plan cache, as then it'd be
      corrupted for subsequent executions.  Up to now we've dealt with
      that by having individual utility-statement functions apply
      copyObject() if they were going to modify the tree.  But that's
      prone to errors of omission.  Bug #17053 from Charles Samborski
      shows that CREATE/ALTER DOMAIN didn't get this memo, and can
      crash if executed repeatedly from plan cache.
      
      In the back branches, we'll just apply a narrow band-aid for that,
      but in HEAD it seems prudent to have a more principled fix that
      will close off the possibility of other similar bugs in future.
      Hence, let's hoist the responsibility for doing copyObject up into
      ProcessUtility from its children, thus ensuring that it happens for
      all utility statement types.
      
      Also, modify ProcessUtility's API so that its callers can tell it
      whether a copy step is necessary.  It turns out that in all cases,
      the immediate caller knows whether the node tree is transient, so
      this doesn't involve a huge amount of code thrashing.  In this way,
      while we lose a little bit in the execute-from-cache code path due
      to sometimes copying node trees that wouldn't be mutated anyway,
      we gain something in the simple-Query code path by not copying
      throwaway node trees.  Statements that are complex enough to be
      expensive to copy are almost certainly ones that would have to be
      copied anyway, so the loss in the cache code path shouldn't be much.
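
      A minimal standalone sketch of the pattern (an illustrative toy, not the
      actual backend code): the central entry point performs the protective
      copy exactly when the caller says the tree is shared, so the individual
      handlers are free to scribble.

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      typedef struct Stmt { char payload[32]; } Stmt;

      static Stmt *copy_stmt(const Stmt *s)    /* stand-in for copyObject() */
      {
          Stmt *c = malloc(sizeof(Stmt));
          memcpy(c, s, sizeof(Stmt));
          return c;
      }

      /* Stand-in for ProcessUtility(): readOnlyTree says the tree is shared
       * (e.g. owned by the plan cache), so copy it once, up front. */
      static void process_utility(Stmt *stmt, int readOnlyTree)
      {
          if (readOnlyTree)
              stmt = copy_stmt(stmt);          /* protect the cached tree */
          strcat(stmt->payload, "!");          /* handlers may now scribble */
          printf("executed: %s\n", stmt->payload);
          if (readOnlyTree)
              free(stmt);
      }

      int main(void)
      {
          Stmt cached = {"CREATE DOMAIN d int"};

          process_utility(&cached, 1);         /* from plan cache: copied */
          process_utility(&cached, 1);         /* cached tree is unscathed */
          /* Without the up-front copy, the second run would see the first
           * run's scribbling -- the bug #17053 failure mode. */
          return 0;
      }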
      
      (Note that this whole problem applies only to utility statements.
      Optimizable statements don't have the issue because we long ago made
      the executor treat Plan trees as read-only.  Perhaps someday we will
      make utility statement execution act likewise, but I'm not holding
      my breath.)
      
      Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
      Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
  3. 21 May, 2021 1 commit
    • Restore the portal-level snapshot after procedure COMMIT/ROLLBACK. · 84f5c290
      Tom Lane authored
      COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
      The original implementation of intra-procedure transactions just
      cavalierly did that, ignoring the fact that this left us executing in
      a rather different environment than normal.  In particular, it turns
      out that handling of toasted datums depends rather critically on there
      being an outer ActiveSnapshot: otherwise, when SPI or the core
      executor pop whatever snapshot they used and return, it's unsafe to
      dereference any toasted datums that may appear in the query result.
      It's possible to demonstrate "no known snapshots" and "missing chunk
      number N for toast value" errors as a result of this oversight.
      
      Historically this outer snapshot has been held by the Portal code,
      and that seems like a good plan to preserve.  So add infrastructure
      to pquery.c to allow re-establishing the Portal-owned snapshot if it's
      not there anymore, and add enough bookkeeping support that we can tell
      whether it is or not.
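
      A hedged sketch of that infrastructure, assuming the helper is named
      EnsurePortalSnapshotExists() and that Portal gained a field recording
      the snapshot it owns (names taken from this description, not verified
      against the tree):

      /* Hedged sketch, not the verbatim backend code: re-establish the
       * Portal-owned snapshot if an intra-procedure COMMIT/ROLLBACK
       * destroyed it. */
      void
      EnsurePortalSnapshotExists(void)
      {
          Portal portal = ActivePortal;

          if (ActiveSnapshotSet())
              return;             /* some snapshot is already active */
          if (portal == NULL)
              elog(ERROR, "cannot execute SQL without an outer snapshot or portal");
          /* take a fresh snapshot and record that this portal owns it */
          PushActiveSnapshot(GetTransactionSnapshot());
          portal->portalSnapshot = GetActiveSnapshot();   /* assumed field */
      }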
      
      We can't, however, just re-establish the Portal snapshot as part of
      COMMIT/ROLLBACK.  As in normal transaction start, acquiring the first
      snapshot should wait until after SET and LOCK commands.  Hence, teach
      spi.c about doing this at the right time.  (Note that this patch
      doesn't fix the problem for any PLs that try to run intra-procedure
      transactions without using SPI to execute SQL commands.)
      
      This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
      rename that to allow_nonatomic.
      
      replication/logical/worker.c also needs some fixes, because it wasn't
      careful to hold a snapshot open around AFTER trigger execution.
      That code doesn't use a Portal, which I suspect someday we're gonna
      have to fix.  But for now, just rearrange the order of operations.
      This includes back-patching the recent addition of finish_estate()
      to centralize the cleanup logic there.
      
      This also back-patches commit 2ecfeda3 into v13, to improve the
      test coverage for worker.c (it was that test that exposed that
      worker.c's snapshot management is wrong).
      
      Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
      intra-procedure COMMIT was added.
      
      Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
  4. 04 Feb, 2021 1 commit
    • Avoid crash when rolling back within a prepared statement. · 9624321e
      Tom Lane authored
      If a portal is used to run a prepared CALL or DO statement that
      contains a ROLLBACK, PortalRunMulti fails because the portal's
      statement list gets cleared by the rollback.  (Since the grammar
      doesn't allow CALL/DO in PREPARE, the only easy way to get to this is
      via extended query protocol, which treats all inputs as prepared
      statements.)  It's difficult to avoid resetting the portal early
      because of resource-management issues, so work around this by teaching
      PortalRunMulti to be wary of portal->stmts having suddenly become NIL.
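
      The workaround amounts to one defensive check; a hedged fragment of the
      idea (not the exact diff):

      /* A ROLLBACK executed by the portal's own statement may have reset the
       * portal, leaving portal->stmts == NIL.  Treat that as "no more work"
       * rather than walking a list the portal no longer has any claim on. */
      if (portal->stmts == NIL)
          return;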
      
      The crash has only been seen to occur in v13 and HEAD (as a
      consequence of commit 1cff1b95 having added an extra touch of
      portal->stmts).  But even before that, the code involved touching a
      List that the portal no longer has any claim on.  In the test case at
      hand, the List will still exist because of another refcount on the
      cached plan; but I'm far from convinced that it's impossible for the
      cached plan to have been dropped by the time control gets back to
      PortalRunMulti.  Hence, backpatch to v11 where nested transactions
      were added.
      
      Thomas Munro and Tom Lane, per bug #16811 from James Inform
      
      Discussion: https://postgr.es/m/16811-c1b599b2c6c2d622@postgresql.org
  5. 02 Jan, 2021 1 commit
  6. 12 Jun, 2020 1 commit
    • Avoid using a cursor in plpgsql's RETURN QUERY statement. · 2f48ede0
      Tom Lane authored
      plpgsql has always executed the query given in a RETURN QUERY command
      by opening it as a cursor and then fetching a few rows at a time,
      which it turns around and dumps into the function's result tuplestore.
      The point of this was to keep from blowing out memory with an oversized
      SPITupleTable result (note that while a tuplestore can spill tuples
      to disk, SPITupleTable cannot).  However, it's rather inefficient, both
      because of extra data copying and because of executor entry/exit
      overhead.  In recent versions, a new performance problem has emerged:
      use of a cursor prevents use of a parallel plan for the executed query.
      
      We can improve matters by skipping use of a cursor and having the
      executor push result tuples directly into the function's result
      tuplestore.  However, a moderate amount of new infrastructure is needed
      to make that idea work:
      
      * We can use the existing tstoreReceiver.c DestReceiver code to funnel
      executor output to the tuplestore, but it has to be extended to support
      plpgsql's requirement for possibly applying a tuple conversion map.
      
      * SPI needs to be extended to allow use of a caller-supplied
      DestReceiver instead of its usual receiver that puts tuples into
      a SPITupleTable.  Two new API calls are needed to handle both the
      RETURN QUERY and RETURN QUERY EXECUTE cases.
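
      For readers unfamiliar with the pattern being extended, here is a
      standalone toy analogue of a DestReceiver (illustrative only, not
      backend code): the producer pushes each row into caller-supplied
      callbacks, so the consumer decides where tuples land instead of
      accumulating them in an intermediate result table.

      #include <stdio.h>

      typedef struct Receiver
      {
          void (*startup)(struct Receiver *self);
          void (*receive)(struct Receiver *self, int row);
          void (*shutdown)(struct Receiver *self);
          int  count;             /* receiver-private state */
      } Receiver;

      static void ts_startup(Receiver *self) { self->count = 0; }

      static void ts_receive(Receiver *self, int row)
      {
          self->count++;          /* a real one would append to a tuplestore */
          printf("stored row %d\n", row);
      }

      static void ts_shutdown(Receiver *self)
      {
          printf("%d rows funneled directly to the destination\n", self->count);
      }

      static void run_query(Receiver *dest)    /* toy "executor" */
      {
          dest->startup(dest);
          for (int row = 1; row <= 3; row++)
              dest->receive(dest, row);
          dest->shutdown(dest);
      }

      int main(void)
      {
          Receiver tuplestore_dest = {ts_startup, ts_receive, ts_shutdown, 0};

          run_query(&tuplestore_dest);
          return 0;
      }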
      
      I also felt that I didn't want these new API calls to use the legacy
      method of specifying query parameter values with "char" null flags
      (the old ' '/'n' convention); rather they should accept ParamListInfo
      objects containing the parameter type and value info.  This required
      a bit of additional new infrastructure since we didn't yet have any
      parse analysis callback that would interpret $N parameter symbols
      according to type data supplied in a ParamListInfo.  There seems to be
      no harm in letting makeParamList install that callback by default,
      rather than leaving a new ParamListInfo's parserSetup hook as NULL.
      (Indeed, as of HEAD, I couldn't find anyplace that was using the
      parserSetup field at all; plpgsql was using parserSetupArg for its
      own purposes, but parserSetup seemed to be write-only.)
      
      We can actually get plpgsql out of the business of using legacy null
      flags altogether, and using ParamListInfo instead of its ad-hoc
      PreparedParamsData structure; but this requires inventing one more
      SPI API call that can replace SPI_cursor_open_with_args.  That seems
      worth doing, though.
      
      SPI_execute_with_args and SPI_cursor_open_with_args are now unused
      anywhere in the core PG distribution.  Perhaps someday we could
      deprecate/remove them.  But cleaning up the crufty bits of the SPI
      API is a task for a different patch.
      
      Per bug #16040 from Jeremy Smith.  This is unfortunately too invasive to
      consider back-patching.  Patch by me; thanks to Hamid Akhtar for review.
      
      Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
  7. 02 Mar, 2020 1 commit
    • Represent command completion tags as structs · 2f966131
      Alvaro Herrera authored
      The backend was using strings to represent command tags and doing string
      comparisons in multiple places, but that's slow and unhelpful.  Create a
      new command list with a supporting structure to use instead; this is
      stored in a tag-list-file that can be tailored to specific purposes with
      a caller-definable C macro, similar to what we do for WAL resource
      managers.  The first such uses are a new CommandTag enum and a
      CommandTagBehavior struct.
      
      Replace numerous occurrences of char *completionTag with a
      QueryCompletion struct so that the code no longer stores information
      about completed queries in a cstring.  Only at the last moment, in
      EndCommand(), does this get converted to a string.
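
      A hedged fragment of the new shape (the helper names are assumed from
      this description, not verified):

      /* Hedged sketch: completion info stays structured until the protocol
       * boundary, where EndCommand() renders it as a wire-format string. */
      QueryCompletion qc;

      SetQueryCompletion(&qc, CMDTAG_SELECT, nprocessed);
      /* ... later, at the protocol boundary: */
      EndCommand(&qc, dest, false);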
      
      EventTriggerCacheItem no longer holds an array of palloc’d tag strings
      in sorted order, but rather just a Bitmapset over the CommandTags.
      
      Author: Mark Dilger, with unsolicited help from Álvaro Herrera
      Reviewed-by: John Naylor, Tom Lane
      Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
  8. 01 Jan, 2020 1 commit
  9. 29 Jul, 2019 1 commit
  10. 15 Jul, 2019 1 commit
    • Represent Lists as expansible arrays, not chains of cons-cells. · 1cff1b95
      Tom Lane authored
      Originally, Postgres Lists were a more or less exact reimplementation of
      Lisp lists, which consist of chains of separately-allocated cons cells,
      each having a value and a next-cell link.  We'd hacked that once before
      (commit d0b4399d) to add a separate List header, but the data was still
      in cons cells.  That makes some operations -- notably list_nth() -- O(N),
      and it's bulky because of the next-cell pointers and per-cell palloc
      overhead, and it's very cache-unfriendly if the cons cells end up
      scattered around rather than being adjacent.
      
      In this rewrite, we still have List headers, but the data is in a
      resizable array of values, with no next-cell links.  Now we need at
      most two palloc's per List, and often only one, since we can allocate
      some values in the same palloc call as the List header.  (Of course,
      extending an existing List may require repalloc's to enlarge the array.
      But this involves just O(log N) allocations not O(N).)
      
      Of course this is not without downsides.  The key difficulty is that
      addition or deletion of a list entry may now cause other entries to
      move, which it did not before.
      
      For example, that breaks foreach() and sister macros, which historically
      used a pointer to the current cons-cell as loop state.  We can repair
      those macros transparently by making their actual loop state be an
      integer list index; the exposed "ListCell *" pointer is no longer state
      carried across loop iterations, but is just a derived value.  (In
      practice, modern compilers can optimize things back to having just one
      loop state value, at least for simple cases with inline loop bodies.)
      In principle, this is a semantics change for cases where the loop body
      inserts or deletes list entries ahead of the current loop index; but
      I found no such cases in the Postgres code.
      
      The change is not at all transparent for code that doesn't use foreach()
      but chases lists "by hand" using lnext().  The largest share of such
      code in the backend is in loops that were maintaining "prev" and "next"
      variables in addition to the current-cell pointer, in order to delete
      list cells efficiently using list_delete_cell().  However, we no longer
      need a previous-cell pointer to delete a list cell efficiently.  Keeping
      a next-cell pointer doesn't work, as explained above, but we can improve
      matters by changing such code to use a regular foreach() loop and then
      using the new macro foreach_delete_current() to delete the current cell.
      (This macro knows how to update the associated foreach loop's state so
      that no cells will be missed in the traversal.)
      
      There remains a nontrivial risk of code assuming that a ListCell *
      pointer will remain good over an operation that could now move the list
      contents.  To help catch such errors, list.c can be compiled with a new
      define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
      whenever that could possibly happen.  This makes list operations
      significantly more expensive so it's not normally turned on (though it
      is on by default if USE_VALGRIND is on).
      
      There are two notable API differences from the previous code:
      
      * lnext() now requires the List's header pointer in addition to the
      current cell's address.
      
      * list_delete_cell() no longer requires a previous-cell argument.
      
      These changes are somewhat unfortunate, but on the other hand code using
      either function needs inspection to see if it is assuming anything
      it shouldn't, so it's not all bad.
      
      Programmers should be aware of these significant performance changes:
      
      * list_nth() and related functions are now O(1); so there's no
      major access-speed difference between a list and an array.
      
      * Inserting or deleting a list element now takes time proportional to
      the distance to the end of the list, due to moving the array elements.
      (However, it typically *doesn't* require palloc or pfree, so except in
      long lists it's probably still faster than before.)  Notably, lcons()
      used to be about the same cost as lappend(), but that's no longer true
      if the list is long.  Code that uses lcons() and list_delete_first()
      to maintain a stack might usefully be rewritten to push and pop at the
      end of the list rather than the beginning.
      
      * There are now list_insert_nth...() and list_delete_nth...() functions
      that add or remove a list cell identified by index.  These have the
      data-movement penalty explained above, but there's no search penalty.
      
      * list_concat() and variants now copy the second list's data into
      storage belonging to the first list, so there is no longer any
      sharing of cells between the input lists.  The second argument is
      now declared "const List *" to reflect that it isn't changed.
      
      This patch just does the minimum needed to get the new implementation
      in place and fix bugs exposed by the regression tests.  As suggested
      by the foregoing, there's a fair amount of followup work remaining to
      do.
      
      Also, the ENABLE_LIST_COMPAT macros are finally removed in this
      commit.  Code using those should have been gone a dozen years ago.
      
      Patch by me; thanks to David Rowley, Jesper Pedersen, and others
      for review.
      
      Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
  11. 22 May, 2019 1 commit
  12. 02 Jan, 2019 1 commit
  13. 21 Nov, 2018 1 commit
    • Remove WITH OIDS support, change oid catalog column visibility. · 578b2297
      Andres Freund authored
      Previously tables declared WITH OIDS, including a significant fraction
      of the catalog tables, stored the oid column not as a normal column,
      but as part of the tuple header.
      
      This special column was not shown by default, which was somewhat odd,
      as it's often (consider e.g. pg_class.oid) one of the more important
      parts of a row.  Neither pg_dump nor COPY included the contents of the
      oid column by default.
      
      The fact that the oid column was not an ordinary column necessitated a
      significant amount of special case code to support oid columns. That
      was already painful for the existing code, and the upcoming work aiming
      to make table storage pluggable would have required expanding and
      duplicating that "specialness" significantly.
      
      WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
      Remove it.
      
      Removing includes:
      - CREATE TABLE and ALTER TABLE syntax for declaring the table to be
        WITH OIDS has been removed (WITH (oids[ = true]) will error out)
      - pg_dump does not support dumping tables declared WITH OIDS and will
        issue a warning when dumping one (and ignore the oid column).
      - restoring a pg_dump archive with pg_restore will warn when
        restoring a table with oid contents (and ignore the oid column)
      - COPY will refuse to load a binary dump that includes oids.
      - pg_upgrade will error out when encountering tables declared WITH
        OIDS; they have to be altered to remove the oid column first.
      - Functionality to access the oid of the last inserted row (like
        plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
      
      The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
      for CREATE TABLE) is still supported. While that requires a bit of
      support code, it seems unnecessary to break applications / dumps that
      do not use oids, and are explicit about not using them.
      
      The biggest user of WITH OID columns was postgres' catalog. This
      commit changes all 'magic' oid columns to be columns that are normally
      declared and stored. To reduce unnecessary query breakage all the
      newly added columns are still named 'oid', even if a table's column
      naming scheme would indicate 'reloid' or such.  This obviously
      requires adapting a lot of code, mostly replacing oid access via
      HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
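
      The mechanical shape of that replacement, as a short before/after
      fragment (grounded in the description above; pg_class chosen as the
      example catalog):

      /* Before: the oid lived in the tuple header */
      Oid reloid = HeapTupleGetOid(tuple);

      /* After: the oid is an ordinary column of the catalog's row struct */
      Oid reloid = ((Form_pg_class) GETSTRUCT(tuple))->oid;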
      
      The bootstrap process now assigns oids for all oid columns in
      genbki.pl that do not have an explicit value (starting at the largest
      oid previously used); only oids assigned later will be above
      FirstBootstrapObjectId.  As the oid column is now a normal column, the
      special bootstrap syntax for oids has been removed.
      
      Oids are not automatically assigned during insertion anymore, all
      backend code explicitly assigns oids with GetNewOidWithIndex(). For
      the rare case that insertions into the catalog via SQL are called for
      the new pg_nextoid() function can be used (which only works on catalog
      tables).
      
      The fact that oid columns on system tables are now normal columns
      means that they will be included in the set of columns expanded
      by * (i.e. SELECT * FROM pg_class will now include the table's oid,
      previously it did not).  It'd not technically be hard to hide the oid
      column by default, but that'd mean confusing behavior would either
      have to be carried forward forever, or it'd cause breakage down the
      line.
      
      While it's not unlikely that further adjustments are needed, the
      scope/invasiveness of the patch makes it worthwhile to merge this
      now.  It's painful to maintain externally, too complicated to commit
      after the code freeze, and a dependency of a number of other
      patches.
      
      Catversion bump, for obvious reasons.
      
      Author: Andres Freund, with contributions by John Naylor
      Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
  14. 16 Nov, 2018 1 commit
    • Introduce notion of different types of slots (without implementing them). · 1a0586de
      Andres Freund authored
      Upcoming work intends to allow pluggable ways of storing table data.
      Accessing those table access methods from the executor requires
      TupleTableSlots to carry tuples in the native format of such storage
      methods; otherwise there'll be a significant
      conversion overhead.
      
      Different access methods will require different data to store tuples
      efficiently (just like virtual, minimal, heap already require fields
      in TupleTableSlot). To allow that without requiring additional pointer
      indirections, we want to have different structs (embedding
      TupleTableSlot) for different types of slots.  Thus different types of
      slots are needed, which requires adapting creators of slots.
      
      The slot that most efficiently can represent a type of tuple in an
      executor node will often depend on the type of slot a child node
      uses. Therefore we need to track the type of slot returned by
      nodes, so parent nodes can create their slots based on that.
      
      Relatedly, JIT compilation of tuple deforming needs to know which type
      of slot a certain expression refers to, so it can create an
      appropriate deforming function for the type of tuple in the slot.
      
      But not all nodes will only return one type of slot, e.g. an append
      node will potentially return different types of slots for each of its
      subplans.
      
      Therefore add a function that allows querying the type of a node's
      result slot, and whether it'll always be the same type (whether it's
      fixed): ExecGetResultSlotOps().
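
      A hedged usage fragment (the exact signature is assumed from this
      description):

      /* Ask a child node what slot type it returns, and whether that type
       * is fixed, so this node can create a compatible slot up front. */
      bool        isfixed;
      const TupleTableSlotOps *ops =
          ExecGetResultSlotOps(outerPlanState(node), &isfixed);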
      
      The scan, result, inner, and outer slot types are automatically
      inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(), and the
      left/right subtrees, respectively. If that's not correct for a node,
      it can be overridden using new fields in PlanState.
      
      This commit does not introduce the actual abstracted implementations
      of the different kinds of TupleTableSlots; that is left for a followup
      commit.  The different types of slots introduced will, for now, still
      use the same backing implementation.
      
      While this already partially invalidates the big comment in
      tuptable.h, it seems to make more sense to update it later, when the
      different TupleTableSlot implementations actually exist.
      
      Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar
      Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
  15. 12 Apr, 2018 1 commit
    • Revert MERGE patch · 08ea7a22
      Simon Riggs authored
      This reverts commits d204ef63,
      83454e3c and a few more commits thereafter
      (complete list at the end) related to the MERGE feature.
      
      While the feature was fully functional, with sufficient test coverage and
      necessary documentation, it was felt that some parts of the executor and
      parse-analyzer can use a different design and it wasn't possible to do that in
      the available time. So it was decided to revert the patch for PG11 and retry
      again in the future.
      
      Thanks again to all reviewers and bug reporters.
      
      List of commits reverted, in reverse chronological order:
      
       f1464c53 Improve parse representation for MERGE
       ddb41585 MERGE syntax diagram correction
       530e69e5 Allow cpluspluscheck to pass by renaming variable
       01b88b4d MERGE minor errata
       3af7b2b0 MERGE fix variable warning in non-assert builds
       a5d86181 MERGE INSERT allows only one VALUES clause
       4b2d4403 MERGE post-commit review
       4923550c Tab completion for MERGE
       aa3faa3c WITH support in MERGE
       83454e3c New files for MERGE
       d204ef63 MERGE SQL Command following SQL:2016
      
      Author: Pavan Deolasee
      Reviewed-by: Michael Paquier
  16. 03 Apr, 2018 1 commit
    • MERGE SQL Command following SQL:2016 · d204ef63
      Simon Riggs authored
      MERGE performs actions that modify rows in the target table
      using a source table or query. MERGE provides a single SQL
      statement that can conditionally INSERT/UPDATE/DELETE rows, a task
      that would otherwise require multiple PL statements.
      e.g.
      
      MERGE INTO target AS t
      USING source AS s
      ON t.tid = s.sid
      WHEN MATCHED AND t.balance > s.delta THEN
        UPDATE SET balance = t.balance - s.delta
      WHEN MATCHED THEN
        DELETE
      WHEN NOT MATCHED AND s.delta > 0 THEN
        INSERT VALUES (s.sid, s.delta)
      WHEN NOT MATCHED THEN
        DO NOTHING;
      
      MERGE works with regular and partitioned tables, including
      column and row security enforcement, as well as support for
      row, statement and transition triggers.
      
      MERGE is optimized for OLTP and is parameterizable, though
      also useful for large scale ETL/ELT. MERGE is not intended
      to be used in preference to existing single SQL commands
      for INSERT, UPDATE or DELETE since there is some overhead.
      MERGE can be used statically from PL/pgSQL.
      
      MERGE does not yet support inheritance, write rules,
      RETURNING clauses, updatable views or foreign tables.
      MERGE follows SQL Standard per the most recent SQL:2016.
      
      Includes full tests and documentation, including full
      isolation tests to demonstrate the concurrent behavior.
      
      This version written from scratch in 2017 by Simon Riggs,
      using docs and tests originally written in 2009. Later work
      from Pavan Deolasee has been both complex and deep, leaving
      the lead author credit now in his hands.
      Extensive discussion of concurrency from Peter Geoghegan,
      with thanks for the time and effort contributed.
      
      Various issues reported via sqlsmith by Andreas Seltenreich
      
      Authors: Pavan Deolasee, Simon Riggs
      Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
      
      Discussion:
      https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
      https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
  17. 02 Apr, 2018 2 commits
  18. 09 Jan, 2018 2 commits
  19. 03 Jan, 2018 1 commit
  20. 08 Nov, 2017 1 commit
    • Change TRUE/FALSE to true/false · 2eb4a831
      Peter Eisentraut authored
      The lower case spellings are C and C++ standard and are used in most
      parts of the PostgreSQL sources.  The upper case spellings are only used
      in some files/modules.  So standardize on the standard spellings.
      
      The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
      those are left as is when using those APIs.
      
      In code comments, we use the lower-case spelling for the C concepts and
      keep the upper-case spelling for the SQL concepts.
      Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
  21. 07 Sep, 2017 1 commit
  22. 21 Jun, 2017 2 commits
    • Phase 3 of pgindent updates. · 382ceffd
      Tom Lane authored
      Don't move parenthesized lines to the left, even if that means they
      flow past the right margin.
      
      By default, BSD indent lines up statement continuation lines that are
      within parentheses so that they start just to the right of the preceding
      left parenthesis.  However, traditionally, if that resulted in the
      continuation line extending to the right of the desired right margin,
      then indent would push it left just far enough to not overrun the margin,
      if it could do so without making the continuation line start to the left of
      the current statement indent.  That makes for a weird mix of indentations
      unless one has been completely rigid about never violating the 80-column
      limit.
      
      This behavior has been pretty universally panned by Postgres developers.
      Hence, disable it with indent's new -lpl switch, so that parenthesized
      lines are always lined up with the preceding left paren.
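
      The effect, illustrated on a typical backend call (layout only):

      /* With -lpl, continuation lines stay aligned just inside the opening
       * paren, even when that pushes them past the 80-column margin: */
      ereport(ERROR,
              (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
               errmsg("a long message now simply overruns the right margin rather than being pushed left")));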
      
      This patch is much less interesting than the first round of indent
      changes, but also bulkier, so I thought it best to separate the effects.
      
      Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
      Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
    • Phase 2 of pgindent updates. · c7b8998e
      Tom Lane authored
      Change pg_bsd_indent to follow upstream rules for placement of comments
      to the right of code, and remove pgindent hack that caused comments
      following #endif to not obey the general rule.
      
      Commit e3860ffa wasn't actually using
      the published version of pg_bsd_indent, but a hacked-up version that
      tried to minimize the amount of movement of comments to the right of
      code.  The situation of interest is where such a comment has to be
      moved to the right of its default placement at column 33 because there's
      code there.  BSD indent has always moved right in units of tab stops
      in such cases --- but in the previous incarnation, indent was working
      in 8-space tab stops, while now it knows we use 4-space tabs.  So the
      net result is that in about half the cases, such comments are placed
      one tab stop left of before.  This is better all around: it leaves
      more room on the line for comment text, and it means that in such
      cases the comment uniformly starts at the next 4-space tab stop after
      the code, rather than sometimes one and sometimes two tabs after.
      
      Also, ensure that comments following #endif are indented the same
      as comments following other preprocessor commands such as #else.
      That inconsistency turns out to have been self-inflicted damage
      from a poorly-thought-through post-indent "fixup" in pgindent.
      
      This patch is much less interesting than the first round of indent
      changes, but also bulkier, so I thought it best to separate the effects.
      
      Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
      Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
  23. 10 Apr, 2017 1 commit
    • Improve castNode notation by introducing list-extraction-specific variants. · 8f0530f5
      Tom Lane authored
      This extends the castNode() notation introduced by commit 5bcab111 to
      provide, in one step, extraction of a list cell's pointer and coercion to
      a concrete node type.  For example, "lfirst_node(Foo, lc)" is the same
      as "castNode(Foo, lfirst(lc))".  Almost half of the uses of castNode
      that have appeared so far include a list extraction call, so this is
      pretty widely useful, and it saves a few more keystrokes compared to the
      old way.
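
      Side by side, per the equivalence stated above:

      /* one-step extraction plus type check ... */
      TargetEntry *tle = lfirst_node(TargetEntry, lc);

      /* ... equivalent to the older two-step spelling */
      TargetEntry *tle2 = castNode(TargetEntry, lfirst(lc));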
      
      As with the previous patch, back-patch the addition of these macros to
      pg_list.h, so that the notation will be available when back-patching.
      
      Patch by me, after an idea of Andrew Gierth's.
      
      Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
  24. 01 Apr, 2017 1 commit
    • Add infrastructure to support EphemeralNamedRelation references. · 18ce3a4a
      Kevin Grittner authored
      A QueryEnvironment concept is added, which allows new types of
      objects to be passed into queries from parsing on through
      execution.  At this point, the only thing implemented is a
      collection of EphemeralNamedRelation objects -- relations which
      can be referenced by name in queries, but do not exist in the
      catalogs.  The only type of ENR implemented is NamedTuplestore, but
      provision is made to add more types fairly easily.
      
      An ENR can carry its own TupleDesc or reference a relation in the
      catalogs by relid.
      
      Although these features can be used without SPI, convenience
      functions are added to SPI so that ENRs can easily be used by code
      run through SPI.
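
      A hedged sketch of handing a tuplestore to SPI as an ENR (the struct
      fields and the convenience function name are assumptions based on this
      description, not verified):

      /* Hedged sketch: expose a tuplestore to SPI queries under a name. */
      EphemeralNamedRelation enr = palloc0(sizeof(EphemeralNamedRelationData));

      enr->md.name = "new_rows";        /* how queries will refer to it */
      enr->md.enrtype = ENR_NAMED_TUPLESTORE;
      enr->md.tupdesc = tupdesc;        /* an ENR can carry its own TupleDesc */
      enr->reldata = tuplestore;
      SPI_register_relation(enr);
      /* now "SELECT ... FROM new_rows" works in queries run through SPI */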
      
      The initial use of all this is going to be transition tables in
      AFTER triggers, but that will be added to each PL as a separate
      commit.
      
      An incidental effect of this patch is to produce a more informative
      error message if an attempt is made to modify the contents of a CTE
      from a referencing DML statement.  No tests previously covered that
      possibility, so one is added.
      
      Kevin Grittner and Thomas Munro
      Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
      with valuable comments and suggestions from many others
  25. 23 Mar, 2017 1 commit
    • Allow for parallel execution whenever ExecutorRun() is done only once. · 691b8d59
      Robert Haas authored
      Previously, it was unsafe to execute a plan in parallel if
      ExecutorRun() might be called with a non-zero row count.  However,
      it's quite easy to fix things up so that we can support that case,
      provided that it is known that we will never call ExecutorRun() a
      second time for the same QueryDesc.  Add infrastructure to signal
      this, and cross-checks to make sure that a caller who claims this is
      true doesn't later renege.
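
      A hedged fragment of the signaling (the parameter name is assumed from
      this description):

      /* The caller promises this QueryDesc gets exactly one ExecutorRun()
       * call, so a nonzero row count no longer forces a serial plan. */
      ExecutorRun(queryDesc, ForwardScanDirection, 1, true /* execute_once */);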
      
      While that pattern never happens with queries received directly from a
      client -- there's no way to know whether multiple Execute messages
      will be sent unless the first one requests all the rows -- it's pretty
      common for queries originating from procedural languages, which often
      limit the result to a single tuple or to a user-specified number of
      tuples.
      
      This commit doesn't actually enable parallelism in any additional
      cases, because currently none of the places that would be able to
      benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the
      first place, but it makes it much more palatable to pass
      CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it
      eliminates some cases where we'd end up having to run the parallel
      plan serially.
      
      Patch by me, based on some ideas from Rafia Sabih and corrected by
      Rafia Sabih based on feedback from Dilip Kumar and myself.
      
      Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
  26. 23 Feb, 2017 1 commit
    • Consistently declare timestamp variables as TimestampTz. · c29aff95
      Tom Lane authored
      Twiddle the replication-related code so that its timestamp variables
      are declared TimestampTz, rather than the uninformative "int64" that
      was previously used for meant-to-be-always-integer timestamps.
      This resolves the int64-vs-TimestampTz declaration inconsistencies
      introduced by commit 7c030783, though in the opposite direction to
      what was originally suggested.
      
      This required including datatype/timestamp.h in a couple more places
      than before.  I decided it would be a good idea to slim down that
      header by not having it pull in <float.h> etc, as those headers are
      no longer at all relevant to its purpose.  Unsurprisingly, a small number
      of .c files turn out to have been depending on those inclusions, so add
      them back in the .c files as needed.
      
      Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
      Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us
  27. 27 Jan, 2017 1 commit
    • Use castNode() in a bunch of statement-list-related code. · 7afd56c3
      Tom Lane authored
      When I wrote commit ab1f0c82, I really missed the castNode() macro that
      Peter E. had proposed shortly before.  This back-fills the uses I would
      have put it to.  It's probably not all that significant, but there are
      more assertions here than there were before, and conceivably they will
      help catch any bugs associated with those representation changes.
      
      I left behind a number of usages like "(Query *) copyObject(query_var)".
      Those could have been converted as well, but Peter has proposed another
      notational improvement that would handle copyObject cases automatically,
      so I let that be for now.
  28. 14 Jan, 2017 1 commit
    • Change representation of statement lists, and add statement location info. · ab1f0c82
      Tom Lane authored
      This patch makes several changes that improve the consistency of
      representation of lists of statements.  It's always been the case
      that the output of parse analysis is a list of Query nodes, whatever
      the types of the individual statements in the list.  This patch brings
      similar consistency to the outputs of raw parsing and planning steps:
      
      * The output of raw parsing is now always a list of RawStmt nodes;
      the statement-type-dependent nodes are one level down from that.
      
      * The output of pg_plan_queries() is now always a list of PlannedStmt
      nodes, even for utility statements.  In the case of a utility statement,
      "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
      the utility node.  This list representation is now used in Portal and
      CachedPlan plan lists, replacing the former convention of intermixing
      PlannedStmts with bare utility-statement nodes.
      
      Now, every list of statements has a consistent head-node type depending
      on how far along it is in processing.  This allows changing many places
      that formerly used generic "Node *" pointers to use a more specific
      pointer type, thus reducing the number of IsA() tests and casts needed,
      as well as improving code clarity.
      
      Also, the post-parse-analysis representation of DECLARE CURSOR is changed
      so that it looks more like EXPLAIN, PREPARE, etc.  That is, the contained
      SELECT remains a child of the DeclareCursorStmt rather than getting flipped
      around to be the other way.  It's now true for both Query and PlannedStmt
      that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
      That allows simplifying a lot of places that were testing both fields.
      (I think some of those were just defensive programming, but in many places,
      it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
      
      Because PlannedStmt carries a canSetTag field, we're also able to get rid
      of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
      statement; specifically, the assumption that a utility is canSetTag if and
      only if it's the only one in its list.  While I see no near-term need for
      relaxing that restriction, it's nice to get rid of the ad-hocery.
      
      The API of ProcessUtility() is changed so that what it's passed is the
      wrapper PlannedStmt not just the bare utility statement.  This will affect
      all users of ProcessUtility_hook, but the changes are pretty trivial; see
      the affected contrib modules for examples of the minimum change needed.
      (Most compilers should give pointer-type-mismatch warnings for uncorrected
      code.)
      
      There's also a change in the API of ExplainOneQuery_hook, to pass through
      cursorOptions instead of expecting hook functions to know what to pick.
      This is needed because of the DECLARE CURSOR changes, but really should
      have been done in 9.6; it's unlikely that any extant hook functions
      know about using CURSOR_OPT_PARALLEL_OK.
      
      Finally, teach gram.y to save statement boundary locations in RawStmt
      nodes, and pass those through to Query and PlannedStmt nodes.  This allows
      more intelligent handling of cases where a source query string contains
      multiple statements.  This patch doesn't actually do anything with the
      information, but a follow-on patch will.  (Passing this information through
      cleanly is the true motivation for these changes; while I think this is all
      good cleanup, it's unlikely we'd have bothered without this end goal.)
      
      catversion bump because addition of location fields to struct Query
      affects stored rules.
      
      This patch is by me, but it owes a good deal to Fabien Coelho who did
      a lot of preliminary work on the problem, and also reviewed the patch.
      
      Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
  29. 03 Jan, 2017 1 commit
  30. 17 Nov, 2016 1 commit
    • Remove or reduce verbosity of some debug messages. · a43f1939
      Robert Haas authored
      The debug messages that merely print StartTransactionCommand,
      CommitTransactionCommand, ProcessUtilty, or ProcessQuery with no
      additional details seem to be useless.  Get rid of them.
      
      The transaction status messages produced by ShowTransactionState are
      occasionally useful, but they are extremely verbose, producing
      multiple lines of log output every time they fire, which can happens
      multiple times per transaction.  So, reduce the level to DEBUG5; avoid
      emitting an extra line just to explain which debug point is at issue;
      and tighten up the rest of the message so it doesn't use quite so much
      horizontal space.
      
      With these changes, it's possible to run a somewhat busy system with a
      log level even as high as DEBUG4, whereas previously anything above
      DEBUG2 would flood the log with output that probably wasn't really all
      that useful.
  31. 07 Aug, 2016 1 commit
    • Fix TOAST access failure in RETURNING queries. · 9ee1cf04
      Tom Lane authored
      Discussion of commit 3e2f3c2e exposed a problem that is of longer
      standing: since we don't detoast data while sticking it into a portal's
      holdStore for PORTAL_ONE_RETURNING and PORTAL_UTIL_SELECT queries, and we
      release the query's snapshot as soon as we're done loading the holdStore,
      later readout of the holdStore can do TOAST fetches against data that can
      no longer be seen by any of the session's live snapshots.  This means that
      a concurrent VACUUM could remove the TOAST data before we can fetch it.
      Commit 3e2f3c2e exposed the problem by showing that sometimes we had *no*
      live snapshots while fetching TOAST data, but we'd be at risk anyway.
      
      I believe this code was all right when written, because our management of a
      session's exposed xmin was such that the TOAST references were safe until
      end of transaction.  But that's no longer true now that we can advance or
      clear our PGXACT.xmin intra-transaction.
      
      To fix, copy the query's snapshot during FillPortalStore() and save it in
      the Portal; release it only when the portal is dropped.  This essentially
      implements a policy that we must hold a relevant snapshot whenever we
      access potentially-toasted data.  We had already come to that conclusion
      in other places, cf commits 08e261cb and ec543db7.
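
      A hedged fragment of the fix's shape (the field name is assumed from
      this description):

      /* In FillPortalStore(): keep a registered copy of the query snapshot
       * for the portal's whole lifetime, so later readout of the holdStore
       * can still fetch TOAST data the snapshot can see. */
      portal->holdSnapshot = RegisterSnapshot(GetActiveSnapshot());

      /* ... and release it only when the portal is dropped */
      if (portal->holdSnapshot)
          UnregisterSnapshot(portal->holdSnapshot);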
      
      I'd have liked to add a regression test case for this, but I didn't see
      a way to make one that's not unreasonably bloated; it seems to require
      returning a toasted value to the client, and those will be big.
      
      In passing, improve PortalRunUtility() so that it positively verifies
      that its ending PopActiveSnapshot() call will pop the expected snapshot,
      removing a rather shaky assumption about which utility commands might
      do their own PopActiveSnapshot().  There's no known bug here, but now
      that we're actively referencing the snapshot it's almost free to make
      this code a bit more bulletproof.
      
      We might want to consider back-patching something like this into older
      branches, but it would be prudent to let it prove itself more in HEAD
      beforehand.
      
      Discussion: <87vazemeda.fsf@credativ.de>
  32. 06 Jun, 2016 1 commit
    • Stop the executor if no more tuples can be sent from worker to leader. · c6dbf1fe
      Robert Haas authored
      If a Gather node has read as many tuples as it needs (for example, due
      to Limit) it may detach the queue connecting it to the worker before
      reading all of the worker's tuples.  Rather than let the worker
      continue to generate and send all of the results, have it stop after
      sending the next tuple.
      
      More could be done here to stop the worker even quicker, but this is
      about as well as we can hope to do for 9.6.
      
      This is in response to a problem report from Andreas Seltenreich.
      Commit 44339b89 should actually be
      sufficient to fix that example even without this change, but it seems
      better to do this, too, since we might otherwise waste quite a large
      amount of effort in one or more workers.
      
      Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com
      
      Amit Kapila
  33. 12 Mar, 2016 1 commit
    • Widen query numbers-of-tuples-processed counters to uint64. · 23a27b03
      Tom Lane authored
      This patch widens SPI_processed, EState's es_processed field, PortalData's
      portalPos field, FuncCallContext's call_cntr and max_calls fields,
      ExecutorRun's count argument, PortalRunFetch's result, and the max number
      of rows in a SPITupleTable to uint64, and deals with (I hope) all the
      ensuing fallout.  Some of these values were declared uint32 before, and
      others "long".
      
      I also removed PortalData's posOverflow field, since that logic seems
      pretty useless given that portalPos is now always 64 bits.
      
      The user-visible results are that command tags for SELECT etc will
      correctly report tuple counts larger than 4G, as will plpgsql's
      GET DIAGNOSTICS ... ROW_COUNT command.  Queries processing more tuples
      than that are still not exactly the norm, but they're becoming more
      common.
      
      Most values associated with FETCH/MOVE distances, such as PortalRun's count
      argument and the count argument of most SPI functions that have one, remain
      declared as "long".  It's not clear whether it would be worth promoting
      those to int64; but it would definitely be a large dollop of additional
      API churn on top of this, and it would only help 32-bit platforms which
      seem relatively less likely to see any benefit.
      
      Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
  34. 02 Jan, 2016 1 commit
  35. 04 Sep, 2015 1 commit
    • Fix subtransaction cleanup after an outer-subtransaction portal fails. · c5454f99
      Tom Lane authored
      Formerly, we treated only portals created in the current subtransaction as
      having failed during subtransaction abort.  However, if the error occurred
      while running a portal created in an outer subtransaction (ie, a cursor
      declared before the last savepoint), that has to be considered broken too.
      
      To allow reliable detection of which ones those are, add a bookkeeping
      field to struct Portal that tracks the innermost subtransaction in which
      each portal has actually been executed.  (Without this, we'd end up
      failing portals containing functions that had called the subtransaction,
      thereby breaking plpgsql exception blocks completely.)
      
      In addition, when we fail an outer-subtransaction Portal, transfer its
      resources into the subtransaction's resource owner, so that they're
      released early in cleanup of the subxact.  This fixes a problem reported by
      Jim Nasby in which a function executed in an outer-subtransaction cursor
      could cause an Assert failure or crash by referencing a relation created
      within the inner subtransaction.
      
      The proximate cause of the Assert failure is that AtEOSubXact_RelationCache
      assumed it could blow away a relcache entry without first checking that the
      entry had zero refcount.  That was a bad idea on its own terms, so add such
      a check there, and to the similar coding in AtEOXact_RelationCache.  This
      provides an independent safety measure in case there are still ways to
      provoke the situation despite the Portal-level changes.
      
      This has been broken since subtransactions were invented, so back-patch
      to all supported branches.
      
      Tom Lane and Michael Paquier
  36. 22 May, 2015 1 commit
    • Remove the new UPSERT command tag and use INSERT instead. · 631d7490
      Andres Freund authored
      Previously, INSERT with ON CONFLICT DO UPDATE specified used a new
      command tag -- UPSERT.  It was introduced out of concern that INSERT as
      a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as
      some affected rows may actually have been updated.
      
      Alvaro Herrera noticed that the implementation of that new command tag
      was incomplete; in subsequent discussion we concluded that having it
      doesn't provide benefits that are in line with the compatibility breaks
      it requires.
      
      Catversion bump due to the removal of PlannedStmt->isUpsert.
      
      Author: Peter Geoghegan
      Discussion: 20150520215816.GI5885@postgresql.org
  37. 08 May, 2015 1 commit
    • Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. · 168d5805
      Andres Freund authored
      The newly added ON CONFLICT clause allows specifying an alternative to
      raising a unique or exclusion constraint violation error when inserting.
      ON CONFLICT refers to constraints that can either be specified using an
      inference clause (by specifying the columns of a unique constraint) or
      by naming a unique or exclusion constraint.  DO NOTHING avoids the
      constraint violation, without touching the pre-existing row.  DO UPDATE
      SET ... [WHERE ...] updates the pre-existing tuple, and has access to
      both the tuple proposed for insertion and the existing tuple; the
      optional WHERE clause can be used to prevent an update from being
      executed.  The UPDATE SET and WHERE clauses have access to the tuple
      proposed for insertion using the "magic" EXCLUDED alias, and to the
      pre-existing tuple using the table name or its alias.
      
      This feature is often referred to as upsert.
      
      This is implemented using a new infrastructure called "speculative
      insertion". It is an optimistic variant of regular insertion that first
      does a pre-check for existing tuples and then attempts an insert.  If a
      violating tuple was inserted concurrently, the speculatively inserted
      tuple is deleted and a new attempt is made.  If the pre-check finds a
      matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
      If the insertion succeeds without detecting a conflict, the tuple is
      deemed inserted.
      
      To handle the possible ambiguity between the excluded alias and a table
      named excluded, and for convenience with long relation names, INSERT
      INTO now can alias its target table.
      
      Bumps catversion as stored rules change.
      
      Author: Peter Geoghegan, with significant contributions from Heikki
          Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
      Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
          Dean Rasheed, Stephen Frost and many others.