1. 07 Sep, 2021 1 commit
  2. 12 Aug, 2021 1 commit
    • Fix segfault during EvalPlanQual with mix of local and foreign partitions. · 6458ed18
      Heikki Linnakangas authored
      It's not sensible to re-evaluate a direct-modify Foreign Update or Delete
      during EvalPlanQual. However, ExecInitForeignScan() can still get called
      if a table mixes local and foreign partitions. EvalPlanQualStart() left
      the es_result_relations array uninitialized in the child EPQ EState, but
      ExecInitForeignScan() still expected to find it. That caused a segfault.
      
      Fix by skipping the es_result_relations lookup during EvalPlanQual
      processing. To make things a bit more robust, also skip the
      BeginDirectModify calls, and add a runtime check that ExecForeignScan()
      is not called on direct-modify foreign scans during EvalPlanQual
      processing.
      
      This is new in v14, commit 1375422c. Before that, EvalPlanQualStart()
      copied the whole ResultRelInfo array to the EPQ EState. Backpatch to v14.
      
      Report and diagnosis by Andrey Lepikhov.
      
      Discussion: https://www.postgresql.org/message-id/cb2b808d-cbaa-4772-76ee-c8809bafcf3d%40postgrespro.ru
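The guard described in this commit can be pictured with a small, self-contained C model. This is only a sketch under stated assumptions: `ToyEState`, `ToyForeignScanState`, `toy_init_foreign_scan` and `toy_exec_foreign_scan` are hypothetical stand-ins, not the executor's real structures or functions, and the error is modeled as a return code rather than a raised error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for executor state. */
typedef struct ToyEState
{
    bool        in_epq;             /* true inside an EvalPlanQual recheck */
    int        *result_relations;   /* not set up in the child EPQ state */
} ToyEState;

/* Hypothetical stand-in for a foreign scan's run-time state. */
typedef struct ToyForeignScanState
{
    bool        direct_modify;      /* plan is a direct-modify UPDATE/DELETE */
    int        *result_rel;         /* looked up only outside EPQ */
    bool        direct_modify_begun;
} ToyForeignScanState;

/* Initialization: skip the result-relation lookup and the "begin" step
 * while running under EPQ, since the array is not available there. */
void
toy_init_foreign_scan(ToyForeignScanState *node, const ToyEState *estate,
                      bool direct_modify, int rt_index)
{
    node->direct_modify = direct_modify;
    node->result_rel = NULL;
    node->direct_modify_begun = false;

    if (direct_modify && !estate->in_epq)
    {
        assert(estate->result_relations != NULL);
        node->result_rel = &estate->result_relations[rt_index];
        node->direct_modify_begun = true;
    }
}

/* Execution: returns 0 on success, -1 for the "must not happen" case of
 * running a direct-modify scan during EPQ (the run-time check). */
int
toy_exec_foreign_scan(const ToyForeignScanState *node, const ToyEState *estate)
{
    if (estate->in_epq && node->direct_modify)
        return -1;
    return 0;
}
```

In this model, initializing under EPQ never touches `result_relations`, which is the pointer whose absence caused the crash described above.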
  3. 12 May, 2021 1 commit
    • Fix EXPLAIN ANALYZE for async-capable nodes. · a363bc6d
      Etsuro Fujita authored
      EXPLAIN ANALYZE for an async-capable ForeignScan node associated with
      postgres_fdw is done just by using instrumentation for ExecProcNode()
      called from the node's callbacks, causing the following problems:
      
      1) If the remote table to scan is empty, the node is incorrectly
         considered as "never executed" by the command even if the node is
         executed, as ExecProcNode() isn't called from the node's callbacks at
         all in that case.
      2) The command fails to collect timings for things other than
         ExecProcNode() done in the node, such as creating a cursor for the
         node's remote query.
      
      To fix these problems, add instrumentation for async-capable nodes, and
      modify postgres_fdw accordingly.
      
      My oversight in commit 27e1f145.
      
      While at it, update a comment for the AsyncRequest struct in execnodes.h
      and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml
      to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix
      typos in comments in nodeAppend.c.
      
      Per report from Andrey Lepikhov, though I didn't use his patch.
      
      Reviewed-by: Andrey Lepikhov
      Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
  4. 31 Mar, 2021 1 commit
    • Add support for asynchronous execution. · 27e1f145
      Etsuro Fujita authored
      This implements asynchronous execution, which runs multiple parts of a
      non-parallel-aware Append concurrently rather than serially to improve
      performance when possible.  Currently, the only node type that can be
      run concurrently is a ForeignScan that is an immediate child of such an
      Append.  In the case where such ForeignScans access data on different
      remote servers, this would run those ForeignScans concurrently, and
      overlap the remote operations to be performed simultaneously, so it'll
      improve the performance especially when the operations involve
      time-consuming ones such as remote join and remote aggregation.
      
      We may extend this to other node types such as joins or aggregates over
      ForeignScans in the future.
      
      This also adds the support for postgres_fdw, which is enabled by the
      table-level/server-level option "async_capable".  The default is false.
      
      Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself.  This commit
      is mostly based on the patch proposed by Robert Haas, but also uses
      stuff from the patch proposed by Kyotaro Horiguchi and from the patch
      proposed by Thomas Munro.  Reviewed by Kyotaro Horiguchi, Konstantin
      Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and
      others.
      
      Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com
      Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com
      Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
  5. 02 Jan, 2021 1 commit
  6. 14 Oct, 2020 1 commit
  7. 01 Jan, 2020 1 commit
  8. 27 Feb, 2019 1 commit
    • Store table oid and tuple's tid in tuple slots directly. · b8d71745
      Andres Freund authored
      After the introduction of tuple table slots all table AMs need to
      support returning the table oid of the tuple stored in a slot created
      by said AM. It does not make sense to re-implement that in every AM,
      therefore move handling of table OIDs into the TupleTableSlot
      structure itself.  It's possible that we, at a later date, might want
      to get rid of HeapTupleData.t_tableOid entirely, but doing so before
      the abstractions for table AMs are integrated turns out to be too
      hard, so delay that for now.
      
      Similarly, every AM needs to support the concept of a tuple
      identifier (tid / item pointer) for its tuples. It's quite possible
      that we'll generalize the exact form of a tid at a future point (to
      allow for things like index organized tables), but for now many parts
      of the code know about tids, so there's not much point in abstracting
      tids away. Therefore also move the tid into the slot (rather than providing API to
      set/get the tid associated with the tuple in a slot).
      
      Once table AM includes insert/updating/deleting tuples, the
      responsibility to set the correct tid after such an action will move
      into that. After that change, code doing such modifications should
      not have to deal with HeapTuples directly anymore.
      
      Author: Andres Freund, Haribabu Kommi and Ashutosh Bapat
      Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
  9. 02 Jan, 2019 1 commit
  10. 21 Nov, 2018 1 commit
    • Remove WITH OIDS support, change oid catalog column visibility. · 578b2297
      Andres Freund authored
      Previously tables declared WITH OIDS, including a significant fraction
      of the catalog tables, stored the oid column not as a normal column,
      but as part of the tuple header.
      
      This special column was not shown by default, which was somewhat odd,
      as it's often (consider e.g. pg_class.oid) one of the more important
      parts of a row.  Neither pg_dump nor COPY included the contents of the
      oid column by default.
      
      The fact that the oid column was not an ordinary column necessitated a
      significant amount of special case code to support oid columns. That
      already was painful for the existing code, but upcoming work aiming to make
      table storage pluggable, would have required expanding and duplicating
      that "specialness" significantly.
      
      WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
      Remove it.
      
      Removing includes:
      - CREATE TABLE and ALTER TABLE syntax for declaring the table to be
        WITH OIDS has been removed (WITH (oids[ = true]) will error out)
      - pg_dump does not support dumping tables declared WITH OIDS and will
        issue a warning when dumping one (and ignore the oid column).
      - restoring a pg_dump archive with pg_restore will warn when
        restoring a table with oid contents (and ignore the oid column)
      - COPY will refuse to load binary dump that includes oids.
      - pg_upgrade will error out when encountering tables declared WITH
        OIDS, they have to be altered to remove the oid column first.
      - Functionality to access the oid of the last inserted row (like
        plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
      
      The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
      for CREATE TABLE) is still supported. While that requires a bit of
      support code, it seems unnecessary to break applications / dumps that
      do not use oids, and are explicit about not using them.
      
      The biggest user of WITH OID columns was postgres' catalog. This
      commit changes all 'magic' oid columns to be columns that are normally
      declared and stored. To reduce unnecessary query breakage all the
      newly added columns are still named 'oid', even if a table's column
      naming scheme would indicate 'reloid' or such.  This obviously
      requires adapting a lot of code, mostly replacing oid access via
      HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
      
      The bootstrap process now assigns oids for all oid columns in
      genbki.pl that do not have an explicit value (starting at the largest
      oid previously used), only oids assigned later by oids will be above
      FirstBootstrapObjectId. As the oid column now is a normal column the
      special bootstrap syntax for oids has been removed.
      
      Oids are not automatically assigned during insertion anymore, all
      backend code explicitly assigns oids with GetNewOidWithIndex(). For
      the rare case that insertions into the catalog via SQL are called for
      the new pg_nextoid() function can be used (which only works on catalog
      tables).
      
      The fact that oid columns on system tables are now normal columns
      means that they will be included in the set of columns expanded
      by * (i.e. SELECT * FROM pg_class will now include the table's oid,
      previously it did not). It'd not technically be hard to hide oid
      column by default, but that'd mean confusing behavior would either
      have to be carried forward forever, or it'd cause breakage down the
      line.
      
      While it's not unlikely that further adjustments are needed, the
      scope/invasiveness of the patch makes it worthwhile to get this merged
      now. It's painful to maintain externally, too complicated to commit
      after the code freeze, and a dependency of a number of other
      patches.
      
      Catversion bump, for obvious reasons.
      
      Author: Andres Freund, with contributions by John Naylor
      Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
  11. 16 Nov, 2018 1 commit
    • Introduce notion of different types of slots (without implementing them). · 1a0586de
      Andres Freund authored
      Upcoming work intends to allow pluggable ways to introduce new ways of
      storing table data. Accessing those table access methods from the
      executor requires TupleTableSlots to carry tuples in the native
      format of such storage methods; otherwise there'll be a significant
      conversion overhead.
      
      Different access methods will require different data to store tuples
      efficiently (just like virtual, minimal, heap already require fields
      in TupleTableSlot). To allow that without requiring additional pointer
      indirections, we want to have different structs (embedding
      TupleTableSlot) for different types of slots.  Thus different types of
      slots are needed, which requires adapting creators of slots.
      
      The slot that most efficiently can represent a type of tuple in an
      executor node will often depend on the type of slot a child node
      uses. Therefore we need to track the type of slot returned by
      nodes, so parent nodes can create slots based on that.
      
      Relatedly, JIT compilation of tuple deforming needs to know which type
      of slot a certain expression refers to, so it can create an
      appropriate deforming function for the type of tuple in the slot.
      
      But not all nodes will only return one type of slot, e.g. an append
      node will potentially return different types of slots for each of its
      subplans.
      
      Therefore add a function that allows querying the type of a node's
      result slot, and whether it'll always be the same type (whether it's
      fixed). This can be queried using ExecGetResultSlotOps().
      
      The scan, result, inner, outer type of slots are automatically
      inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(),
      left/right subtrees respectively. If that's not correct for a node,
      that can be overwritten using new fields in PlanState.
      
      This commit does not introduce the actually abstracted implementation
      of different kinds of TupleTableSlots; that will be left for a followup
      commit.  The different types of slots introduced will, for now, still
      use the same backing implementation.
      
      While this already partially invalidates the big comment in
      tuptable.h, it seems to make more sense to update it later, when the
      different TupleTableSlot implementations actually exist.
      
      Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar
      Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
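The idea of "different structs (embedding TupleTableSlot) for different types of slots" can be sketched in plain C as a base struct placed first in each specialized struct, plus a table of callbacks that identifies the slot type. Everything below (`ToySlot`, `ToySlotOps`, `ToyVirtualSlot`, `ToyPackedSlot`, `toy_slot_getattr`) is a hypothetical illustration, not the layout or API introduced by the commit.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToySlot ToySlot;

/* Per-slot-type callbacks; the pointer to this table identifies the type. */
typedef struct ToySlotOps
{
    const char *name;
    int       (*getattr) (const ToySlot *slot, int attnum);
} ToySlotOps;

/* Common part shared by every slot type. */
struct ToySlot
{
    const ToySlotOps *ops;
};

/* Two slot types that embed the common part as their first member, so a
 * pointer to either can be passed around as a ToySlot pointer without an
 * extra indirection. */
typedef struct ToyVirtualSlot
{
    ToySlot     base;
    int         values[4];      /* already-extracted column values */
} ToyVirtualSlot;

typedef struct ToyPackedSlot
{
    ToySlot     base;
    const unsigned char *data;  /* columns kept in a packed native format */
} ToyPackedSlot;

static int
virtual_getattr(const ToySlot *slot, int attnum)
{
    const ToyVirtualSlot *vslot = (const ToyVirtualSlot *) slot;

    return vslot->values[attnum];
}

static int
packed_getattr(const ToySlot *slot, int attnum)
{
    const ToyPackedSlot *pslot = (const ToyPackedSlot *) slot;

    return pslot->data[attnum];
}

const ToySlotOps ToyVirtualOps = {"virtual", virtual_getattr};
const ToySlotOps ToyPackedOps = {"packed", packed_getattr};

/* Callers dispatch through the ops table and need not know the slot type. */
int
toy_slot_getattr(const ToySlot *slot, int attnum)
{
    assert(slot != NULL && slot->ops != NULL);
    return slot->ops->getattr(slot, attnum);
}
```

A parent node that knows its child always returns one ops table (the "fixed" case in the message) could specialize for it; an Append-like node over mixed children could not, which is why the query function also reports whether the type is fixed.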
  12. 15 Nov, 2018 1 commit
    • Rejigger materializing and fetching a HeapTuple from a slot. · 763f2edd
      Andres Freund authored
      Previously materializing a slot always returned a HeapTuple. As
      current work aims to reduce the reliance on HeapTuples (so other
      storage systems can work efficiently), that needs to change. Thus
      split the tasks of materializing a slot (i.e. making it independent
      from the underlying storage / other memory contexts) from fetching a
      HeapTuple from the slot.  For brevity, allow fetching a HeapTuple from
      a slot and materializing the slot at the same time, controlled by a
      parameter.
      
      For now some callers of ExecFetchSlotHeapTuple, with materialize =
      true, expect that changes to the heap tuple will be reflected in the
      underlying slot.  Those places will be adapted in due course, so while
      not pretty, that's OK for now.
      
      Also rename ExecFetchSlotTuple to ExecFetchSlotHeapTuple and
      ExecFetchSlotTupleDatum to ExecFetchSlotHeapTupleDatum, as it's likely
      that future storage methods will need similar methods. There already
      is ExecFetchSlotMinimalTuple, so the new names make the naming scheme
      more coherent.
      
      Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar
      Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
  13. 10 Nov, 2018 1 commit
    • Don't require return slots for nodes without projection. · 1ef6bd29
      Andres Freund authored
      In a lot of nodes the return slot is not required. That can either be
      because the node doesn't do any projection (say an Append node), or
      because the node does perform projections but the projection is
      optimized away because the projection would yield an identical row.
      
      Slots aren't that small, especially for wide rows, so it's worthwhile
      to avoid creating them.  It's not possible to just skip creating the
      slot - it's currently used to determine the tuple descriptor returned
      by ExecGetResultType().  So separate the determination of the result
      type from the slot creation.  The work previously done internally by
      ExecInitResultTupleSlotTL() can now also be done separately with
      ExecInitResultTypeTL() and ExecInitResultSlot().  That way nodes that
      aren't guaranteed to need a result slot can use
      ExecInitResultTypeTL() to determine the result type of the node, and
      if ExecAssignScanProjectionInfo() (via
      ExecConditionalAssignProjectionInfo()) determines that a result slot
      is needed, it is created with ExecInitResultSlot().
      
      Besides the advantage of not creating slots that then are
      unused, this is necessary preparation for later patches around tuple
      table slot abstraction. In particular separating the return descriptor
      and slot is a prerequisite to allow JITing of tuple deforming with
      knowledge of the underlying tuple format, and to avoid unnecessarily
      creating JITed tuple deforming for virtual slots.
      
      This commit removes a redundant argument from
      ExecInitResultTupleSlotTL(). While this commit touches a lot of the
      relevant lines anyway, it'd normally still not be worthwhile to cause
      breakage, except that aforementioned later commits will touch *all*
      ExecInitResultTupleSlotTL() callers anyway (but fits worse
      thematically).
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
  14. 06 Oct, 2018 1 commit
  15. 04 Oct, 2018 1 commit
    • Centralize executor's opening/closing of Relations for rangetable entries. · 9ddef362
      Tom Lane authored
      Create an array estate->es_relations[] paralleling the es_range_table,
      and store references to Relations (relcache entries) there, so that any
      given RT entry is opened and closed just once per executor run.  Scan
      nodes typically still call ExecOpenScanRelation, but ExecCloseScanRelation
      is no more; relation closing is now done centrally in ExecEndPlan.
      
      This is slightly more complex than one would expect because of the
      interactions with relcache references held in ResultRelInfo nodes.
      The general convention is now that ResultRelInfo->ri_RelationDesc does
      not represent a separate relcache reference and so does not need to be
      explicitly closed; but there is an exception for ResultRelInfos in the
      es_trig_target_relations list, which are manufactured by
      ExecGetTriggerResultRel and have to be cleaned up by
      ExecCleanUpTriggerState.  (That much was true all along, but these
      ResultRelInfos are now more different from others than they used to be.)
      
      To allow the partition pruning logic to make use of es_relations[] rather
      than having its own relcache references, adjust PartitionedRelPruneInfo
      to store an RT index rather than a relation OID.
      
      Amit Langote, reviewed by David Rowley and Jesper Pedersen,
      some mods by me
      
      Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
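The "open once, close centrally" convention can be modeled as an array indexed by range-table index that is filled lazily and emptied in one place at the end of execution. The sketch below is a toy under stated assumptions: `ToyEState`, `toy_get_range_table_relation` and `toy_end_plan` are hypothetical names, and "opening" a relation is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RTE 8

typedef struct ToyRelation
{
    int         rti;            /* range-table index this entry belongs to */
} ToyRelation;

/* Hypothetical executor state: relations[] parallels the range table. */
typedef struct ToyEState
{
    int         nrte;
    ToyRelation storage[TOY_MAX_RTE];
    ToyRelation *relations[TOY_MAX_RTE];    /* NULL until first use */
    int         open_calls;
    int         close_calls;
} ToyEState;

void
toy_estate_init(ToyEState *estate, int nrte)
{
    assert(nrte >= 0 && nrte <= TOY_MAX_RTE);
    estate->nrte = nrte;
    estate->open_calls = 0;
    estate->close_calls = 0;
    for (int i = 0; i < TOY_MAX_RTE; i++)
        estate->relations[i] = NULL;
}

/* Return the relation for a 1-based range-table index, opening it only the
 * first time it is requested during this executor run. */
ToyRelation *
toy_get_range_table_relation(ToyEState *estate, int rti)
{
    assert(rti >= 1 && rti <= estate->nrte);
    if (estate->relations[rti - 1] == NULL)
    {
        estate->storage[rti - 1].rti = rti;
        estate->relations[rti - 1] = &estate->storage[rti - 1];
        estate->open_calls++;
    }
    return estate->relations[rti - 1];
}

/* Central cleanup: every relation opened during the run is closed here,
 * so individual scan nodes have nothing to close. */
void
toy_end_plan(ToyEState *estate)
{
    for (int i = 0; i < estate->nrte; i++)
    {
        if (estate->relations[i] != NULL)
        {
            estate->close_calls++;
            estate->relations[i] = NULL;
        }
    }
}
```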
  16. 26 Mar, 2018 1 commit
    • JIT tuple deforming in LLVM JIT provider. · 32af96b2
      Andres Freund authored
      Performing JIT compilation for deforming gains performance benefits
      over unJITed deforming from compile-time knowledge of the tuple
      descriptor. Fixed column widths, NOT NULLness, etc can be taken
      advantage of.
      
      Right now the JITed deforming is only used when deforming tuples as
      part of expression evaluation (and obviously only if the descriptor is
      known). It's likely to be beneficial in other cases, too.
      
      By default tuple deforming is JITed whenever an expression is JIT
      compiled. There's a separate boolean GUC controlling it, but that's
      expected to be primarily useful for development and benchmarking.
      
      Docs will follow in a later commit containing docs for the whole JIT
      feature.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
  17. 17 Feb, 2018 1 commit
    • Allow tupleslots to have a fixed tupledesc, use in executor nodes. · ad7dbee3
      Andres Freund authored
      The reason for doing so is that it will allow expression evaluation to
      optimize based on the underlying tupledesc. In particular it will
      allow to JIT tuple deforming together with the expression itself.
      
      For that expression initialization needs to be moved after the
      relevant slots are initialized - mostly unproblematic, except in the
      case of nodeWorktablescan.c.
      
      After doing so there's no need for ExecAssignResultType() and
      ExecAssignResultTypeFromTL() anymore, as all former callers have been
      converted to create a slot with a fixed descriptor.
      
      When creating a slot with a fixed descriptor, tts_values/isnull can be
      allocated together with the main slot, reducing allocation overhead
      and increasing cache density a bit.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20171206093717.vqdxe5icqttpxs3p@alap3.anarazel.de
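The allocation benefit mentioned in the last paragraph (values/isnull arrays allocated together with the slot) can be illustrated with a single-chunk layout. This is a minimal sketch, assuming a hypothetical `ToySlot` with `long` values; the real slot uses different types and the backend's own memory allocator rather than `calloc`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical slot whose arrays live in the same allocation as the header. */
typedef struct ToySlot
{
    int         natts;
    long       *values;         /* points into the same chunk */
    bool       *isnull;         /* points into the same chunk */
} ToySlot;

/* With a descriptor known up front, the number of attributes is fixed, so
 * header + values[] + isnull[] can be carved out of one allocation.
 * Returns NULL on allocation failure; release with a single free(). */
ToySlot *
toy_make_fixed_slot(int natts)
{
    size_t      align;
    size_t      values_off;
    size_t      isnull_off;
    char       *chunk;
    ToySlot    *slot;

    assert(natts > 0);

    /* Round the header size up so values[] is suitably aligned. */
    align = _Alignof(long);
    values_off = (sizeof(ToySlot) + align - 1) / align * align;
    isnull_off = values_off + (size_t) natts * sizeof(long);

    chunk = (char *) calloc(1, isnull_off + (size_t) natts * sizeof(bool));
    if (chunk == NULL)
        return NULL;

    slot = (ToySlot *) chunk;
    slot->natts = natts;
    slot->values = (long *) (chunk + values_off);
    slot->isnull = (bool *) (chunk + isnull_off);
    return slot;
}
```

One allocation instead of three means one allocator call and arrays that sit next to the header in memory, which is the "cache density" point made in the message.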
  18. 03 Jan, 2018 1 commit
  19. 17 Nov, 2017 1 commit
  20. 30 Aug, 2017 1 commit
    • Separate reinitialization of shared parallel-scan state from ExecReScan. · 41b0dd98
      Tom Lane authored
      Previously, the parallel executor logic did reinitialization of shared
      state within the ExecReScan code for parallel-aware scan nodes.  This is
      problematic, because it means that the ExecReScan call has to occur
      synchronously (ie, during the parent Gather node's ReScan call).  That is
      swimming very much against the tide so far as the ExecReScan machinery is
      concerned; the fact that it works at all today depends on a lot of fragile
      assumptions, such as that no plan node between Gather and a parallel-aware
      scan node is parameterized.  Another objection is that because ExecReScan
      might be called in workers as well as the leader, hacky extra tests are
      needed in some places to prevent unwanted shared-state resets.
      
      Hence, let's separate this code into two functions, a ReInitializeDSM
      call and the ReScan call proper.  ReInitializeDSM is called only in
      the leader and is guaranteed to run before we start new workers.
      ReScan is returned to its traditional function of resetting only local
      state, which means that ExecReScan's usual habits of delaying or
      eliminating child rescan calls are safe again.
      
      As with the preceding commit 7df2c1f8, it doesn't seem to be necessary
      to make these changes in 9.6, which is a good thing because the FDW and
      CustomScan APIs are impacted.
      
      Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
  21. 30 Jul, 2017 1 commit
  22. 05 Jun, 2017 1 commit
    • Don't be so trusting that shm_toc_lookup() will always succeed. · d4663350
      Tom Lane authored
      Given the possibility of race conditions and so on, it seems entirely
      unsafe to just assume that shm_toc_lookup() always finds the key it's
      looking for --- but that was exactly what all but one call site were
      doing.  To fix, add a "bool noError" argument, similarly to what we
      have in many other functions, and throw an error on an unexpected
      lookup failure.  Remove now-redundant Asserts that a rather random
      subset of call sites had.
      
      I doubt this will throw any light on buildfarm member lorikeet's
      recent failures, because if an unnoticed lookup failure were involved,
      you'd kind of expect a null-pointer-dereference crash rather than the
      observed symptom.  But you never know ... and this is better coding
      practice even if it never catches anything.
      
      Discussion: https://postgr.es/m/9697.1496675981@sss.pgh.pa.us
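The calling convention of a lookup with a "bool noError" argument can be sketched as follows. This is a toy model, not the real shm_toc code: `ToyToc` and `toy_toc_lookup` are hypothetical, and where the real function would throw an error that does not return, the sketch only counts the failure so that the behavior can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyTocEntry
{
    uint64_t    key;
    void       *address;
} ToyTocEntry;

/* Hypothetical table of contents for a shared memory segment. */
typedef struct ToyToc
{
    size_t      nentries;
    const ToyTocEntry *entries;
    int         errors_raised;  /* stand-in for throwing an error */
} ToyToc;

/* Look up a key.  With noError = true a missing key yields NULL and the
 * caller must cope; with noError = false a missing key is reported as an
 * error instead of being silently handed back as NULL. */
void *
toy_toc_lookup(ToyToc *toc, uint64_t key, bool noError)
{
    assert(toc != NULL);

    for (size_t i = 0; i < toc->nentries; i++)
    {
        if (toc->entries[i].key == key)
            return toc->entries[i].address;
    }

    if (!noError)
        toc->errors_raised++;   /* the real code would not return here */
    return NULL;
}
```

Call sites that cannot proceed without the entry pass `false` and may then drop their own null checks; only call sites with a genuine fallback pass `true`.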
  23. 25 Mar, 2017 1 commit
    • Faster expression evaluation and targetlist projection. · b8d7f053
      Andres Freund authored
      This replaces the old, recursive tree-walk based evaluation, with
      non-recursive, opcode dispatch based, expression evaluation.
      Projection is now implemented as part of expression evaluation.
      
      This both leads to significant performance improvements, and makes
      future just-in-time compilation of expressions easier.
      
      The speed gains primarily come from:
      - non-recursive implementation reduces stack usage / overhead
      - simple sub-expressions are implemented with a single jump, without
        function calls
      - sharing some state between different sub-expressions
      - reduced amount of indirect/hard to predict memory accesses by laying
        out operation metadata sequentially; including the avoidance of
        nearly all of the previously used linked lists
      - more code has been moved to expression initialization, avoiding
        constant re-checks at evaluation time
      
      Future just-in-time compilation (JIT) has become easier, as
      demonstrated by released patches intended to be merged in a later
      release, for primarily two reasons: Firstly, due to a stricter split
      between expression initialization and evaluation, less code has to be
      handled by the JIT. Secondly, due to the non-recursive nature of the
      generated "instructions", less performance-critical code-paths can
      easily be shared between interpreted and compiled evaluation.
      
      The new framework allows for significant future optimizations. E.g.:
      - basic infrastructure to later reduce the per executor-startup
        overhead of expression evaluation, by caching state in prepared
        statements.  That'd be helpful in OLTPish scenarios where
        initialization overhead is measurable.
      - optimizing the generated "code". A number of proposals for potential
        work has already been made.
      - optimizing the interpreter. Similarly a number of proposals have
        been made here too.
      
      The move of logic into the expression initialization step leads to some
      backward-incompatible changes:
      - Function permission checks are now done during expression
        initialization, whereas previously they were done during
        execution. In edge cases this can lead to errors being raised that
        previously wouldn't have been, e.g. a NULL array being coerced to a
        different array type previously didn't perform checks.
      - The set of domain constraints to be checked, is now evaluated once
        during expression initialization, previously it was re-built
        every time a domain check was evaluated. For normal queries this
        doesn't change much, but e.g. for plpgsql functions, which cache
        ExprStates, the old set could stick around longer.  The behavior
        around this might still change.
      
      Author: Andres Freund, with significant changes by Tom Lane,
      	changes by Heikki Linnakangas
      Reviewed-By: Tom Lane, Heikki Linnakangas
      Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
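The contrast between a recursive tree walk and "non-recursive, opcode dispatch based" evaluation can be shown with a tiny interpreter over a flat array of steps. This is a sketch, not the commit's design: it uses a small operand stack for brevity, and the names `ToyOp`, `ToyStep` and `toy_eval` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ToyOp
{
    TOY_CONST,                  /* push a constant */
    TOY_VAR,                    /* push a column of the current row */
    TOY_ADD,                    /* pop two values, push their sum */
    TOY_MUL,                    /* pop two values, push their product */
    TOY_DONE                    /* stop and return the result */
} ToyOp;

/* One step of the "program"; steps are laid out sequentially in memory. */
typedef struct ToyStep
{
    ToyOp       op;
    long        arg;            /* constant value or column index */
} ToyStep;

/* Evaluate a step array in a single loop.  All decisions about what the
 * expression looks like were made when the array was built, so evaluation
 * is just dispatch on the opcode, with no recursion. */
long
toy_eval(const ToyStep *steps, const long *row)
{
    long        stack[16];
    int         sp = 0;

    for (const ToyStep *step = steps;; step++)
    {
        switch (step->op)
        {
            case TOY_CONST:
                assert(sp < 16);
                stack[sp++] = step->arg;
                break;
            case TOY_VAR:
                assert(sp < 16);
                stack[sp++] = row[step->arg];
                break;
            case TOY_ADD:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] += stack[sp];
                break;
            case TOY_MUL:
                assert(sp >= 2);
                sp--;
                stack[sp - 1] *= stack[sp];
                break;
            case TOY_DONE:
                assert(sp == 1);
                return stack[0];
        }
    }
}
```

For example, the expression (col0 + 2) * col1 becomes the steps VAR 0, CONST 2, ADD, VAR 1, MUL, DONE, built once at initialization and then run for every row.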
  24. 26 Feb, 2017 1 commit
  25. 19 Jan, 2017 1 commit
  26. 03 Jan, 2017 1 commit
  27. 09 Jun, 2016 1 commit
  28. 18 Mar, 2016 1 commit
    • Directly modify foreign tables. · 0bf3ae88
      Robert Haas authored
      postgres_fdw can now send an UPDATE or DELETE statement directly to
      the foreign server in simple cases, rather than sending a SELECT FOR
      UPDATE statement and then updating or deleting rows one-by-one.
      
      Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro
      Horiguchi, Albe Laurenz, Thom Brown, and me.
  29. 03 Feb, 2016 1 commit
    • Allow parallel custom and foreign scans. · 69d34408
      Robert Haas authored
      This patch doesn't put the new infrastructure to use anywhere, and
      indeed it's not clear how it could ever be used for something like
      postgres_fdw which has to send an SQL query and wait for a reply,
      but there might be FDWs or custom scan providers that are CPU-bound,
      so let's give them a way to join club parallel.
      
      KaiGai Kohei, reviewed by me.
  30. 02 Jan, 2016 1 commit
  31. 08 Dec, 2015 1 commit
    • Allow foreign and custom joins to handle EvalPlanQual rechecks. · 385f337c
      Robert Haas authored
      Commit e7cb7ee1 provided basic
      infrastructure for allowing a foreign data wrapper or custom scan
      provider to replace a join of one or more tables with a scan.
      However, this infrastructure failed to take into account the need
      for possible EvalPlanQual rechecks, and ExecScanFetch would fail
      an assertion (or just overwrite memory) if such a check was attempted
      for a plan containing a pushed-down join.  To fix, adjust the EPQ
      machinery to skip some processing steps when scanrelid == 0, making
      those the responsibility of scan's recheck method, which also has
      the responsibility in this case of correctly populating the relevant
      slot.
      
      To allow foreign scans to gain control in the right place to make
      use of this new facility, add a new, optional RecheckForeignScan
      method.  Also, allow a foreign scan to have a child plan, which can
      be used to correctly populate the slot (or perhaps for something
      else, but this is the only use currently envisioned).
      
      KaiGai Kohei, reviewed by Robert Haas, Etsuro Fujita, and Kyotaro
      Horiguchi.
  32. 15 Oct, 2015 1 commit
    • Allow FDWs to push down quals without breaking EvalPlanQual rechecks. · 5fc4c26d
      Robert Haas authored
      This fixes a long-standing bug which was discovered while investigating
      the interaction between the new join pushdown code and the EvalPlanQual
      machinery: if a ForeignScan appears on the inner side of a parameterized
      nestloop, an EPQ recheck would re-return the original tuple even if
      it no longer satisfied the pushed-down quals due to changed parameter
      values.
      
      This fix adds a new member to ForeignScan and ForeignScanState and a
      new argument to make_foreignscan, and requires changes to FDWs which
      push down quals to populate that new argument with a list of quals they
      have chosen to push down.  Therefore, I'm only back-patching to 9.5,
      even though the bug is not new in 9.5.
      
      Etsuro Fujita, reviewed by me and by Kyotaro Horiguchi.
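The bug and its fix can be pictured with a toy recheck routine: the remote side applied a pushed-down qual when the tuple was first fetched, so on an EPQ recheck the local side must re-apply the same quals against the current parameter value. The sketch below is only an illustration; `ToyForeignScan`, `ToyQual` and `toy_recheck_foreign_tuple` are hypothetical names, and real quals are expression trees, not function pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyTuple
{
    int         val;
} ToyTuple;

/* A pushed-down qual that depends on a nestloop parameter. */
typedef bool (*ToyQual) (const ToyTuple *tuple, int param);

/* Hypothetical scan node: it remembers which quals were pushed down to the
 * remote side so that they can be re-evaluated locally during a recheck. */
typedef struct ToyForeignScan
{
    const ToyQual *recheck_quals;
    size_t      nquals;
} ToyForeignScan;

/* Example qual: "val > $param". */
bool
toy_val_greater_than_param(const ToyTuple *tuple, int param)
{
    return tuple->val > param;
}

/* Recheck: the tuple passes only if every pushed-down qual still holds for
 * the current parameter value. */
bool
toy_recheck_foreign_tuple(const ToyForeignScan *scan, const ToyTuple *tuple,
                          int param)
{
    assert(scan != NULL && tuple != NULL);

    for (size_t i = 0; i < scan->nquals; i++)
    {
        if (!scan->recheck_quals[i] (tuple, param))
            return false;
    }
    return true;
}
```

Without the remembered list of quals, the recheck has nothing to evaluate and the originally fetched tuple is returned again even though the parameter has changed, which is the behavior the commit describes.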
  33. 10 May, 2015 1 commit
    • Code review for foreign/custom join pushdown patch. · 1a8a4e5c
      Tom Lane authored
      Commit e7cb7ee1 included some design
      decisions that seem pretty questionable to me, and there was quite a lot
      of stuff not to like about the documentation and comments.  Clean up
      as follows:
      
      * Consider foreign joins only between foreign tables on the same server,
      rather than between any two foreign tables with the same underlying FDW
      handler function.  In most if not all cases, the FDW would simply have had
      to apply the same-server restriction itself (far more expensively, both for
      lack of caching and because it would be repeated for each combination of
      input sub-joins), or else risk nasty bugs.  Anyone who's really intent on
      doing something outside this restriction can always use the
      set_join_pathlist_hook.
      
      * Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist
      to better reflect what they're for, and allow these custom scan tlists
      to be used even for base relations.
      
      * Change make_foreignscan() API to include passing the fdw_scan_tlist
      value, since the FDW is required to set that.  Backwards compatibility
      doesn't seem like an adequate reason to expect FDWs to set it in some
      ad-hoc extra step, and anyway existing FDWs can just pass NIL.
      
      * Change the API of path-generating subroutines of add_paths_to_joinrel,
      and in particular that of GetForeignJoinPaths and set_join_pathlist_hook,
      so that various less-used parameters are passed in a struct rather than
      as separate parameter-list entries.  The objective here is to reduce the
      probability that future additions to those parameter lists will result in
      source-level API breaks for users of these hooks.  It's possible that this
      is even a small win for the core code, since most CPU architectures can't
      pass more than half a dozen parameters efficiently anyway.  I kept root,
      joinrel, outerrel, innerrel, and jointype as separate parameters to reduce
      code churn in joinpath.c --- in particular, putting jointype into the
      struct would have been problematic because of the subroutines' habit of
      changing their local copies of that variable.
      
      * Avoid ad-hocery in ExecAssignScanProjectionInfo.  It was probably all
      right for it to know about IndexOnlyScan, but if the list is to grow
      we should refactor the knowledge out to the callers.
      
      * Restore nodeForeignscan.c's previous use of the relcache to avoid
      extra GetFdwRoutine lookups for base-relation scans.
      
      * Lots of cleanup of documentation and missed comments.  Re-order some
      code additions into more logical places.
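
The idea of passing less-used parameters in a struct, mentioned in the list above, can be sketched roughly as follows. All names here (ToyJoinType, ToyJoinPathExtra, toy_estimate_join_rows) are hypothetical and chosen for illustration; this is not the actual planner API. The point is only that a new field can be added to the struct later without changing the function signature that hook users compile against.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical join types for the toy example. */
typedef enum ToyJoinType
{
    TOY_JOIN_INNER,
    TOY_JOIN_LEFT
} ToyJoinType;

/* Less-used inputs are bundled here. Adding a field later does not
 * change the signature of toy_estimate_join_rows(). */
typedef struct ToyJoinPathExtra
{
    bool        inner_is_unique;        /* at most one inner match per outer row */
    int         selectivity_percent;    /* share of row pairs that join, 0..100 */
} ToyJoinPathExtra;

/* Frequently used inputs stay as separate parameters; jointype is
 * passed by value, so a callee could change its local copy freely. */
long
toy_estimate_join_rows(long outer_rows, long inner_rows,
                       ToyJoinType jointype,
                       const ToyJoinPathExtra *extra)
{
    long        rows;

    if (extra->inner_is_unique)
        rows = outer_rows;
    else
        rows = outer_rows * inner_rows * extra->selectivity_percent / 100;

    /* A left join emits at least one row per outer row. */
    if (jointype == TOY_JOIN_LEFT && rows < outer_rows)
        rows = outer_rows;

    return rows;
}
```

A caller fills in one ToyJoinPathExtra and passes its address; callers that do not care about a newly added field simply leave it zero-initialized.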
      1a8a4e5c
  34. 01 May, 2015 1 commit
      Allow FDWs and custom scan providers to replace joins with scans. · e7cb7ee1
      Robert Haas authored
      Foreign data wrappers can use this capability for so-called "join
      pushdown"; that is, instead of executing two separate foreign scans
      and then joining the results locally, they can generate a path which
      performs the join on the remote server and then is scanned locally.
      This commit does not extend postgres_fdw to take advantage of this
      capability; it just provides the infrastructure.
      
      Custom scan providers can use this in a similar way.  Previously,
      it was only possible for a custom scan provider to scan a single
      relation.  Now, it can scan an entire join tree, provided of course
      that it knows how to produce the same results that the join would
      have produced if executed normally.
      
      KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
      e7cb7ee1
  35. 06 Jan, 2015 1 commit
  36. 06 May, 2014 1 commit
      pgindent run for 9.4 · 0a783200
      Bruce Momjian authored
      This includes removing tabs after periods in C comments, which was
      applied to back branches, so this change should not affect backpatching.
      0a783200
  37. 07 Jan, 2014 1 commit
  38. 27 Apr, 2013 1 commit
      Incidental cleanup of matviews code. · 5194024d
      Tom Lane authored
      Move checking for unscannable matviews into ExecOpenScanRelation, which is
      a better place for it first because the open relation is already available
      (saving a relcache lookup cycle), and second because this eliminates the
      problem of telling the difference between rangetable entries that will or
      will not be scanned by the query.  In particular we can get rid of the
      not-terribly-well-thought-out-or-implemented isResultRel field that the
      initial matviews patch added to RangeTblEntry.
      
      Also get rid of entirely unnecessary scannability check in the rewriter,
      and a bogus decision about whether RefreshMatViewStmt requires a parse-time
      snapshot.
      
      catversion bump due to removal of a RangeTblEntry field, which changes
      stored rules.
      5194024d
  39. 10 Mar, 2013 1 commit
      Support writable foreign tables. · 21734d2f
      Tom Lane authored
      This patch adds the core-system infrastructure needed to support updates
      on foreign tables, and extends contrib/postgres_fdw to allow updates
      against remote Postgres servers.  There's still a great deal of room for
      improvement in optimization of remote updates, but at least there's basic
      functionality there now.
      
      KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
      heavily revised by Tom Lane.
      21734d2f
  40. 07 Mar, 2013 1 commit