Commit afb9249d authored by Tom Lane

Add support for doing late row locking in FDWs.

Previously, FDWs could only do "early row locking", that is lock a row as
soon as it's fetched, even though local restriction/join conditions might
discard the row later.  This patch adds callbacks that allow FDWs to do
late locking in the same way that it's done for regular tables.

To make use of this feature, an FDW must support the "ctid" column as a
unique row identifier.  Currently, since ctid has to be of type TID,
the feature is of limited use, though in principle it could be used by
postgres_fdw.  We may eventually allow FDWs to specify another data type
for ctid, which would make it possible for more FDWs to use this feature.

This commit does not modify postgres_fdw to use late locking.  We've
tested some prototype code for that, but it's not in committable shape,
and besides it's quite unclear whether it actually makes sense to do late
locking against a remote server.  The extra round trips required are likely
to outweigh any benefit from improved concurrency.

Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
parent aa4a0b95
......@@ -665,6 +665,108 @@ IsForeignRelUpdatable (Relation rel);
</sect2>
<sect2 id="fdw-callbacks-row-locking">
<title>FDW Routines For Row Locking</title>
<para>
If an FDW wishes to support <firstterm>late row locking</> (as described
in <xref linkend="fdw-row-locking">), it must provide the following
callback functions:
</para>
<para>
<programlisting>
RowMarkType
GetForeignRowMarkType (RangeTblEntry *rte,
LockClauseStrength strength);
</programlisting>
Report which row-marking option to use for a foreign table.
<literal>rte</> is the <structname>RangeTblEntry</> node for the table
and <literal>strength</> describes the lock strength requested by the
relevant <literal>FOR UPDATE/SHARE</> clause, if any. The result must be
a member of the <literal>RowMarkType</> enum type.
</para>
<para>
This function is called during query planning for each foreign table that
appears in an <command>UPDATE</>, <command>DELETE</>, or <command>SELECT
FOR UPDATE/SHARE</> query and is not the target of <command>UPDATE</>
or <command>DELETE</>.
</para>
<para>
If the <function>GetForeignRowMarkType</> pointer is set to
<literal>NULL</>, the <literal>ROW_MARK_COPY</> option is always used.
(This implies that <function>RefetchForeignRow</> will never be called,
so it need not be provided either.)
</para>
<para>
See <xref linkend="fdw-row-locking"> for more information.
</para>
<para>
<programlisting>
HeapTuple
RefetchForeignRow (EState *estate,
ExecRowMark *erm,
Datum rowid,
bool *updated);
</programlisting>
Re-fetch one tuple from the foreign table, after locking it if required.
<literal>estate</> is global execution state for the query.
<literal>erm</> is the <structname>ExecRowMark</> struct describing
the target foreign table and the row lock type (if any) to acquire.
<literal>rowid</> identifies the tuple to be fetched.
<literal>updated</> is an output parameter.
</para>
<para>
This function should return a palloc'ed copy of the fetched tuple,
or <literal>NULL</> if the row lock couldn't be obtained. The row lock
type to acquire is defined by <literal>erm-&gt;markType</>, which is the
value previously returned by <function>GetForeignRowMarkType</>.
(<literal>ROW_MARK_REFERENCE</> means to just re-fetch the tuple without
acquiring any lock, and <literal>ROW_MARK_COPY</> will never be seen by
this routine.)
</para>
<para>
In addition, <literal>*updated</> should be set to <literal>true</>
if what was fetched was an updated version of the tuple rather than
the same version previously obtained. (If the FDW cannot be sure about
this, always returning <literal>true</> is recommended.)
</para>
<para>
Note that by default, failure to acquire a row lock should result in
raising an error; a <literal>NULL</> return is only appropriate if
the <literal>SKIP LOCKED</> option is specified
by <literal>erm-&gt;waitPolicy</>.
</para>
<para>
The <literal>rowid</> is the <structfield>ctid</> value previously read
for the row to be re-fetched. Although the <literal>rowid</> value is
passed as a <type>Datum</>, it can currently only be a <type>tid</>. The
function API is chosen in hopes that it may be possible to allow other
datatypes for row IDs in future.
</para>
<para>
If the <function>RefetchForeignRow</> pointer is set to
<literal>NULL</>, attempts to re-fetch rows will fail
with an error message.
</para>
<para>
See <xref linkend="fdw-row-locking"> for more information.
</para>
</sect2>
<sect2 id="fdw-callbacks-explain">
<title>FDW Routines for <command>EXPLAIN</></title>
......@@ -1092,24 +1194,6 @@ GetForeignServerByName(const char *name, bool missing_ok);
structures that <function>copyObject</> knows how to copy.
</para>
<para>
For an <command>UPDATE</> or <command>DELETE</> against an external data
source that supports concurrent updates, it is recommended that the
<literal>ForeignScan</> operation lock the rows that it fetches, perhaps
via the equivalent of <command>SELECT FOR UPDATE</>. The FDW may also
choose to lock rows at fetch time when the foreign table is referenced
in a <command>SELECT FOR UPDATE/SHARE</>; if it does not, the
<literal>FOR UPDATE</> or <literal>FOR SHARE</> option is essentially a
no-op so far as the foreign table is concerned. This behavior may yield
semantics slightly different from operations on local tables, where row
locking is customarily delayed as long as possible: remote rows may get
locked even though they subsequently fail locally-applied restriction or
join conditions. However, matching the local semantics exactly would
require an additional remote access for every row, and might be
impossible anyway depending on what locking semantics the external data
source provides.
</para>
<para>
<command>INSERT</> with an <literal>ON CONFLICT</> clause does not
support specifying the conflict target, as remote constraints are not
......@@ -1117,6 +1201,118 @@ GetForeignServerByName(const char *name, bool missing_ok);
UPDATE</> is not supported, since the specification is mandatory there.
</para>
</sect1>
<sect1 id="fdw-row-locking">
<title>Row Locking in Foreign Data Wrappers</title>
<para>
If an FDW's underlying storage mechanism has a concept of locking
individual rows to prevent concurrent updates of those rows, it is
usually worthwhile for the FDW to perform row-level locking with as
close an approximation as practical to the semantics used in
ordinary <productname>PostgreSQL</> tables. There are multiple
considerations involved in this.
</para>
<para>
One key decision to be made is whether to perform <firstterm>early
locking</> or <firstterm>late locking</>. In early locking, a row is
locked when it is first retrieved from the underlying store, while in
late locking, the row is locked only when it is known that it needs to
be locked. (The difference arises because some rows may be discarded by
locally-checked restriction or join conditions.) Early locking is much
simpler and avoids extra round trips to a remote store, but it can cause
locking of rows that need not have been locked, resulting in reduced
concurrency or even unexpected deadlocks. Also, late locking is only
possible if the row to be locked can be uniquely re-identified later.
Preferably the row identifier should identify a specific version of the
row, as <productname>PostgreSQL</> TIDs do.
</para>
<para>
By default, <productname>PostgreSQL</> ignores locking considerations
when interfacing to FDWs, but an FDW can perform early locking without
any explicit support from the core code. The API functions described
in <xref linkend="fdw-callbacks-row-locking">, which were added
in <productname>PostgreSQL</> 9.5, allow an FDW to use late locking if
it wishes.
</para>
<para>
An additional consideration is that in <literal>READ COMMITTED</>
isolation mode, <productname>PostgreSQL</> may need to re-check
restriction and join conditions against an updated version of some
target tuple. Rechecking join conditions requires re-obtaining copies
of the non-target rows that were previously joined to the target tuple.
When working with standard <productname>PostgreSQL</> tables, this is
done by including the TIDs of the non-target tables in the column list
projected through the join, and then re-fetching non-target rows when
required. This approach keeps the join data set compact, but it
requires inexpensive re-fetch capability, as well as a TID that can
uniquely identify the row version to be re-fetched. By default,
therefore, the approach used with foreign tables is to include a copy of
the entire row fetched from a foreign table in the column list projected
through the join. This puts no special demands on the FDW but can
result in reduced performance of merge and hash joins. An FDW that is
capable of meeting the re-fetch requirements can choose to do it the
first way.
</para>
<para>
For an <command>UPDATE</> or <command>DELETE</> on a foreign table, it
is recommended that the <literal>ForeignScan</> operation on the target
table perform early locking on the rows that it fetches, perhaps via the
equivalent of <command>SELECT FOR UPDATE</>. An FDW can detect whether
a table is an <command>UPDATE</>/<command>DELETE</> target at plan time
by comparing its relid to <literal>root-&gt;parse-&gt;resultRelation</>,
or at execution time by using <function>ExecRelationIsTargetRelation()</>.
An alternative possibility is to perform late locking within the
<function>ExecForeignUpdate</> or <function>ExecForeignDelete</>
callback, but no special support is provided for this.
</para>
<para>
For foreign tables that are specified to be locked by a <command>SELECT
FOR UPDATE/SHARE</> command, the <literal>ForeignScan</> operation can
again perform early locking by fetching tuples with the equivalent
of <command>SELECT FOR UPDATE/SHARE</>. To perform late locking
instead, provide the callback functions defined
in <xref linkend="fdw-callbacks-row-locking">.
In <function>GetForeignRowMarkType</>, select rowmark option
<literal>ROW_MARK_EXCLUSIVE</>, <literal>ROW_MARK_NOKEYEXCLUSIVE</>,
<literal>ROW_MARK_SHARE</>, or <literal>ROW_MARK_KEYSHARE</> depending
on the requested lock strength. (The core code will act the same
regardless of which of these four options you choose.)
Elsewhere, you can detect whether a foreign table was specified to be
locked by this type of command by using <function>get_plan_rowmark</> at
plan time, or <function>ExecFindRowMark</> at execution time; you must
check not only whether a non-null rowmark struct is returned, but that
its <structfield>strength</> field is not <literal>LCS_NONE</>.
</para>
<para>
Lastly, for foreign tables that are used in an <command>UPDATE</>,
<command>DELETE</> or <command>SELECT FOR UPDATE/SHARE</> command but
are not specified to be row-locked, you can override the default choice
to copy entire rows by having <function>GetForeignRowMarkType</> select
option <literal>ROW_MARK_REFERENCE</> when it sees lock strength
<literal>LCS_NONE</>. This will cause <function>RefetchForeignRow</> to
be called with that value for <structfield>markType</>; it should then
re-fetch the row without acquiring any new lock. (If you have
a <function>GetForeignRowMarkType</> function but don't wish to re-fetch
unlocked rows, select option <literal>ROW_MARK_COPY</>
for <literal>LCS_NONE</>.)
</para>
<para>
See <filename>src/include/nodes/lockoptions.h</>, the comments
for <type>RowMarkType</> and <type>PlanRowMark</>
in <filename>src/include/nodes/plannodes.h</>, and the comments for
<type>ExecRowMark</> in <filename>src/include/nodes/execnodes.h</> for
additional information.
</para>
</sect1>
</chapter>
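The rowmark selection rules documented above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the two enums are local stand-ins for the PostgreSQL definitions so that the fragment compiles on its own, the `demo_` functions are hypothetical names, and the real callback also receives a `RangeTblEntry *rte` argument that is omitted here. The second function mirrors the documented fallback to `ROW_MARK_COPY` when the callback pointer is `NULL`.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for PostgreSQL's enums, declared here only so the
 * sketch is self-contained. */
typedef enum
{
    LCS_NONE,
    LCS_FORKEYSHARE,
    LCS_FORSHARE,
    LCS_FORNOKEYUPDATE,
    LCS_FORUPDATE
} LockClauseStrength;

typedef enum
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

/* Hypothetical callback type; the real one also takes a RangeTblEntry. */
typedef RowMarkType (*DemoRowMarkCallback) (LockClauseStrength strength);

/* FDW side: request late locking for locked tables, and re-fetch by
 * reference (no lock) for tables that are not row-locked. */
RowMarkType
demo_get_row_mark_type(LockClauseStrength strength)
{
    switch (strength)
    {
        case LCS_NONE:
            return ROW_MARK_REFERENCE;
        case LCS_FORKEYSHARE:
            return ROW_MARK_KEYSHARE;
        case LCS_FORSHARE:
            return ROW_MARK_SHARE;
        case LCS_FORNOKEYUPDATE:
            return ROW_MARK_NOKEYEXCLUSIVE;
        case LCS_FORUPDATE:
            return ROW_MARK_EXCLUSIVE;
    }
    assert(0 && "unrecognized lock strength");
    return ROW_MARK_COPY;
}

/* Core side: let the FDW choose if it provides a callback, otherwise
 * fall back to copying the whole row. */
RowMarkType
demo_select_rowmark_type(DemoRowMarkCallback callback,
                         LockClauseStrength strength)
{
    if (callback != NULL)
        return callback(strength);
    return ROW_MARK_COPY;
}
```

An FDW that does not want to re-fetch unlocked rows would return `ROW_MARK_COPY` in the `LCS_NONE` branch instead.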
......@@ -898,8 +898,11 @@ InitPlan(QueryDesc *queryDesc, int eflags)
erm->prti = rc->prti;
erm->rowmarkId = rc->rowmarkId;
erm->markType = rc->markType;
erm->strength = rc->strength;
erm->waitPolicy = rc->waitPolicy;
erm->ermActive = false;
ItemPointerSetInvalid(&(erm->curCtid));
erm->ermExtra = NULL;
estate->es_rowMarks = lappend(estate->es_rowMarks, erm);
}
......@@ -1143,6 +1146,8 @@ CheckValidResultRel(Relation resultRel, CmdType operation)
static void
CheckValidRowMarkRel(Relation rel, RowMarkType markType)
{
FdwRoutine *fdwroutine;
switch (rel->rd_rel->relkind)
{
case RELKIND_RELATION:
......@@ -1178,11 +1183,13 @@ CheckValidRowMarkRel(Relation rel, RowMarkType markType)
RelationGetRelationName(rel))));
break;
case RELKIND_FOREIGN_TABLE:
/* Should not get here; planner should have used ROW_MARK_COPY */
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot lock rows in foreign table \"%s\"",
RelationGetRelationName(rel))));
/* Okay only if the FDW supports it */
fdwroutine = GetFdwRoutineForRelation(rel, false);
if (fdwroutine->RefetchForeignRow == NULL)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot lock rows in foreign table \"%s\"",
RelationGetRelationName(rel))));
break;
default:
ereport(ERROR,
......@@ -2005,9 +2012,11 @@ ExecUpdateLockMode(EState *estate, ResultRelInfo *relinfo)
/*
* ExecFindRowMark -- find the ExecRowMark struct for given rangetable index
*
* If no such struct, either return NULL or throw error depending on missing_ok
*/
ExecRowMark *
ExecFindRowMark(EState *estate, Index rti)
ExecFindRowMark(EState *estate, Index rti, bool missing_ok)
{
ListCell *lc;
......@@ -2018,8 +2027,9 @@ ExecFindRowMark(EState *estate, Index rti)
if (erm->rti == rti)
return erm;
}
elog(ERROR, "failed to find ExecRowMark for rangetable index %u", rti);
return NULL; /* keep compiler quiet */
if (!missing_ok)
elog(ERROR, "failed to find ExecRowMark for rangetable index %u", rti);
return NULL;
}
/*
......@@ -2530,7 +2540,7 @@ EvalPlanQualFetchRowMarks(EPQState *epqstate)
if (erm->markType == ROW_MARK_REFERENCE)
{
Buffer buffer;
HeapTuple copyTuple;
Assert(erm->relation != NULL);
......@@ -2541,17 +2551,50 @@ EvalPlanQualFetchRowMarks(EPQState *epqstate)
/* non-locked rels could be on the inside of outer joins */
if (isNull)
continue;
tuple.t_self = *((ItemPointer) DatumGetPointer(datum));
/* okay, fetch the tuple */
if (!heap_fetch(erm->relation, SnapshotAny, &tuple, &buffer,
false, NULL))
elog(ERROR, "failed to fetch tuple for EvalPlanQual recheck");
/* fetch requests on foreign tables must be passed to their FDW */
if (erm->relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
FdwRoutine *fdwroutine;
bool updated = false;
/* successful, copy and store tuple */
EvalPlanQualSetTuple(epqstate, erm->rti,
heap_copytuple(&tuple));
ReleaseBuffer(buffer);
fdwroutine = GetFdwRoutineForRelation(erm->relation, false);
/* this should have been checked already, but let's be safe */
if (fdwroutine->RefetchForeignRow == NULL)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot lock rows in foreign table \"%s\"",
RelationGetRelationName(erm->relation))));
copyTuple = fdwroutine->RefetchForeignRow(epqstate->estate,
erm,
datum,
&updated);
if (copyTuple == NULL)
elog(ERROR, "failed to fetch tuple for EvalPlanQual recheck");
/*
* Ideally we'd insist on updated == false here, but that
* assumes that FDWs can track that exactly, which they might
* not be able to. So just ignore the flag.
*/
}
else
{
/* ordinary table, fetch the tuple */
Buffer buffer;
tuple.t_self = *((ItemPointer) DatumGetPointer(datum));
if (!heap_fetch(erm->relation, SnapshotAny, &tuple, &buffer,
false, NULL))
elog(ERROR, "failed to fetch tuple for EvalPlanQual recheck");
/* successful, copy tuple */
copyTuple = heap_copytuple(&tuple);
ReleaseBuffer(buffer);
}
/* store tuple */
EvalPlanQualSetTuple(epqstate, erm->rti, copyTuple);
}
else
{
......
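The `missing_ok` convention that this file's diff adds to `ExecFindRowMark` can be shown in isolation. The fragment below is a simplified sketch, not the executor code: `DemoRowMark` and `demo_find_row_mark` are hypothetical names, a plain array replaces the `es_rowMarks` list, and `exit` stands in for `elog(ERROR, ...)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, pared-down stand-in for ExecRowMark. */
typedef struct DemoRowMark
{
    unsigned int rti;           /* range table index this mark belongs to */
} DemoRowMark;

/* Look up the mark for a range table index.  When nothing matches,
 * either return NULL (missing_ok) or fail hard (stand-in for elog). */
DemoRowMark *
demo_find_row_mark(DemoRowMark *marks, size_t nmarks,
                   unsigned int rti, int missing_ok)
{
    size_t      i;

    assert(marks != NULL || nmarks == 0);

    for (i = 0; i < nmarks; i++)
    {
        if (marks[i].rti == rti)
            return &marks[i];
    }
    if (!missing_ok)
    {
        fprintf(stderr,
                "failed to find row mark for rangetable index %u\n", rti);
        exit(EXIT_FAILURE);
    }
    return NULL;
}
```

Callers that merely probe for a mark, as `ExecOpenScanRelation` does in this commit, pass a true `missing_ok`; callers that require one to exist pass false and never need to test the result.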
......@@ -805,20 +805,11 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags)
lockmode = NoLock;
else
{
ListCell *l;
/* Keep this check in sync with InitPlan! */
ExecRowMark *erm = ExecFindRowMark(estate, scanrelid, true);
foreach(l, estate->es_rowMarks)
{
ExecRowMark *erm = lfirst(l);
/* Keep this check in sync with InitPlan! */
if (erm->rti == scanrelid &&
erm->relation != NULL)
{
lockmode = NoLock;
break;
}
}
if (erm != NULL && erm->relation != NULL)
lockmode = NoLock;
}
/* Open the relation and acquire lock as needed */
......
......@@ -25,6 +25,7 @@
#include "access/xact.h"
#include "executor/executor.h"
#include "executor/nodeLockRows.h"
#include "foreign/fdwapi.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"
#include "utils/tqual.h"
......@@ -40,7 +41,7 @@ ExecLockRows(LockRowsState *node)
TupleTableSlot *slot;
EState *estate;
PlanState *outerPlan;
bool epq_started;
bool epq_needed;
ListCell *lc;
/*
......@@ -58,15 +59,18 @@ lnext:
if (TupIsNull(slot))
return NULL;
/* We don't need EvalPlanQual unless we get updated tuple version(s) */
epq_needed = false;
/*
* Attempt to lock the source tuple(s). (Note we only have locking
* rowmarks in lr_arowMarks.)
*/
epq_started = false;
foreach(lc, node->lr_arowMarks)
{
ExecAuxRowMark *aerm = (ExecAuxRowMark *) lfirst(lc);
ExecRowMark *erm = aerm->rowmark;
HeapTuple *testTuple;
Datum datum;
bool isNull;
HeapTupleData tuple;
......@@ -77,8 +81,10 @@ lnext:
HeapTuple copyTuple;
/* clear any leftover test tuple for this rel */
if (node->lr_epqstate.estate != NULL)
EvalPlanQualSetTuple(&node->lr_epqstate, erm->rti, NULL);
testTuple = &(node->lr_curtuples[erm->rti - 1]);
if (*testTuple != NULL)
heap_freetuple(*testTuple);
*testTuple = NULL;
/* if child rel, must check whether it produced this row */
if (erm->rti != erm->prti)
......@@ -97,10 +103,12 @@ lnext:
if (tableoid != erm->relid)
{
/* this child is inactive right now */
erm->ermActive = false;
ItemPointerSetInvalid(&(erm->curCtid));
continue;
}
}
erm->ermActive = true;
/* fetch the tuple's ctid */
datum = ExecGetJunkAttribute(slot,
......@@ -109,9 +117,45 @@ lnext:
/* shouldn't ever get a null result... */
if (isNull)
elog(ERROR, "ctid is NULL");
tuple.t_self = *((ItemPointer) DatumGetPointer(datum));
/* requests for foreign tables must be passed to their FDW */
if (erm->relation->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
{
FdwRoutine *fdwroutine;
bool updated = false;
fdwroutine = GetFdwRoutineForRelation(erm->relation, false);
/* this should have been checked already, but let's be safe */
if (fdwroutine->RefetchForeignRow == NULL)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot lock rows in foreign table \"%s\"",
RelationGetRelationName(erm->relation))));
copyTuple = fdwroutine->RefetchForeignRow(estate,
erm,
datum,
&updated);
if (copyTuple == NULL)
{
/* couldn't get the lock, so skip this row */
goto lnext;
}
/* save locked tuple for possible EvalPlanQual testing below */
*testTuple = copyTuple;
/*
* if FDW says tuple was updated before getting locked, we need to
* perform EPQ testing to see if quals are still satisfied
*/
if (updated)
epq_needed = true;
continue;
}
/* okay, try to lock the tuple */
tuple.t_self = *((ItemPointer) DatumGetPointer(datum));
switch (erm->markType)
{
case ROW_MARK_EXCLUSIVE:
......@@ -191,40 +235,11 @@ lnext:
/* remember the actually locked tuple's TID */
tuple.t_self = copyTuple->t_self;
/*
* Need to run a recheck subquery. Initialize EPQ state if we
* didn't do so already.
*/
if (!epq_started)
{
ListCell *lc2;
/* Save locked tuple for EvalPlanQual testing below */
*testTuple = copyTuple;
EvalPlanQualBegin(&node->lr_epqstate, estate);
/*
* Ensure that rels with already-visited rowmarks are told
* not to return tuples during the first EPQ test. We can
* exit this loop once it reaches the current rowmark;
* rels appearing later in the list will be set up
* correctly by the EvalPlanQualSetTuple call at the top
* of the loop.
*/
foreach(lc2, node->lr_arowMarks)
{
ExecAuxRowMark *aerm2 = (ExecAuxRowMark *) lfirst(lc2);
if (lc2 == lc)
break;
EvalPlanQualSetTuple(&node->lr_epqstate,
aerm2->rowmark->rti,
NULL);
}
epq_started = true;
}
/* Store target tuple for relation's scan node */
EvalPlanQualSetTuple(&node->lr_epqstate, erm->rti, copyTuple);
/* Remember we need to do EPQ testing */
epq_needed = true;
/* Continue loop until we have all target tuples */
break;
......@@ -237,17 +252,35 @@ lnext:
test);
}
/* Remember locked tuple's TID for WHERE CURRENT OF */
/* Remember locked tuple's TID for EPQ testing and WHERE CURRENT OF */
erm->curCtid = tuple.t_self;
}
/*
* If we need to do EvalPlanQual testing, do so.
*/
if (epq_started)
if (epq_needed)
{
int i;
/* Initialize EPQ machinery */
EvalPlanQualBegin(&node->lr_epqstate, estate);
/*
* Transfer already-fetched tuples into the EPQ state, and make sure
* its test tuples for other tables are reset to NULL.
*/
for (i = 0; i < node->lr_ntables; i++)
{
EvalPlanQualSetTuple(&node->lr_epqstate,
i + 1,
node->lr_curtuples[i]);
/* freeing this tuple is now the responsibility of EPQ */
node->lr_curtuples[i] = NULL;
}
/*
* First, fetch a copy of any rows that were successfully locked
* Next, fetch a copy of any rows that were successfully locked
* without any update having occurred. (We do this in a separate pass
* so as to avoid overhead in the common case where there are no
* concurrent updates.)
......@@ -260,7 +293,7 @@ lnext:
Buffer buffer;
/* ignore non-active child tables */
if (!ItemPointerIsValid(&(erm->curCtid)))
if (!erm->ermActive)
{
Assert(erm->rti != erm->prti); /* check it's child table */
continue;
......@@ -269,6 +302,10 @@ lnext:
if (EvalPlanQualGetTuple(&node->lr_epqstate, erm->rti) != NULL)
continue; /* it was updated and fetched above */
/* foreign tables should have been fetched above */
Assert(erm->relation->rd_rel->relkind != RELKIND_FOREIGN_TABLE);
Assert(ItemPointerIsValid(&(erm->curCtid)));
/* okay, fetch the tuple */
tuple.t_self = erm->curCtid;
if (!heap_fetch(erm->relation, SnapshotAny, &tuple, &buffer,
......@@ -351,6 +388,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecAssignResultTypeFromTL(&lrstate->ps);
lrstate->ps.ps_ProjInfo = NULL;
/*
* Create workspace in which we can remember per-RTE locked tuples
*/
lrstate->lr_ntables = list_length(estate->es_range_table);
lrstate->lr_curtuples = (HeapTuple *)
palloc0(lrstate->lr_ntables * sizeof(HeapTuple));
/*
* Locate the ExecRowMark(s) that this node is responsible for, and
* construct ExecAuxRowMarks for them. (InitPlan should already have
......@@ -370,8 +414,11 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
if (rc->isParent)
continue;
/* safety check on size of lr_curtuples array */
Assert(rc->rti > 0 && rc->rti <= lrstate->lr_ntables);
/* find ExecRowMark and build ExecAuxRowMark */
erm = ExecFindRowMark(estate, rc->rti);
erm = ExecFindRowMark(estate, rc->rti, false);
aerm = ExecBuildAuxRowMark(erm, outerPlan->targetlist);
/*
......
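The way `ExecLockRows` consumes the `RefetchForeignRow` result in the diff above can be sketched with a toy in-memory store. Everything named `Demo` or `demo_` is hypothetical and exists only for illustration: rows carry a version counter in place of a real row identifier, a boolean models a lock held by another session, and `abort` stands in for raising an error. The sketch assumes the FDW can detect updates by comparing versions; an FDW that cannot would simply report `true`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef enum { DEMO_WAIT_ERROR, DEMO_WAIT_SKIP } DemoWaitPolicy;

typedef struct DemoRow
{
    int         version;            /* bumped on every update */
    int         value;
    bool        locked_by_other;    /* lock held by another session? */
} DemoRow;

typedef struct DemoTuple
{
    int         version;
    int         value;
} DemoTuple;

/* Hypothetical analogue of RefetchForeignRow: return a freshly allocated
 * copy of the row, or NULL only when the lock is unavailable and the
 * wait policy is "skip". */
DemoTuple *
demo_refetch_row(DemoRow *store, size_t rowid, int seen_version,
                 DemoWaitPolicy policy, bool *updated)
{
    DemoRow    *row = &store[rowid];
    DemoTuple  *copy;

    assert(updated != NULL);

    if (row->locked_by_other)
    {
        if (policy == DEMO_WAIT_SKIP)
            return NULL;
        abort();                    /* stand-in for raising an error */
    }

    copy = malloc(sizeof(DemoTuple));
    if (copy == NULL)
        abort();
    copy->version = row->version;
    copy->value = row->value;

    /* Report whether this is a newer version than the one seen earlier. */
    *updated = (row->version != seen_version);
    return copy;
}

/* Caller side, shaped like the foreign-table branch in ExecLockRows:
 * returns 1 if the row was locked, 0 if it must be skipped, and sets
 * *epq_needed when a recheck of the quals would be required. */
int
demo_lock_row(DemoRow *store, size_t rowid, int seen_version,
              DemoWaitPolicy policy, bool *epq_needed)
{
    bool        updated = false;
    DemoTuple  *tuple;

    tuple = demo_refetch_row(store, rowid, seen_version, policy, &updated);
    if (tuple == NULL)
        return 0;                   /* couldn't get the lock, skip row */
    if (updated)
        *epq_needed = true;
    free(tuple);                    /* the real code keeps it for EPQ */
    return 1;
}
```

The real executor keeps the returned tuple in `lr_curtuples` and hands it to the EvalPlanQual machinery only if some row reported an update, which is why the flag is accumulated rather than acted on immediately.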
......@@ -1720,7 +1720,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
continue;
/* find ExecRowMark (same for all subplans) */
erm = ExecFindRowMark(estate, rc->rti);
erm = ExecFindRowMark(estate, rc->rti, false);
/* build ExecAuxRowMark for each subplan */
for (i = 0; i < nplans; i++)
......
......@@ -20,6 +20,7 @@
#include "access/htup_details.h"
#include "executor/executor.h"
#include "executor/nodeAgg.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
#ifdef OPTIMIZER_DEBUG
......@@ -2324,7 +2325,12 @@ select_rowmark_type(RangeTblEntry *rte, LockClauseStrength strength)
}
else if (rte->relkind == RELKIND_FOREIGN_TABLE)
{
/* For now, we force all foreign tables to use ROW_MARK_COPY */
/* Let the FDW select the rowmark type, if it wants to */
FdwRoutine *fdwroutine = GetFdwRoutineByRelId(rte->relid);
if (fdwroutine->GetForeignRowMarkType != NULL)
return fdwroutine->GetForeignRowMarkType(rte, strength);
/* Otherwise, use ROW_MARK_COPY by default */
return ROW_MARK_COPY;
}
else
......
......@@ -196,7 +196,7 @@ extern void ExecConstraints(ResultRelInfo *resultRelInfo,
extern void ExecWithCheckOptions(WCOKind kind, ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
extern LockTupleMode ExecUpdateLockMode(EState *estate, ResultRelInfo *relinfo);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti, bool missing_ok);
extern ExecAuxRowMark *ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist);
extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
Relation relation, Index rti, int lockmode,
......
......@@ -89,6 +89,14 @@ typedef void (*EndForeignModify_function) (EState *estate,
typedef int (*IsForeignRelUpdatable_function) (Relation rel);
typedef RowMarkType (*GetForeignRowMarkType_function) (RangeTblEntry *rte,
LockClauseStrength strength);
typedef HeapTuple (*RefetchForeignRow_function) (EState *estate,
ExecRowMark *erm,
Datum rowid,
bool *updated);
typedef void (*ExplainForeignScan_function) (ForeignScanState *node,
struct ExplainState *es);
......@@ -151,6 +159,10 @@ typedef struct FdwRoutine
EndForeignModify_function EndForeignModify;
IsForeignRelUpdatable_function IsForeignRelUpdatable;
/* Functions for SELECT FOR UPDATE/SHARE row locking */
GetForeignRowMarkType_function GetForeignRowMarkType;
RefetchForeignRow_function RefetchForeignRow;
/* Support functions for EXPLAIN */
ExplainForeignScan_function ExplainForeignScan;
ExplainForeignModify_function ExplainForeignModify;
......
......@@ -429,8 +429,11 @@ typedef struct EState
* parent RTEs, which can be ignored at runtime). Virtual relations such as
* subqueries-in-FROM will have an ExecRowMark with relation == NULL. See
* PlanRowMark for details about most of the fields. In addition to fields
* directly derived from PlanRowMark, we store curCtid, which is used by the
* WHERE CURRENT OF code.
* directly derived from PlanRowMark, we store an activity flag (to denote
* inactive children of inheritance trees), curCtid, which is used by the
* WHERE CURRENT OF code, and ermExtra, which is available for use by the plan
* node that sources the relation (e.g., for a foreign table the FDW can use
* ermExtra to hold information).
*
* EState->es_rowMarks is a list of these structs.
*/
......@@ -442,8 +445,11 @@ typedef struct ExecRowMark
Index prti; /* parent range table index, if child */
Index rowmarkId; /* unique identifier for resjunk columns */
RowMarkType markType; /* see enum in nodes/plannodes.h */
LockClauseStrength strength; /* LockingClause's strength, or LCS_NONE */
LockWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED */
bool ermActive; /* is this mark relevant for current tuple? */
ItemPointerData curCtid; /* ctid of currently locked tuple, if any */
void *ermExtra; /* available for use by relation source node */
} ExecRowMark;
/*
......@@ -1921,6 +1927,8 @@ typedef struct LockRowsState
PlanState ps; /* its first field is NodeTag */
List *lr_arowMarks; /* List of ExecAuxRowMarks */
EPQState lr_epqstate; /* for evaluating EvalPlanQual rechecks */
HeapTuple *lr_curtuples; /* locked tuples (one entry per RT entry) */
int lr_ntables; /* length of lr_curtuples[] array */
} LockRowsState;
/* ----------------
......
......@@ -822,16 +822,16 @@ typedef struct Limit
*
* The first four of these values represent different lock strengths that
* we can take on tuples according to SELECT FOR [KEY] UPDATE/SHARE requests.
* We only support these on regular tables. For foreign tables, any locking
* that might be done for these requests must happen during the initial row
* fetch; there is no mechanism for going back to lock a row later (and thus
* no need for EvalPlanQual machinery during updates of foreign tables).
* We support these on regular tables, as well as on foreign tables whose FDWs
* report support for late locking. For other foreign tables, any locking
* that might be done for such requests must happen during the initial row
* fetch; their FDWs provide no mechanism for going back to lock a row later.
* This means that the semantics will be a bit different than for a local
* table; in particular we are likely to lock more rows than would be locked
* locally, since remote rows will be locked even if they then fail
* locally-checked restriction or join quals. However, the alternative of
* doing a separate remote query to lock each selected row is extremely
* unappealing, so let's do it like this for now.
* locally-checked restriction or join quals. However, the prospect of
* doing a separate remote query to lock each selected row is usually pretty
* unappealing, so early locking remains a credible design choice for FDWs.
*
* When doing UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, we have to uniquely
* identify all the source rows, not only those from the target relations, so
......@@ -840,12 +840,11 @@ typedef struct Limit
* represented by ROW_MARK_REFERENCE. Otherwise (for example for VALUES or
* FUNCTION scans) we have to copy the whole row value. ROW_MARK_COPY is
* pretty inefficient, since most of the time we'll never need the data; but
* fortunately the case is not performance-critical in practice. Note that
* we use ROW_MARK_COPY for non-target foreign tables, even if the FDW has a
* concept of rowid and so could theoretically support some form of
* ROW_MARK_REFERENCE. Although copying the whole row value is inefficient,
* it's probably still faster than doing a second remote fetch, so it doesn't
* seem worth the extra complexity to permit ROW_MARK_REFERENCE.
* fortunately the overhead is usually not performance-critical in practice.
* By default we use ROW_MARK_COPY for foreign tables, but if the FDW has
* a concept of rowid it can request to use ROW_MARK_REFERENCE instead.
* (Again, this probably doesn't make sense if a physical remote fetch is
* needed, but for FDWs that map to local storage it might be credible.)
*/
typedef enum RowMarkType
{
......@@ -866,7 +865,7 @@ typedef enum RowMarkType
* When doing UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, we create a separate
* PlanRowMark node for each non-target relation in the query. Relations that
* are not specified as FOR UPDATE/SHARE are marked ROW_MARK_REFERENCE (if
* regular tables) or ROW_MARK_COPY (if not).
* regular tables or supported foreign tables) or ROW_MARK_COPY (if not).
*
* Initially all PlanRowMarks have rti == prti and isParent == false.
* When the planner discovers that a relation is the root of an inheritance
......@@ -879,8 +878,8 @@ typedef enum RowMarkType
* to use different markTypes).
*
* The planner also adds resjunk output columns to the plan that carry
* information sufficient to identify the locked or fetched rows. For
* regular tables (markType != ROW_MARK_COPY), these columns are named
* information sufficient to identify the locked or fetched rows. When
* markType != ROW_MARK_COPY, these columns are named
* tableoid%u OID of table
* ctid%u TID of row
* The tableoid column is only present for an inheritance hierarchy.
......