Invent a "one-shot" variant of CachedPlans for better performance.

SPI_execute() and related functions create a CachedPlan, execute it once, and immediately discard it, so that the functionality offered by plancache.c is of no value in this code path. And performance measurements show that the extra data copying and invalidation checking done by plancache.c slow down simple queries by 10% or more compared to 9.1. However, enough of the SPI code is shared with functions that do need plan caching that it seems impractical to bypass plancache.c altogether. Instead, let's invent a variant version of cached plans that preserves 99% of the API but doesn't offer any of the actual functionality, nor the overhead. This puts SPI_execute() performance back on par with, or maybe even slightly better than, what it was before. This change should resolve recent complaints of performance degradation from Dong Ye, Pavel Stehule, and others.

By avoiding data copying, this change also reduces the amount of memory needed to execute many-statement SPI_execute() strings, as for instance in a recent complaint from Tomas Vondra.

An additional benefit of this change is that multi-statement SPI_execute() query strings are now processed fully serially; that is, we complete execution of earlier statements before running parse analysis and planning on following ones. This eliminates a long-standing POLA (principle of least astonishment) violation, in that DDL that affects the behavior of a later statement will now behave as expected.

Back-patch to 9.2, since this was a performance regression compared to 9.1. (In 9.2, place the added struct fields so as to avoid changing the offsets of existing fields.)

Heikki Linnakangas and Tom Lane
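The difference between analyzing a whole query string up front and processing it fully serially can be illustrated with a toy model. The sketch below is not PostgreSQL code: ToyStmt, ToyCatalog, toy_analyze(), toy_execute(), and the two run_* functions are hypothetical names invented for this illustration, and "parse analysis" is reduced to checking that a referenced table already exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_TABLES 8

/* Toy statement: either creates a table or reads from one. */
typedef enum { STMT_CREATE, STMT_SELECT } StmtKind;

typedef struct
{
    StmtKind    kind;
    const char *table;
} ToyStmt;

/* Toy catalog: just a list of known table names. */
typedef struct
{
    const char *tables[TOY_MAX_TABLES];
    int         ntables;
} ToyCatalog;

bool
catalog_has(const ToyCatalog *cat, const char *name)
{
    for (int i = 0; i < cat->ntables; i++)
        if (strcmp(cat->tables[i], name) == 0)
            return true;
    return false;
}

/* Stand-in for parse analysis: a SELECT needs its table to exist already. */
bool
toy_analyze(const ToyCatalog *cat, const ToyStmt *stmt)
{
    return stmt->kind == STMT_CREATE || catalog_has(cat, stmt->table);
}

/* Stand-in for execution: CREATE registers the table, SELECT is a no-op. */
void
toy_execute(ToyCatalog *cat, const ToyStmt *stmt)
{
    assert(cat->ntables <= TOY_MAX_TABLES);
    if (stmt->kind == STMT_CREATE && cat->ntables < TOY_MAX_TABLES)
        cat->tables[cat->ntables++] = stmt->table;
}

/* Model of the old behavior: analyze every statement, then execute them. */
bool
run_analyze_all_first(ToyCatalog *cat, const ToyStmt *stmts, int n)
{
    for (int i = 0; i < n; i++)
        if (!toy_analyze(cat, &stmts[i]))
            return false;       /* fails before anything is executed */
    for (int i = 0; i < n; i++)
        toy_execute(cat, &stmts[i]);
    return true;
}

/* Model of the new behavior: analyze each statement just before running it. */
bool
run_serially(ToyCatalog *cat, const ToyStmt *stmts, int n)
{
    for (int i = 0; i < n; i++)
    {
        if (!toy_analyze(cat, &stmts[i]))
            return false;
        toy_execute(cat, &stmts[i]);
    }
    return true;
}
```

In this model, a two-statement sequence that creates table foo and then selects from it is rejected by run_analyze_all_first() on an empty catalog, because the SELECT is analyzed before the CREATE has run, while run_serially() accepts it.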

Commit 94afbd58 authored Jan 04, 2013 by Tom Lane
5 changed files
--- a/doc/src/sgml/spi.sgml
+++ b/doc/src/sgml/spi.sgml
@@ -326,9 +326,7 @@ SPI_execute("INSERT INTO foo SELECT * FROM bar", false, 5);
  </para>
  <para>
-   You can pass multiple commands in one string, but later commands cannot
+   You can pass multiple commands in one string;
-   depend on the creation of objects earlier in the string, because the
-   whole string will be parsed and planned before execution begins.
   <function>SPI_execute</function> returns the
   result for the command executed last.  The <parameter>count</parameter>
   limit applies to each command separately, but it is not applied to
@@ -395,7 +393,8 @@ typedef struct
    TupleDesc   tupdesc;        /* row descriptor */
    HeapTuple  *vals;           /* rows */
 } SPITupleTable;
-</programlisting><structfield>vals</> is an array of pointers to rows.  (The number
+</programlisting>
+   <structfield>vals</> is an array of pointers to rows.  (The number
   of valid entries is given by <varname>SPI_processed</varname>.)
   <structfield>tupdesc</> is a row descriptor which you can pass to
   SPI functions dealing with rows.  <structfield>tuptabcxt</>,
@@ -435,7 +434,8 @@ typedef struct
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
@@ -674,7 +674,8 @@ int SPI_exec(const char * <parameter>command</parameter>, long <parameter>count<
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
@@ -812,7 +813,8 @@ int SPI_execute_with_args(const char *<parameter>command</parameter>,
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
@@ -1455,7 +1457,8 @@ int SPI_execute_plan(SPIPlanPtr <parameter>plan</parameter>, Datum * <parameter>
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
@@ -1572,7 +1575,8 @@ int SPI_execute_plan_with_paramlist(SPIPlanPtr <parameter>plan</parameter>,
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
@@ -1672,7 +1676,8 @@ int SPI_execp(SPIPlanPtr <parameter>plan</parameter>, Datum * <parameter>values<
    <term><literal>long <parameter>count</parameter></literal></term>
    <listitem>
     <para>
-      maximum number of rows to process or return
+      maximum number of rows to process or return,
+      or <literal>0</> for no limit
     </para>
    </listitem>
   </varlistentry>
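The count semantics documented above (a row limit, with 0 meaning no limit, applied to each command separately) can be modeled in a few lines. This is a hedged sketch, not the SPI implementation: toy_rows_processed() is a hypothetical helper, and it only considers non-negative values.

```c
#include <assert.h>

/*
 * Hypothetical model: given how many rows a single command could process
 * ("available") and the caller's count argument, return how many rows the
 * command actually processes.  A count of 0 means "no limit".
 */
long
toy_rows_processed(long available, long count)
{
    assert(available >= 0 && count >= 0);   /* model covers only these */
    if (count == 0 || available < count)
        return available;
    return count;
}
```

Under this model, a command that could touch 12 rows processes 5 of them when count is 5 and all 12 when count is 0. Because the limit applies per command, each command in a multi-command string would be passed through the helper independently.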

--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -49,8 +49,9 @@ static int	_SPI_curid = -1;
 static Portal SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
 						 ParamListInfo paramLI, bool read_only);
-static void _SPI_prepare_plan(const char *src, SPIPlanPtr plan,
+static void _SPI_prepare_plan(const char *src, SPIPlanPtr plan);
-				  ParamListInfo boundParams);
+static void _SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan);
 static int _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 				  Snapshot snapshot, Snapshot crosscheck_snapshot,
@@ -355,7 +356,7 @@ SPI_execute(const char *src, bool read_only, long tcount)
 	plan.magic = _SPI_PLAN_MAGIC;
 	plan.cursor_options = 0;
-	_SPI_prepare_plan(src, &plan, NULL);
+	_SPI_prepare_oneshot_plan(src, &plan);
 	res = _SPI_execute_plan(&plan, NULL,
 							InvalidSnapshot, InvalidSnapshot,
@@ -506,7 +507,7 @@ SPI_execute_with_args(const char *src,
 	paramLI = _SPI_convert_params(nargs, argtypes,
 								  Values, Nulls);
-	_SPI_prepare_plan(src, &plan, paramLI);
+	_SPI_prepare_oneshot_plan(src, &plan);
 	res = _SPI_execute_plan(&plan, paramLI,
 							InvalidSnapshot, InvalidSnapshot,
@@ -547,7 +548,7 @@ SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes,
 	plan.parserSetup = NULL;
 	plan.parserSetupArg = NULL;
-	_SPI_prepare_plan(src, &plan, NULL);
+	_SPI_prepare_plan(src, &plan);
 	/* copy plan to procedure context */
 	result = _SPI_make_plan_non_temp(&plan);
@@ -584,7 +585,7 @@ SPI_prepare_params(const char *src,
 	plan.parserSetup = parserSetup;
 	plan.parserSetupArg = parserSetupArg;
-	_SPI_prepare_plan(src, &plan, NULL);
+	_SPI_prepare_plan(src, &plan);
 	/* copy plan to procedure context */
 	result = _SPI_make_plan_non_temp(&plan);
@@ -599,7 +600,8 @@ SPI_keepplan(SPIPlanPtr plan)
 {
 	ListCell   *lc;
-	if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || plan->saved)
+	if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC ||
+		plan->saved || plan->oneshot)
 		return SPI_ERROR_ARGUMENT;
 	/*
@@ -1083,7 +1085,7 @@ SPI_cursor_open_with_args(const char *name,
 	paramLI = _SPI_convert_params(nargs, argtypes,
 								  Values, Nulls);
-	_SPI_prepare_plan(src, &plan, paramLI);
+	_SPI_prepare_plan(src, &plan);
 	/* We needn't copy the plan; SPI_cursor_open_internal will do so */
@@ -1645,10 +1647,6 @@ spi_printtup(TupleTableSlot *slot, DestReceiver *self)
 *
 * At entry, plan->argtypes and plan->nargs (or alternatively plan->parserSetup
 * and plan->parserSetupArg) must be valid, as must plan->cursor_options.
- * If boundParams isn't NULL then it represents parameter values that are made
- * available to the planner (as either estimates or hard values depending on
- * their PARAM_FLAG_CONST marking).  The boundParams had better match the
- * param type information embedded in the plan!
 *
 * Results are stored into *plan (specifically, plan->plancache_list).
 * Note that the result data is all in CurrentMemoryContext or child contexts
@@ -1657,13 +1655,12 @@ spi_printtup(TupleTableSlot *slot, DestReceiver *self)
 * parsing is also left in CurrentMemoryContext.
 */
 static void
-_SPI_prepare_plan(const char *src, SPIPlanPtr plan, ParamListInfo boundParams)
+_SPI_prepare_plan(const char *src, SPIPlanPtr plan)
 {
 	List	   *raw_parsetree_list;
 	List	   *plancache_list;
 	ListCell   *list_item;
 	ErrorContextCallback spierrcontext;
-	int			cursor_options = plan->cursor_options;
 	/*
 	 * Setup error traceback support for ereport()
@@ -1726,13 +1723,80 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan, ParamListInfo boundParams)
 						   plan->nargs,
 						   plan->parserSetup,
 						   plan->parserSetupArg,
-						   cursor_options,
+						   plan->cursor_options,
 						   false);		/* not fixed result */
 		plancache_list = lappend(plancache_list, plansource);
 	}
 	plan->plancache_list = plancache_list;
+	plan->oneshot = false;
+	/*
+	 * Pop the error context stack
+	 */
+	error_context_stack = spierrcontext.previous;
+}
+/*
+ * Parse, but don't analyze, a querystring.
+ *
+ * This is a stripped-down version of _SPI_prepare_plan that only does the
+ * initial raw parsing.  It creates "one shot" CachedPlanSources
+ * that still require parse analysis before execution is possible.
+ *
+ * The advantage of using the "one shot" form of CachedPlanSource is that
+ * we eliminate data copying and invalidation overhead.  Postponing parse
+ * analysis also prevents issues if some of the raw parsetrees are DDL
+ * commands that affect validity of later parsetrees.  Both of these
+ * attributes are good things for SPI_execute() and similar cases.
+ *
+ * Results are stored into *plan (specifically, plan->plancache_list).
+ * Note that the result data is all in CurrentMemoryContext or child contexts
+ * thereof; in practice this means it is in the SPI executor context, and
+ * what we are creating is a "temporary" SPIPlan.  Cruft generated during
+ * parsing is also left in CurrentMemoryContext.
+ */
+static void
+_SPI_prepare_oneshot_plan(const char *src, SPIPlanPtr plan)
+{
+	List	   *raw_parsetree_list;
+	List	   *plancache_list;
+	ListCell   *list_item;
+	ErrorContextCallback spierrcontext;
+	/*
+	 * Setup error traceback support for ereport()
+	 */
+	spierrcontext.callback = _SPI_error_callback;
+	spierrcontext.arg = (void *) src;
+	spierrcontext.previous = error_context_stack;
+	error_context_stack = &spierrcontext;
+	/*
+	 * Parse the request string into a list of raw parse trees.
+	 */
+	raw_parsetree_list = pg_parse_query(src);
+	/*
+	 * Construct plancache entries, but don't do parse analysis yet.
+	 */
+	plancache_list = NIL;
+	foreach(list_item, raw_parsetree_list)
+	{
+		Node	   *parsetree = (Node *) lfirst(list_item);
+		CachedPlanSource *plansource;
+		plansource = CreateOneShotCachedPlan(parsetree,
+											 src,
+											 CreateCommandTag(parsetree));
+		plancache_list = lappend(plancache_list, plansource);
+	}
+	plan->plancache_list = plancache_list;
+	plan->oneshot = true;
 	/*
 	 * Pop the error context stack
@@ -1770,7 +1834,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 	 * Setup error traceback support for ereport()
 	 */
 	spierrcontext.callback = _SPI_error_callback;
-	spierrcontext.arg = NULL;
+	spierrcontext.arg = NULL;	/* we'll fill this below */
 	spierrcontext.previous = error_context_stack;
 	error_context_stack = &spierrcontext;
@@ -1816,6 +1880,47 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 		spierrcontext.arg = (void *) plansource->query_string;
+		/*
+		 * If this is a one-shot plan, we still need to do parse analysis.
+		 */
+		if (plan->oneshot)
+		{
+			Node	   *parsetree = plansource->raw_parse_tree;
+			const char *src = plansource->query_string;
+			List	   *stmt_list;
+			/*
+			 * Parameter datatypes are driven by parserSetup hook if provided,
+			 * otherwise we use the fixed parameter list.
+			 */
+			if (plan->parserSetup != NULL)
+			{
+				Assert(plan->nargs == 0);
+				stmt_list = pg_analyze_and_rewrite_params(parsetree,
+														  src,
+														  plan->parserSetup,
+														  plan->parserSetupArg);
+			}
+			else
+			{
+				stmt_list = pg_analyze_and_rewrite(parsetree,
+												   src,
+												   plan->argtypes,
+												   plan->nargs);
+			}
+			/* Finish filling in the CachedPlanSource */
+			CompleteCachedPlan(plansource,
+							   stmt_list,
+							   NULL,
+							   plan->argtypes,
+							   plan->nargs,
+							   plan->parserSetup,
+							   plan->parserSetupArg,
+							   plan->cursor_options,
+							   false);		/* not fixed result */
+		}
 		/*
 		 * Replan if needed, and increment plan refcount.  If it's a saved
 		 * plan, the refcount must be backed by the CurrentResourceOwner.
@@ -2313,6 +2418,8 @@ _SPI_make_plan_non_temp(SPIPlanPtr plan)
 	/* Assert the input is a temporary SPIPlan */
 	Assert(plan->magic == _SPI_PLAN_MAGIC);
 	Assert(plan->plancxt == NULL);
+	/* One-shot plans can't be saved */
+	Assert(!plan->oneshot);
 	/*
 	 * Create a memory context for the plan, underneath the procedure context.
@@ -2330,6 +2437,7 @@ _SPI_make_plan_non_temp(SPIPlanPtr plan)
 	newplan = (SPIPlanPtr) palloc(sizeof(_SPI_plan));
 	newplan->magic = _SPI_PLAN_MAGIC;
 	newplan->saved = false;
+	newplan->oneshot = false;
 	newplan->plancache_list = NIL;
 	newplan->plancxt = plancxt;
 	newplan->cursor_options = plan->cursor_options;
@@ -2379,6 +2487,9 @@ _SPI_save_plan(SPIPlanPtr plan)
 	MemoryContext oldcxt;
 	ListCell   *lc;
+	/* One-shot plans can't be saved */
+	Assert(!plan->oneshot);
 	/*
 	 * Create a memory context for the plan.  We don't expect the plan to be
 	 * very large, so use smaller-than-default alloc parameters.  It's a
@@ -2395,6 +2506,7 @@ _SPI_save_plan(SPIPlanPtr plan)
 	newplan = (SPIPlanPtr) palloc(sizeof(_SPI_plan));
 	newplan->magic = _SPI_PLAN_MAGIC;
 	newplan->saved = false;
+	newplan->oneshot = false;
 	newplan->plancache_list = NIL;
 	newplan->plancxt = plancxt;
 	newplan->cursor_options = plan->cursor_options;
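The new guard in SPI_keepplan() rejects one-shot plans in the same way as already-saved ones. The following standalone sketch mirrors that condition with stand-in types so it can be compiled outside the backend. ToyPlan, TOY_PLAN_MAGIC, TOY_ERROR_ARGUMENT, and toy_keepplan() are hypothetical names, the constant values are arbitrary, and the "keep" step is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PLAN_MAGIC      0x5150      /* arbitrary value for this sketch */
#define TOY_ERROR_ARGUMENT  (-1)        /* stand-in for SPI_ERROR_ARGUMENT */

/* Reduced stand-in for the plan struct: only the fields the guard reads. */
typedef struct
{
    int         magic;
    bool        saved;
    bool        oneshot;
} ToyPlan;

/*
 * Model of the argument check: a plan can be kept only if it is a valid,
 * not-yet-saved, non-one-shot plan.
 */
int
toy_keepplan(ToyPlan *plan)
{
    if (plan == NULL || plan->magic != TOY_PLAN_MAGIC ||
        plan->saved || plan->oneshot)
        return TOY_ERROR_ARGUMENT;

    plan->saved = true;         /* the real code also reparents contexts */
    return 0;
}
```

In this model a plan with oneshot set is always rejected and left unmodified, while an ordinary unsaved plan is accepted once and rejected on a second attempt because it is then marked saved.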

--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
--- a/src/include/executor/spi_priv.h
+++ b/src/include/executor/spi_priv.h
@@ -59,6 +59,12 @@ typedef struct
 * while additional data such as argtypes and list cells is loose in the SPI
 * executor context.  Such plans can be identified by having plancxt == NULL.
 *
+ * We can also have "one-shot" SPI plans (which are typically temporary,
+ * as described above).  These are meant to be executed once and discarded,
+ * and various optimizations are made on the assumption of single use.
+ * Note in particular that the CachedPlanSources within such an SPI plan
+ * are not "complete" until execution.
+ *
 * Note: if the original query string contained only whitespace and comments,
 * the plancache_list will be NIL and so there is no place to store the
 * query string.  We don't care about that, but we do care about the
@@ -68,6 +74,7 @@ typedef struct _SPI_plan
 {
 	int			magic;			/* should equal _SPI_PLAN_MAGIC */
 	bool		saved;			/* saved or unsaved plan? */
+	bool		oneshot;		/* one-shot plan? */
 	List	   *plancache_list; /* one CachedPlanSource per parsetree */
 	MemoryContext plancxt;		/* Context containing _SPI_plan and data */
 	int			cursor_options; /* Cursor options used for planning */

--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -60,6 +60,14 @@
 * context that holds the rewritten query tree and associated data.  This
 * allows the query tree to be discarded easily when it is invalidated.
 *
+ * Some callers wish to use the CachedPlan API even with one-shot queries
+ * that have no reason to be saved at all.  We therefore support a "oneshot"
+ * variant that does no data copying or invalidation checking.  In this case
+ * there are no separate memory contexts: the CachedPlanSource struct and
+ * all subsidiary data live in the caller's CurrentMemoryContext, and there
+ * is no way to free memory short of clearing that entire context.  A oneshot
+ * plan is always treated as unsaved.
+ *
 * Note: the string referenced by commandTag is not subsidiary storage;
 * it is assumed to be a compile-time-constant string.	As with portals,
 * commandTag shall be NULL if and only if the original query string (before
@@ -69,7 +77,7 @@ typedef struct CachedPlanSource
 {
 	int			magic;			/* should equal CACHEDPLANSOURCE_MAGIC */
 	Node	   *raw_parse_tree; /* output of raw_parser() */
-	char	   *query_string;	/* source text of query */
+	const char *query_string;	/* source text of query */
 	const char *commandTag;		/* command tag (a constant!), or NULL */
 	Oid		   *param_types;	/* array of parameter type OIDs, or NULL */
 	int			num_params;		/* length of param_types array */
@@ -88,6 +96,7 @@ typedef struct CachedPlanSource
 	/* If we have a generic plan, this is a reference-counted link to it: */
 	struct CachedPlan *gplan;	/* generic plan, or NULL if not valid */
 	/* Some state flags: */
+	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_complete;	/* has CompleteCachedPlan been done? */
 	bool		is_saved;		/* has CachedPlanSource been "saved"? */
 	bool		is_valid;		/* is the query_list currently valid? */
@@ -106,13 +115,16 @@ typedef struct CachedPlanSource
 * (if any), and any active plan executions, so the plan can be discarded
 * exactly when refcount goes to zero.	Both the struct itself and the
 * subsidiary data live in the context denoted by the context field.
- * This makes it easy to free a no-longer-needed cached plan.
+ * This makes it easy to free a no-longer-needed cached plan.  (However,
+ * if is_oneshot is true, the context does not belong solely to the CachedPlan
+ * so no freeing is possible.)
 */
 typedef struct CachedPlan
 {
 	int			magic;			/* should equal CACHEDPLAN_MAGIC */
 	List	   *stmt_list;		/* list of statement nodes (PlannedStmts and
 								 * bare utility statements) */
+	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
 	TransactionId saved_xmin;	/* if valid, replan when TransactionXmin
@@ -129,6 +141,9 @@ extern void ResetPlanCache(void);
 extern CachedPlanSource *CreateCachedPlan(Node *raw_parse_tree,
 				 const char *query_string,
 				 const char *commandTag);
+extern CachedPlanSource *CreateOneShotCachedPlan(Node *raw_parse_tree,
+				 const char *query_string,
+				 const char *commandTag);
 extern void CompleteCachedPlan(CachedPlanSource *plansource,
 				   List *querytree_list,
 				   MemoryContext querytree_context,
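The header comment above says that a oneshot plan source and its subsidiary data live in the caller's memory context, with no way to free them short of clearing that whole context, and that no data copying is done. A small bump allocator gives a rough feel for that trade-off. This is only an analogy under stated assumptions: ToyArena, ToySource, and the toy_* functions are hypothetical, and PostgreSQL's real memory contexts are considerably more capable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ARENA_SIZE 1024

/* Toy "memory context": allocations are carved off a single buffer. */
typedef struct
{
    union
    {
        unsigned char bytes[TOY_ARENA_SIZE];
        long double force_align;    /* keeps bytes[] suitably aligned */
    }           storage;
    size_t      used;
} ToyArena;

/* Toy plan source: keeps a pointer to the caller's string, no copy. */
typedef struct
{
    const char *query_string;
    bool        is_oneshot;
    bool        is_complete;
} ToySource;

/* Allocate from the arena; there is deliberately no per-object free. */
void *
toy_arena_alloc(ToyArena *arena, size_t size)
{
    const size_t align = sizeof(long double);
    size_t      rounded;
    void       *p;

    if (size > TOY_ARENA_SIZE)
        return NULL;
    rounded = (size + align - 1) / align * align;
    if (rounded > TOY_ARENA_SIZE - arena->used)
        return NULL;
    p = arena->storage.bytes + arena->used;
    arena->used += rounded;
    return p;
}

/* The only way to reclaim memory: discard everything in the arena. */
void
toy_arena_reset(ToyArena *arena)
{
    arena->used = 0;
}

/* Create a one-shot source in the caller's arena without copying the query. */
ToySource *
toy_create_oneshot(ToyArena *arena, const char *query_string)
{
    ToySource  *src = toy_arena_alloc(arena, sizeof(ToySource));

    if (src == NULL)
        return NULL;
    src->query_string = query_string;   /* pointer is shared, not copied */
    src->is_oneshot = true;
    src->is_complete = false;           /* completed later, at execution */
    return src;
}
```

The design choice being modeled is that skipping per-plan contexts and copies makes creation cheap, at the cost that the object lives exactly as long as the surrounding arena and the caller's query string must stay valid for that time.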