Commit 499be013 authored by Alvaro Herrera

Support partition pruning at execution time

Existing partition pruning works only at plan time, for query quals
that appear in the parsed query.  This is good but limiting, as
parameters whose values become known only later can also be used to
prune further partitions.

This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:

1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.

2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.

Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key, we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans, meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.
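
As a rough illustration of the initialization-time case, the sketch below
compacts a set of planned subplans down to the survivors and derives the
number that EXPLAIN would report as "Subplans Removed".  The function names
and the 32-subplan bitmask are assumptions made for this example only; they
are not the executor's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the planner's nplans subplans and a bitmask of
 * the ones that survived init-time pruning, record the surviving indexes in
 * 'kept' and return how many there are (the executor's "nsubnodes").
 * Assumes nplans <= 32 so that the set fits in one uint32_t.
 */
static int
sketch_init_prune(int nplans, uint32_t survivors, int *kept)
{
    int         nsubnodes = 0;

    for (int i = 0; i < nplans; i++)
    {
        if (survivors & (UINT32_C(1) << i))
            kept[nsubnodes++] = i;
    }
    return nsubnodes;
}

/*
 * Mirrors the "nsubnodes < nplans" test that this commit adds to
 * ExplainMemberNodes: returns the value shown as "Subplans Removed",
 * or 0 when nothing was pruned (in which case no property is printed).
 */
static int
sketch_subplans_removed(int nsubnodes, int nplans)
{
    return (nsubnodes < nplans) ? nplans - nsubnodes : 0;
}
```

With four planned subplans of which only the first and third survive,
sketch_init_prune reports two subnodes and sketch_subplans_removed reports
two removed subplans.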

For parameters whose values are only known during actual execution, the
pruning of these subplans must wait.  Subplans which are eliminated
during this stage of pruning are still visible in the EXPLAIN output.
In order to determine if pruning has actually taken place, the EXPLAIN
ANALYZE output must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition, then the execution
timing area will state "(never executed)".  In other cases, for example
with parameterized nested loops, the number of loops stated in the
EXPLAIN ANALYZE output for certain subplans may appear lower than for
others, due to those subplans having been scanned fewer times.  This is
because the list of matching subnodes must be re-evaluated whenever a
parameter which was found to match the partition key changes.
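
The effect on loop counts can be sketched as follows.  This is a toy model
only: SketchAppendState, sketch_find_matching and the "param modulo nplans"
pruning rule are invented for illustration, and the real executor tracks
changed parameters differently.  The point is that the cached set of
matching subnodes is recomputed only when the parameter changes, so
subnodes that never match keep a loop count of zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-node state; supports at most 8 subnodes. */
typedef struct SketchAppendState
{
    uint32_t    valid_subplans; /* cached set of matching subnodes */
    bool        valid;          /* has the cache been computed yet? */
    int         nloops[8];      /* how often each subnode was scanned */
} SketchAppendState;

/*
 * Invented pruning rule: only subnode (param % nplans) can match.
 * Assumes param >= 0 and 0 < nplans <= 8.
 */
static uint32_t
sketch_find_matching(int param, int nplans)
{
    return UINT32_C(1) << (param % nplans);
}

/* One rescan of the node, e.g. one outer row of a parameterized nested loop. */
static void
sketch_rescan(SketchAppendState *st, int param, bool param_changed, int nplans)
{
    /* Re-evaluate the matching subnodes only when the parameter changed. */
    if (!st->valid || param_changed)
    {
        st->valid_subplans = sketch_find_matching(param, nplans);
        st->valid = true;
    }

    for (int i = 0; i < nplans; i++)
    {
        if (st->valid_subplans & (UINT32_C(1) << i))
            st->nloops[i]++;
    }
}
```

Feeding the parameter values 0, 0, 1 through a three-subnode state leaves
the first subnode with two loops, the second with one, and the third with
none, which corresponds to the "(never executed)" case described above.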

This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exists in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.
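
A minimal sketch of that translation is shown below.  It borrows the
subnode_map and subpart_map field names from this commit, but everything
else is an assumption for illustration: SketchPruneData is a stand-in
struct, plain uint32_t bitmasks replace Bitmapsets (so at most 32
partitions and subnodes), and the per-table matching_parts array stands in
for the result of get_matching_partitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified pruning data for one partitioned table. */
typedef struct SketchPruneData
{
    int         nparts;         /* number of partitions of this table */
    const int  *subnode_map;    /* partition index -> subnode index, or -1 */
    const int  *subpart_map;    /* partition index -> index of a
                                 * sub-partitioned table's entry in the
                                 * 'tables' array, or -1 */
} SketchPruneData;

/*
 * Translate the matching partition indexes of tables[idx] into a bitmask of
 * subnode indexes, recursing into sub-partitioned tables.
 * matching_parts[n] holds the matching partition bits for tables[n].
 */
static uint32_t
sketch_matching_subnodes(const SketchPruneData *tables, int idx,
                         const uint32_t *matching_parts)
{
    const SketchPruneData *pd = &tables[idx];
    uint32_t    result = 0;

    for (int i = 0; i < pd->nparts; i++)
    {
        if ((matching_parts[idx] & (UINT32_C(1) << i)) == 0)
            continue;           /* partition i cannot match */

        if (pd->subnode_map[i] >= 0)
            result |= UINT32_C(1) << pd->subnode_map[i];
        else if (pd->subpart_map[i] >= 0)
            result |= sketch_matching_subnodes(tables, pd->subpart_map[i],
                                               matching_parts);
        /* both maps -1: the partition was already pruned by the planner */
    }
    return result;
}
```

For a root table whose second partition is itself partitioned, matching
partitions 1 and 2 of the root and partition 1 of the child yields the
subnode set {2, 3} in the example exercised by the accompanying test.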

Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
parent 5c067521
@@ -894,6 +894,18 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000
 BitmapAnd and BitmapOr nodes always report their actual row counts as zero,
 due to implementation limitations.
 </para>
+<para>
+Generally, the <command>EXPLAIN</command> output will display details for
+every plan node which was generated by the query planner. However, there
+are cases where the executor is able to determine that certain nodes are
+not required; currently, the only node type to support this is the
+<literal>Append</literal> node. This node type has the ability to discard
+subnodes which it is able to determine won't contain any records required
+by the query. It is possible to determine that nodes have been removed in
+this way by the presence of a "Subplans Removed" property in the
+<command>EXPLAIN</command> output.
+</para>
 </sect2>
 </sect1>
......
@@ -118,8 +118,8 @@ static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
 static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
 static void show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
 ExplainState *es);
-static void ExplainMemberNodes(List *plans, PlanState **planstates,
-List *ancestors, ExplainState *es);
+static void ExplainMemberNodes(PlanState **planstates, int nsubnodes,
+int nplans, List *ancestors, ExplainState *es);
 static void ExplainSubPlans(List *plans, List *ancestors,
 const char *relationship, ExplainState *es);
 static void ExplainCustomChildren(CustomScanState *css,
@@ -1811,28 +1811,33 @@ ExplainNode(PlanState *planstate, List *ancestors,
 switch (nodeTag(plan))
 {
 case T_ModifyTable:
-ExplainMemberNodes(((ModifyTable *) plan)->plans,
-((ModifyTableState *) planstate)->mt_plans,
+ExplainMemberNodes(((ModifyTableState *) planstate)->mt_plans,
+((ModifyTableState *) planstate)->mt_nplans,
+list_length(((ModifyTable *) plan)->plans),
 ancestors, es);
 break;
 case T_Append:
-ExplainMemberNodes(((Append *) plan)->appendplans,
-((AppendState *) planstate)->appendplans,
+ExplainMemberNodes(((AppendState *) planstate)->appendplans,
+((AppendState *) planstate)->as_nplans,
+list_length(((Append *) plan)->appendplans),
 ancestors, es);
 break;
 case T_MergeAppend:
-ExplainMemberNodes(((MergeAppend *) plan)->mergeplans,
-((MergeAppendState *) planstate)->mergeplans,
+ExplainMemberNodes(((MergeAppendState *) planstate)->mergeplans,
+((MergeAppendState *) planstate)->ms_nplans,
+list_length(((MergeAppend *) plan)->mergeplans),
 ancestors, es);
 break;
 case T_BitmapAnd:
-ExplainMemberNodes(((BitmapAnd *) plan)->bitmapplans,
-((BitmapAndState *) planstate)->bitmapplans,
+ExplainMemberNodes(((BitmapAndState *) planstate)->bitmapplans,
+((BitmapAndState *) planstate)->nplans,
+list_length(((BitmapAnd *) plan)->bitmapplans),
 ancestors, es);
 break;
 case T_BitmapOr:
-ExplainMemberNodes(((BitmapOr *) plan)->bitmapplans,
-((BitmapOrState *) planstate)->bitmapplans,
+ExplainMemberNodes(((BitmapOrState *) planstate)->bitmapplans,
+((BitmapOrState *) planstate)->nplans,
+list_length(((BitmapOr *) plan)->bitmapplans),
 ancestors, es);
 break;
 case T_SubqueryScan:
@@ -3173,18 +3178,28 @@ show_modifytable_info(ModifyTableState *mtstate, List *ancestors,
 *
 * The ancestors list should already contain the immediate parent of these
 * plans.
 *
-* Note: we don't actually need to examine the Plan list members, but
-* we need the list in order to determine the length of the PlanState array.
+* nsubnodes indicates the number of items in the planstates array.
+* nplans indicates the original number of subnodes in the Plan, some of these
+* may have been pruned by the run-time pruning code.
 */
 static void
-ExplainMemberNodes(List *plans, PlanState **planstates,
+ExplainMemberNodes(PlanState **planstates, int nsubnodes, int nplans,
 List *ancestors, ExplainState *es)
 {
-int nplans = list_length(plans);
 int j;
-for (j = 0; j < nplans; j++)
+/*
+* The number of subnodes being lower than the number of subplans that was
+* specified in the plan means that some subnodes have been ignored per
+* instruction for the partition pruning code during the executor
+* initialization. To make this a bit less mysterious, we'll indicate
+* here that this has happened.
+*/
+if (nsubnodes < nplans)
+ExplainPropertyInteger("Subplans Removed", NULL, nplans - nsubnodes, es);
+for (j = 0; j < nsubnodes; j++)
 ExplainNode(planstates[j], ancestors,
 "Member", NULL, es);
 }
......
@@ -248,6 +248,7 @@ _copyAppend(const Append *from)
 COPY_NODE_FIELD(partitioned_rels);
 COPY_NODE_FIELD(appendplans);
 COPY_SCALAR_FIELD(first_partial_plan);
+COPY_NODE_FIELD(part_prune_infos);
 return newnode;
 }
@@ -2182,6 +2183,23 @@ _copyPartitionPruneStepCombine(const PartitionPruneStepCombine *from)
 return newnode;
 }
+static PartitionPruneInfo *
+_copyPartitionPruneInfo(const PartitionPruneInfo *from)
+{
+PartitionPruneInfo *newnode = makeNode(PartitionPruneInfo);
+COPY_SCALAR_FIELD(reloid);
+COPY_NODE_FIELD(pruning_steps);
+COPY_BITMAPSET_FIELD(present_parts);
+COPY_SCALAR_FIELD(nparts);
+COPY_POINTER_FIELD(subnode_map, from->nparts * sizeof(int));
+COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
+COPY_BITMAPSET_FIELD(extparams);
+COPY_BITMAPSET_FIELD(execparams);
+return newnode;
+}
 /* ****************************************************************
 * relation.h copy functions
 *
@@ -5123,6 +5141,9 @@ copyObjectImpl(const void *from)
 case T_PlaceHolderInfo:
 retval = _copyPlaceHolderInfo(from);
 break;
+case T_PartitionPruneInfo:
+retval = _copyPartitionPruneInfo(from);
+break;
 /*
 * VALUE NODES
......
@@ -30,7 +30,7 @@ static int leftmostLoc(int loc1, int loc2);
 static bool fix_opfuncids_walker(Node *node, void *context);
 static bool planstate_walk_subplans(List *plans, bool (*walker) (),
 void *context);
-static bool planstate_walk_members(List *plans, PlanState **planstates,
+static bool planstate_walk_members(PlanState **planstates, int nplans,
 bool (*walker) (), void *context);
@@ -3806,32 +3806,32 @@ planstate_tree_walker(PlanState *planstate,
 switch (nodeTag(plan))
 {
 case T_ModifyTable:
-if (planstate_walk_members(((ModifyTable *) plan)->plans,
-((ModifyTableState *) planstate)->mt_plans,
+if (planstate_walk_members(((ModifyTableState *) planstate)->mt_plans,
+((ModifyTableState *) planstate)->mt_nplans,
 walker, context))
 return true;
 break;
 case T_Append:
-if (planstate_walk_members(((Append *) plan)->appendplans,
-((AppendState *) planstate)->appendplans,
+if (planstate_walk_members(((AppendState *) planstate)->appendplans,
+((AppendState *) planstate)->as_nplans,
 walker, context))
 return true;
 break;
 case T_MergeAppend:
-if (planstate_walk_members(((MergeAppend *) plan)->mergeplans,
-((MergeAppendState *) planstate)->mergeplans,
+if (planstate_walk_members(((MergeAppendState *) planstate)->mergeplans,
+((MergeAppendState *) planstate)->ms_nplans,
 walker, context))
 return true;
 break;
 case T_BitmapAnd:
-if (planstate_walk_members(((BitmapAnd *) plan)->bitmapplans,
-((BitmapAndState *) planstate)->bitmapplans,
+if (planstate_walk_members(((BitmapAndState *) planstate)->bitmapplans,
+((BitmapAndState *) planstate)->nplans,
 walker, context))
 return true;
 break;
 case T_BitmapOr:
-if (planstate_walk_members(((BitmapOr *) plan)->bitmapplans,
-((BitmapOrState *) planstate)->bitmapplans,
+if (planstate_walk_members(((BitmapOrState *) planstate)->bitmapplans,
+((BitmapOrState *) planstate)->nplans,
 walker, context))
 return true;
 break;
@@ -3881,15 +3881,11 @@ planstate_walk_subplans(List *plans,
 /*
 * Walk the constituent plans of a ModifyTable, Append, MergeAppend,
 * BitmapAnd, or BitmapOr node.
-*
-* Note: we don't actually need to examine the Plan list members, but
-* we need the list in order to determine the length of the PlanState array.
 */
 static bool
-planstate_walk_members(List *plans, PlanState **planstates,
+planstate_walk_members(PlanState **planstates, int nplans,
 bool (*walker) (), void *context)
 {
-int nplans = list_length(plans);
 int j;
 for (j = 0; j < nplans; j++)
......
@@ -419,6 +419,7 @@ _outAppend(StringInfo str, const Append *node)
 WRITE_NODE_FIELD(partitioned_rels);
 WRITE_NODE_FIELD(appendplans);
 WRITE_INT_FIELD(first_partial_plan);
+WRITE_NODE_FIELD(part_prune_infos);
 }
 static void
@@ -1758,6 +1759,30 @@ _outMergeAction(StringInfo str, const MergeAction *node)
 WRITE_NODE_FIELD(targetList);
 }
+static void
+_outPartitionPruneInfo(StringInfo str, const PartitionPruneInfo *node)
+{
+int i;
+WRITE_NODE_TYPE("PARTITIONPRUNEINFO");
+WRITE_OID_FIELD(reloid);
+WRITE_NODE_FIELD(pruning_steps);
+WRITE_BITMAPSET_FIELD(present_parts);
+WRITE_INT_FIELD(nparts);
+appendStringInfoString(str, " :subnode_map");
+for (i = 0; i < node->nparts; i++)
+appendStringInfo(str, " %d", node->subnode_map[i]);
+appendStringInfoString(str, " :subpart_map");
+for (i = 0; i < node->nparts; i++)
+appendStringInfo(str, " %d", node->subpart_map[i]);
+WRITE_BITMAPSET_FIELD(extparams);
+WRITE_BITMAPSET_FIELD(execparams);
+}
 /*****************************************************************************
 *
 * Stuff from relation.h.
@@ -3996,6 +4021,9 @@ outNode(StringInfo str, const void *obj)
 case T_PartitionPruneStepCombine:
 _outPartitionPruneStepCombine(str, obj);
 break;
+case T_PartitionPruneInfo:
+_outPartitionPruneInfo(str, obj);
+break;
 case T_Path:
 _outPath(str, obj);
 break;
......
@@ -1373,6 +1373,23 @@ _readMergeAction(void)
 READ_DONE();
 }
+static PartitionPruneInfo *
+_readPartitionPruneInfo(void)
+{
+READ_LOCALS(PartitionPruneInfo);
+READ_OID_FIELD(reloid);
+READ_NODE_FIELD(pruning_steps);
+READ_BITMAPSET_FIELD(present_parts);
+READ_INT_FIELD(nparts);
+READ_INT_ARRAY(subnode_map, local_node->nparts);
+READ_INT_ARRAY(subpart_map, local_node->nparts);
+READ_BITMAPSET_FIELD(extparams);
+READ_BITMAPSET_FIELD(execparams);
+READ_DONE();
+}
 /*
 * Stuff from parsenodes.h.
 */
@@ -1675,6 +1692,7 @@ _readAppend(void)
 READ_NODE_FIELD(partitioned_rels);
 READ_NODE_FIELD(appendplans);
 READ_INT_FIELD(first_partial_plan);
+READ_NODE_FIELD(part_prune_infos);
 READ_DONE();
 }
@@ -2645,6 +2663,8 @@ parseNodeString(void)
 return_value = _readPartitionPruneStepOp();
 else if (MATCH("PARTITIONPRUNESTEPCOMBINE", 25))
 return_value = _readPartitionPruneStepCombine();
+else if (MATCH("PARTITIONPRUNEINFO", 18))
+return_value = _readPartitionPruneInfo();
 else if (MATCH("RTE", 3))
 return_value = _readRangeTblEntry();
 else if (MATCH("RANGETBLFUNCTION", 16))
......
@@ -1604,7 +1604,7 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
 * if we have zero or one live subpath due to constraint exclusion.)
 */
 if (subpaths_valid)
-add_path(rel, (Path *) create_append_path(rel, subpaths, NIL,
+add_path(rel, (Path *) create_append_path(root, rel, subpaths, NIL,
 NULL, 0, false,
 partitioned_rels, -1));
@@ -1646,8 +1646,8 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
 Assert(parallel_workers > 0);
 /* Generate a partial append path. */
-appendpath = create_append_path(rel, NIL, partial_subpaths, NULL,
-parallel_workers,
+appendpath = create_append_path(root, rel, NIL, partial_subpaths,
+NULL, parallel_workers,
 enable_parallel_append,
 partitioned_rels, -1);
@@ -1695,7 +1695,7 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
 max_parallel_workers_per_gather);
 Assert(parallel_workers > 0);
-appendpath = create_append_path(rel, pa_nonpartial_subpaths,
+appendpath = create_append_path(root, rel, pa_nonpartial_subpaths,
 pa_partial_subpaths,
 NULL, parallel_workers, true,
 partitioned_rels, partial_rows);
@@ -1758,7 +1758,7 @@ add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
 if (subpaths_valid)
 add_path(rel, (Path *)
-create_append_path(rel, subpaths, NIL,
+create_append_path(root, rel, subpaths, NIL,
 required_outer, 0, false,
 partitioned_rels, -1));
 }
@@ -2024,7 +2024,7 @@ set_dummy_rel_pathlist(RelOptInfo *rel)
 rel->pathlist = NIL;
 rel->partial_pathlist = NIL;
-add_path(rel, (Path *) create_append_path(rel, NIL, NIL, NULL,
+add_path(rel, (Path *) create_append_path(NULL, rel, NIL, NIL, NULL,
 0, false, NIL, -1));
 /*
......
@@ -1230,7 +1230,7 @@ mark_dummy_rel(RelOptInfo *rel)
 rel->partial_pathlist = NIL;
 /* Set up the dummy path */
-add_path(rel, (Path *) create_append_path(rel, NIL, NIL, NULL,
+add_path(rel, (Path *) create_append_path(NULL, rel, NIL, NIL, NULL,
 0, false, NIL, -1));
 /* Set or update cheapest_total_path and related fields */
......
@@ -41,6 +41,7 @@
 #include "optimizer/var.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
+#include "partitioning/partprune.h"
 #include "utils/lsyscache.h"
@@ -210,7 +211,7 @@ static NamedTuplestoreScan *make_namedtuplestorescan(List *qptlist, List *qpqual
 static WorkTableScan *make_worktablescan(List *qptlist, List *qpqual,
 Index scanrelid, int wtParam);
 static Append *make_append(List *appendplans, int first_partial_plan,
-List *tlist, List *partitioned_rels);
+List *tlist, List *partitioned_rels, List *partpruneinfos);
 static RecursiveUnion *make_recursive_union(List *tlist,
 Plan *lefttree,
 Plan *righttree,
@@ -1041,6 +1042,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
 List *tlist = build_path_tlist(root, &best_path->path);
 List *subplans = NIL;
 ListCell *subpaths;
+RelOptInfo *rel = best_path->path.parent;
+List *partpruneinfos = NIL;
 /*
 * The subpaths list could be empty, if every child was proven empty by
@@ -1078,6 +1081,38 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
 subplans = lappend(subplans, subplan);
 }
+if (rel->reloptkind == RELOPT_BASEREL &&
+best_path->partitioned_rels != NIL)
+{
+List *prunequal;
+prunequal = extract_actual_clauses(rel->baserestrictinfo, false);
+if (best_path->path.param_info)
+{
+List *prmquals = best_path->path.param_info->ppi_clauses;
+prmquals = extract_actual_clauses(prmquals, false);
+prmquals = (List *) replace_nestloop_params(root,
+(Node *) prmquals);
+prunequal = list_concat(prunequal, prmquals);
+}
+/*
+* If any quals exist, they may be useful to perform further partition
+* pruning during execution. Generate a PartitionPruneInfo for each
+* partitioned rel to store these quals and allow translation of
+* partition indexes into subpath indexes.
+*/
+if (prunequal != NIL)
+partpruneinfos =
+make_partition_pruneinfo(root,
+best_path->partitioned_rels,
+best_path->subpaths, prunequal);
+}
 /*
 * XXX ideally, if there's just one child, we'd not bother to generate an
 * Append node but just return the single child. At the moment this does
@@ -1086,7 +1121,8 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path)
 */
 plan = make_append(subplans, best_path->first_partial_path,
-tlist, best_path->partitioned_rels);
+tlist, best_path->partitioned_rels,
+partpruneinfos);
 copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -5382,7 +5418,8 @@ make_foreignscan(List *qptlist,
 static Append *
 make_append(List *appendplans, int first_partial_plan,
-List *tlist, List *partitioned_rels)
+List *tlist, List *partitioned_rels,
+List *partpruneinfos)
 {
 Append *node = makeNode(Append);
 Plan *plan = &node->plan;
@@ -5394,7 +5431,7 @@ make_append(List *appendplans, int first_partial_plan,
 node->partitioned_rels = partitioned_rels;
 node->appendplans = appendplans;
 node->first_partial_plan = first_partial_plan;
+node->part_prune_infos = partpruneinfos;
 return node;
 }
......
@@ -3920,7 +3920,8 @@ create_degenerate_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 paths = lappend(paths, path);
 }
 path = (Path *)
-create_append_path(grouped_rel,
+create_append_path(root,
+grouped_rel,
 paths,
 NIL,
 NULL,
@@ -6852,8 +6853,9 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
 * node, which would cause this relation to stop appearing to be a
 * dummy rel.)
 */
-rel->pathlist = list_make1(create_append_path(rel, NIL, NIL, NULL,
-0, false, NIL, -1));
+rel->pathlist = list_make1(create_append_path(root, rel, NIL, NIL,
+NULL, 0, false, NIL,
+-1));
 rel->partial_pathlist = NIL;
 set_cheapest(rel);
 Assert(IS_DUMMY_REL(rel));
......
@@ -648,7 +648,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 /*
 * Append the child results together.
 */
-path = (Path *) create_append_path(result_rel, pathlist, NIL,
+path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
 NULL, 0, false, NIL, -1);
 /*
@@ -703,7 +703,7 @@ generate_union_paths(SetOperationStmt *op, PlannerInfo *root,
 Assert(parallel_workers > 0);
 ppath = (Path *)
-create_append_path(result_rel, NIL, partial_pathlist,
+create_append_path(root, result_rel, NIL, partial_pathlist,
 NULL, parallel_workers, enable_parallel_append,
 NIL, -1);
 ppath = (Path *)
@@ -814,7 +814,7 @@ generate_nonunion_paths(SetOperationStmt *op, PlannerInfo *root,
 /*
 * Append the child results together.
 */
-path = (Path *) create_append_path(result_rel, pathlist, NIL,
+path = (Path *) create_append_path(root, result_rel, pathlist, NIL,
 NULL, 0, false, NIL, -1);
 /* Identify the grouping semantics */
......
@@ -1210,7 +1210,8 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
 * Note that we must handle subpaths = NIL, representing a dummy access path.
 */
 AppendPath *
-create_append_path(RelOptInfo *rel,
+create_append_path(PlannerInfo *root,
+RelOptInfo *rel,
 List *subpaths, List *partial_subpaths,
 Relids required_outer,
 int parallel_workers, bool parallel_aware,
@@ -1224,8 +1225,25 @@ create_append_path(RelOptInfo *rel,
 pathnode->path.pathtype = T_Append;
 pathnode->path.parent = rel;
 pathnode->path.pathtarget = rel->reltarget;
-pathnode->path.param_info = get_appendrel_parampathinfo(rel,
-required_outer);
+/*
+* When generating an Append path for a partitioned table, there may be
+* parameters that are useful so we can eliminate certain partitions
+* during execution. Here we'll go all the way and fully populate the
+* parameter info data as we do for normal base relations. However, we
+* need only bother doing this for RELOPT_BASEREL rels, as
+* RELOPT_OTHER_MEMBER_REL's Append paths are merged into the base rel's
+* Append subpaths. It would do no harm to do this, we just avoid it to
+* save wasting effort.
+*/
+if (partitioned_rels != NIL && root && rel->reloptkind == RELOPT_BASEREL)
+pathnode->path.param_info = get_baserel_parampathinfo(root,
+rel,
+required_outer);
+else
+pathnode->path.param_info = get_appendrel_parampathinfo(rel,
+required_outer);
 pathnode->path.parallel_aware = parallel_aware;
 pathnode->path.parallel_safe = rel->consider_parallel;
 pathnode->path.parallel_workers = parallel_workers;
@@ -3574,7 +3592,7 @@ reparameterize_path(PlannerInfo *root, Path *path,
 i++;
 }
 return (Path *)
-create_append_path(rel, childpaths, partialpaths,
+create_append_path(root, rel, childpaths, partialpaths,
 required_outer,
 apath->path.parallel_workers,
 apath->path.parallel_aware,
......
...@@ -17,6 +17,7 @@ ...@@ -17,6 +17,7 @@
#include "nodes/execnodes.h" #include "nodes/execnodes.h"
#include "nodes/parsenodes.h" #include "nodes/parsenodes.h"
#include "nodes/plannodes.h" #include "nodes/plannodes.h"
#include "partitioning/partprune.h"
/*----------------------- /*-----------------------
* PartitionDispatch - information about one partitioned table in a partition * PartitionDispatch - information about one partitioned table in a partition
...@@ -108,6 +109,77 @@ typedef struct PartitionTupleRouting ...@@ -108,6 +109,77 @@ typedef struct PartitionTupleRouting
TupleTableSlot *root_tuple_slot; TupleTableSlot *root_tuple_slot;
} PartitionTupleRouting; } PartitionTupleRouting;
/*-----------------------
* PartitionPruningData - Encapsulates all information required to support
* elimination of partitions in node types which support arbitrary Lists of
* subplans. Information stored here allows the planner's partition pruning
* functions to be called and the return value of partition indexes translated
* into the subpath indexes of node types such as Append, thus allowing us to
* bypass certain subnodes when we have proofs that indicate that no tuple
* matching the 'pruning_steps' will be found within.
*
* subnode_map An array containing the subnode index which
* matches this partition index, or -1 if the
* subnode has been pruned already.
* subpart_map An array containing the offset into the
* 'partprunedata' array in PartitionPruning, or
* -1 if there is no such element in that array.
* present_parts A Bitmapset of the partition index that we have
* subnodes mapped for.
* context Contains the context details required to call
* the partition pruning code.
* pruning_steps Contains a list of PartitionPruneStep used to
* perform the actual pruning.
* extparams Contains paramids of external params found
* matching partition keys in 'pruning_steps'.
* allparams As 'extparams' but also including exec params.
*-----------------------
*/
typedef struct PartitionPruningData
{
    int        *subnode_map;
    int        *subpart_map;
    Bitmapset  *present_parts;
    PartitionPruneContext context;
    List       *pruning_steps;
    Bitmapset  *extparams;
    Bitmapset  *allparams;
} PartitionPruningData;
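For readers tracing how `subnode_map` is meant to be used, here is a minimal sketch (not the committed executor code) of translating the partition indexes that survive pruning into subnode indexes. It assumes a 32-bit mask as a stand-in for `Bitmapset`; `SketchSet` and `sketch_parts_to_subnodes` are hypothetical names that do not appear in this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means member i is present. */
typedef uint32_t SketchSet;

/*
 * Translate a set of surviving partition indexes into a set of subnode
 * indexes.  Partitions whose subnode_map entry is -1 have no subnode
 * (already pruned, or handled elsewhere) and contribute nothing.
 */
SketchSet
sketch_parts_to_subnodes(const int *subnode_map, int nparts,
                         SketchSet matched_parts)
{
    SketchSet   subnodes = 0;

    assert(nparts >= 0 && nparts <= 32);

    for (int i = 0; i < nparts; i++)
    {
        /* Skip partitions that the pruning steps ruled out. */
        if ((matched_parts & (UINT32_C(1) << i)) == 0)
            continue;

        if (subnode_map[i] >= 0)
        {
            assert(subnode_map[i] < 32);
            subnodes |= UINT32_C(1) << subnode_map[i];
        }
    }

    return subnodes;
}
```

With `subnode_map = {2, -1, 0, 1}`, surviving partitions 0 and 3 map to subnodes 2 and 1, while partition 1 maps to nothing.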
/*-----------------------
 * PartitionPruneState - State object required for executor nodes to perform
 * partition pruning elimination of their subnodes.  This encapsulates a
 * flattened hierarchy of PartitionPruningData structs and also stores all
 * paramids which were found to match the partition keys of each partition.
 * This struct can be attached to node types which support arbitrary Lists of
 * subnodes containing partitions to allow subnodes to be eliminated due to
 * the clauses being unable to match to any tuple that the subnode could
 * possibly produce.
 *
 * partprunedata        Array of PartitionPruningData for the node's target
 *                      partitioned relation.  First element contains the
 *                      details for the target partitioned table.
 * num_partprunedata    Number of items in 'partprunedata' array.
 * prune_context        A memory context which can be used to call the query
 *                      planner's partition prune functions.
 * extparams            All PARAM_EXTERN paramids which were found to match a
 *                      partition key in each of the contained
 *                      PartitionPruningData structs.
 * execparams           As above but for PARAM_EXEC.
 * allparams            Union of 'extparams' and 'execparams', saved to avoid
 *                      recalculation.
 *-----------------------
 */
typedef struct PartitionPruneState
{
    PartitionPruningData *partprunedata;
    int         num_partprunedata;
    MemoryContext prune_context;
    Bitmapset  *extparams;
    Bitmapset  *execparams;
    Bitmapset  *allparams;
} PartitionPruneState;
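The "flattened hierarchy" described above can be walked by following `subpart_map` offsets back into the same array. The sketch below illustrates that idea under simplifying assumptions: a 32-bit mask replaces `Bitmapset`, and a precomputed `matched_parts` field stands in for the result of running the pruning steps. `SketchPruningData` and `sketch_find_subnodes` are hypothetical names, not functions from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified counterpart of PartitionPruningData. */
typedef struct SketchPruningData
{
    int         nparts;         /* length of the two arrays below */
    const int  *subnode_map;    /* subnode index per partition, or -1 */
    const int  *subpart_map;    /* offset into the flattened array, or -1 */
    uint32_t    matched_parts;  /* stand-in for the pruning steps' result */
} SketchPruningData;

/*
 * Collect the subnode indexes reachable from element 'idx' of the flattened
 * array.  A leaf partition contributes its subnode; a sub-partitioned table
 * is handled by recursing into the element named by subpart_map.
 */
uint32_t
sketch_find_subnodes(const SketchPruningData *all, int idx)
{
    const SketchPruningData *pd;
    uint32_t    result = 0;

    assert(idx >= 0);
    pd = &all[idx];
    assert(pd->nparts >= 0 && pd->nparts <= 32);

    for (int i = 0; i < pd->nparts; i++)
    {
        if ((pd->matched_parts & (UINT32_C(1) << i)) == 0)
            continue;           /* partition i was pruned at this level */

        if (pd->subnode_map[i] >= 0)
            result |= UINT32_C(1) << pd->subnode_map[i];
        else if (pd->subpart_map[i] >= 0)
            result |= sketch_find_subnodes(all, pd->subpart_map[i]);
    }

    return result;
}
```

Element 0 plays the role of the target partitioned table, matching the comment's statement that the first element holds its details.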
extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
        Relation rel);
extern int ExecFindPartition(ResultRelInfo *resultRelInfo,
@@ -133,5 +205,10 @@ extern HeapTuple ConvertPartitionTupleSlot(TupleConversionMap *map,
        TupleTableSlot **p_my_slot);
extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
        PartitionTupleRouting *proute);
extern PartitionPruneState *ExecSetupPartitionPruneState(PlanState *planstate,
        List *partitionpruneinfo);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
        int nsubnodes);
#endif                          /* EXECPARTITION_H */
@@ -1123,8 +1123,13 @@ typedef struct ModifyTableState
/* ----------------
 *   AppendState information
 *
 *      nplans          how many plans are in the array
 *      whichplan       which plan is being executed (0 .. n-1), or a
 *                      special negative value. See nodeAppend.c.
 *      pruningstate    details required to allow partitions to be
 *                      eliminated from the scan, or NULL if not possible.
 *      valid_subplans  for runtime pruning, valid appendplans indexes to
 *                      scan.
 * ----------------
 */
@@ -1132,6 +1137,7 @@ struct AppendState;
typedef struct AppendState AppendState;
struct ParallelAppendState;
typedef struct ParallelAppendState ParallelAppendState;
struct PartitionPruneState;
struct AppendState
{
@@ -1141,6 +1147,8 @@ struct AppendState
    int         as_whichplan;
    ParallelAppendState *as_pstate; /* parallel coordination info */
    Size        pstate_len;     /* size of parallel coordination info */
    struct PartitionPruneState *as_prune_state;
    Bitmapset  *as_valid_subplans;
    bool        (*choose_next_subplan) (AppendState *);
};
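The comment on `valid_subplans` implies that, once run-time pruning has produced the set of valid subplan indexes, the Append node only advances through members of that set. A minimal sketch of that selection step follows, assuming a 32-bit mask in place of `Bitmapset` and a plain `-1` for "no more subplans" (the real code uses its own special negative values, per the `whichplan` comment). `sketch_next_valid_subplan` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the smallest valid subplan index greater than 'prev', or -1 when
 * none remains.  Pass prev = -1 to obtain the first valid subplan.  Pruned
 * subplans are never returned, so they are never executed.
 */
int
sketch_next_valid_subplan(uint32_t valid_subplans, int nplans, int prev)
{
    assert(nplans >= 0 && nplans <= 32);
    assert(prev >= -1);

    for (int i = prev + 1; i < nplans; i++)
    {
        if (valid_subplans & (UINT32_C(1) << i))
            return i;
    }

    return -1;
}
```

A subplan that is skipped this way on every scan is one that EXPLAIN ANALYZE would report as "(never executed)", as the commit message describes.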
......
@@ -196,6 +196,7 @@ typedef enum NodeTag
    T_PartitionPruneStep,
    T_PartitionPruneStepOp,
    T_PartitionPruneStepCombine,
    T_PartitionPruneInfo,
    /*
     * TAGS FOR EXPRESSION STATE NODES (execnodes.h)
......
@@ -256,6 +256,11 @@ typedef struct Append
    List       *partitioned_rels;
    List       *appendplans;
    int         first_partial_plan;
    /*
     * Mapping details for run-time subplan pruning, one per partitioned_rels
     */
    List       *part_prune_infos;
} Append;
/* ----------------
......
@@ -1581,4 +1581,27 @@ typedef struct PartitionPruneStepCombine
    List       *source_stepids;
} PartitionPruneStepCombine;
/*----------
 * PartitionPruneInfo - Details required to allow the executor to prune
 * partitions.
 *
 * Here we store mapping details to allow translation of a partitioned table's
 * index into subnode indexes for node types which support arbitrary numbers
 * of sub nodes, such as Append.
 *----------
 */
typedef struct PartitionPruneInfo
{
    NodeTag     type;
    Oid         reloid;         /* Oid of partition rel */
    List       *pruning_steps;  /* List of PartitionPruneStep */
    Bitmapset  *present_parts;  /* Indexes of all partitions which subnodes
                                 * are present for. */
    int         nparts;         /* The length of the following two arrays */
    int        *subnode_map;    /* subnode index by partition id, or -1 */
    int        *subpart_map;    /* subpart index by partition id, or -1 */
    Bitmapset  *extparams;      /* All external paramids seen in prunesteps */
    Bitmapset  *execparams;     /* All exec paramids seen in prunesteps */
} PartitionPruneInfo;
#endif                          /* PRIMNODES_H */
@@ -64,7 +64,7 @@ extern BitmapOrPath *create_bitmap_or_path(PlannerInfo *root,
        List *bitmapquals);
extern TidPath *create_tidscan_path(PlannerInfo *root, RelOptInfo *rel,
        List *tidquals, Relids required_outer);
extern AppendPath *create_append_path(PlannerInfo *root, RelOptInfo *rel,
        List *subpaths, List *partial_subpaths,
        Relids required_outer,
        int parallel_workers, bool parallel_aware,
......
@@ -37,9 +37,23 @@ typedef struct PartitionPruneContext
    /* Partition boundary info */
    PartitionBoundInfo boundinfo;
    /*
     * Can be set when the context is used from the executor to allow params
     * found matching the partition key to be evaluated.
     */
    PlanState  *planstate;
    /*
     * Parameters that are safe to be used for partition pruning. execparams
     * are not safe to use until the executor is running.
     */
    Bitmapset  *safeparams;
} PartitionPruneContext;
extern List *make_partition_pruneinfo(PlannerInfo *root, List *partition_rels,
        List *subpaths, List *prunequal);
extern Relids prune_append_rel_partitions(RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
        List *pruning_steps);
......