Commit dd4134ea authored by Tom Lane

Revisit handling of UNION ALL subqueries with non-Var output columns.

In commit 57664ed2 I tried to fix a bug
reported by Teodor Sigaev by making non-simple-Var output columns distinct
(by wrapping their expressions with dummy PlaceHolderVar nodes).  This did
not work too well.  Commit b28ffd0f fixed
some ensuing problems with matching to child indexes, but per a recent
report from Claus Stadler, constraint exclusion of UNION ALL subqueries was
still broken, because constant-simplification didn't handle the injected
PlaceHolderVars well either.  On reflection, the original patch was quite
misguided: there is no reason to expect that EquivalenceClass child members
will be distinct.  So instead of trying to make them so, we should ensure
that we can cope with the situation when they're not.

Accordingly, this patch reverts the code changes in the above-mentioned
commits (though the regression test cases they added stay).  Instead, I've
added assorted defenses to make sure that duplicate EC child members don't
cause any problems.  Teodor's original problem ("MergeAppend child's
targetlist doesn't match MergeAppend") is addressed more directly by
revising prepare_sort_from_pathkeys to let the parent MergeAppend's sort
list guide creation of each child's sort list.
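
A minimal sketch of that idea, using hypothetical stand-in types rather than the planner's real structures: when the parent's sort column numbers are supplied, the child looks only at the target-list entry in the required position instead of taking the first entry that matches. `TlistEntry`, `ec_contains`, `first_matching_resno`, and `resno_at_required_position` are illustrative names, not PostgreSQL functions, and integer ids stand in for expression trees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TargetEntry: output position plus expression id. */
typedef struct TlistEntry
{
    int resno;      /* 1-based output column number */
    int expr_id;    /* equal ids stand for equal() expressions */
} TlistEntry;

/* True if expr_id is among the (stand-in) EquivalenceClass member ids. */
static bool
ec_contains(const int *ec_exprs, size_t nec, int expr_id)
{
    for (size_t i = 0; i < nec; i++)
        if (ec_exprs[i] == expr_id)
            return true;
    return false;
}

/* Unguided search: take the first tlist entry found in the EC; 0 if none. */
static int
first_matching_resno(const TlistEntry *tlist, size_t ntlist,
                     const int *ec_exprs, size_t nec)
{
    for (size_t i = 0; i < ntlist; i++)
        if (ec_contains(ec_exprs, nec, tlist[i].expr_id))
            return tlist[i].resno;
    return 0;
}

/*
 * Guided search: consider only the entry at the required position.
 * Returns 0 when there is no such entry or it does not match, in which
 * case the caller would have to add a resjunk column.
 */
static int
resno_at_required_position(const TlistEntry *tlist, size_t ntlist,
                           int req_resno,
                           const int *ec_exprs, size_t nec)
{
    assert(req_resno > 0);
    for (size_t i = 0; i < ntlist; i++)
    {
        if (tlist[i].resno != req_resno)
            continue;
        return ec_contains(ec_exprs, nec, tlist[i].expr_id) ? req_resno : 0;
    }
    return 0;
}
```

With a child target list in which columns 1 and 2 carry the same expression, the unguided search always settles on column 1, whereas the guided search returns column 2 when the parent sorts on column 2, so parent and child stay in agreement.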

In passing, get rid of add_sort_column; as far as I can tell, testing for
duplicate sort keys at this stage is dead code.  Certainly it doesn't
trigger often enough to be worth expending cycles on in ordinary queries.
And keeping the test would've greatly complicated the new logic in
prepare_sort_from_pathkeys, because comparing pathkey list entries against
a previous output array requires that we not skip any entries in the list.
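
The alignment problem can be illustrated with a small sketch. It is an assumption-laden simplification: `fill_sort_columns` is a hypothetical helper, and its duplicate test compares only the column number, whereas the removed add_sort_column also compared the sort operator and collation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Copy one sort column number per sort key into out[].  With skip_dups set,
 * a column that is already present is skipped.  Returns the number of
 * entries written; out[] must have room for nkeys entries.
 */
static size_t
fill_sort_columns(const int *key_cols, size_t nkeys, bool skip_dups, int *out)
{
    size_t n = 0;

    assert(key_cols != NULL && out != NULL);
    for (size_t k = 0; k < nkeys; k++)
    {
        bool dup = false;

        /* Only scan for duplicates when asked to skip them */
        for (size_t i = 0; skip_dups && i < n; i++)
            if (out[i] == key_cols[k])
                dup = true;
        if (!dup)
            out[n++] = key_cols[k];
    }
    return n;
}
```

For keys referring to columns {3, 3, 5}, skipping duplicates leaves position 1 of the output holding column 5 even though the second key refers to column 3, so an array built that way cannot be indexed by pathkey list position. Without skipping, output position k always corresponds to key k.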

Back-patch to 9.1, like the previous patches.  The only known issue in
this area that wasn't caused by the ill-advised previous patches was the
MergeAppend planning failure, which of course is not relevant before 9.1.
It's possible that we need some of the new defenses against duplicate child
EC entries in older branches, but until there's some clear evidence of that
I'm going to refrain from back-patching further.
parent aef5fe7e
......@@ -496,6 +496,14 @@ it's possible that it belongs to more than one. We keep track of all the
families to ensure that we can make use of an index belonging to any one of
the families for mergejoin purposes.)
An EquivalenceClass can contain "em_is_child" members, which are copies
of members that contain appendrel parent relation Vars, transposed to
contain the equivalent child-relation variables or expressions. These
members are *not* full-fledged members of the EquivalenceClass and do not
affect the class's overall properties at all. They are kept only to
simplify matching of child-relation expressions to EquivalenceClasses.
Most operations on EquivalenceClasses should ignore child members.
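
As a rough illustration of the rule above (not the planner's actual code), the sketch below scans a member list and skips child members unless they belong to exactly the relation being considered. `RelMask`, `Member`, and `find_member` are hypothetical stand-ins: a bitmask replaces Relids, an integer id replaces the member expression, and a mask of 0 plays the role of a NULL relation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Relids: one bit per relation. */
typedef unsigned int RelMask;

typedef struct Member
{
    int     expr_id;    /* stand-in for em_expr; equal ids mean equal exprs */
    RelMask relids;     /* stand-in for em_relids */
    bool    is_child;   /* stand-in for em_is_child */
} Member;

/*
 * Return the first member whose expression matches expr_id.  Child members
 * are ignored unless their relids equal 'rel'; passing rel = 0 therefore
 * ignores every child member that references some relation.
 */
static const Member *
find_member(const Member *members, size_t n, int expr_id, RelMask rel)
{
    assert(members != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        const Member *m = &members[i];

        if (m->is_child && m->relids != rel)
            continue;           /* child of some other relation */
        if (m->expr_id == expr_id)
            return m;
    }
    return NULL;
}
```

A lookup with rel = 0 sees only the regular members, while a lookup for a specific child relation additionally sees that child's transposed members and nothing belonging to its siblings.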
PathKeys
--------
......
......@@ -491,6 +491,15 @@ add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
* sortref is the SortGroupRef of the originating SortGroupClause, if any,
* or zero if not. (It should never be zero if the expression is volatile!)
*
* If rel is not NULL, it identifies a specific relation we're considering
* a path for, and indicates that child EC members for that relation can be
* considered. Otherwise child members are ignored. (Note: since child EC
* members aren't guaranteed unique, a non-NULL value means that there could
* be more than one EC that matches the expression; if so it's order-dependent
* which one you get. This is annoying but it only happens in corner cases,
* so for now we live with just reporting the first match. See also
* generate_implied_equalities_for_indexcol and match_pathkeys_to_index.)
*
* If create_it is TRUE, we'll build a new EquivalenceClass when there is no
* match. If create_it is FALSE, we just return NULL when no match.
*
......@@ -511,6 +520,7 @@ get_eclass_for_sort_expr(PlannerInfo *root,
Oid opcintype,
Oid collation,
Index sortref,
Relids rel,
bool create_it)
{
EquivalenceClass *newec;
......@@ -548,6 +558,13 @@ get_eclass_for_sort_expr(PlannerInfo *root,
{
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
/*
* Ignore child members unless they match the request.
*/
if (cur_em->em_is_child &&
!bms_equal(cur_em->em_relids, rel))
continue;
/*
* If below an outer join, don't match constants: they're not as
* constant as they look.
......@@ -1505,6 +1522,7 @@ reconsider_outer_join_clause(PlannerInfo *root, RestrictInfo *rinfo,
{
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
Assert(!cur_em->em_is_child); /* no children yet */
if (equal(outervar, cur_em->em_expr))
{
match = true;
......@@ -1626,6 +1644,7 @@ reconsider_full_join_clause(PlannerInfo *root, RestrictInfo *rinfo)
foreach(lc2, cur_ec->ec_members)
{
coal_em = (EquivalenceMember *) lfirst(lc2);
Assert(!coal_em->em_is_child); /* no children yet */
if (IsA(coal_em->em_expr, CoalesceExpr))
{
CoalesceExpr *cexpr = (CoalesceExpr *) coal_em->em_expr;
......@@ -1747,6 +1766,8 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
if (em->em_is_child)
continue; /* ignore children here */
if (equal(item1, em->em_expr))
item1member = true;
else if (equal(item2, em->em_expr))
......@@ -1800,6 +1821,9 @@ add_child_rel_equivalences(PlannerInfo *root,
{
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
if (cur_em->em_is_child)
continue; /* ignore children here */
/* Does it reference (only) parent_rel? */
if (bms_equal(cur_em->em_relids, parent_rel->relids))
{
......@@ -1908,7 +1932,16 @@ generate_implied_equalities_for_indexcol(PlannerInfo *root,
!bms_is_subset(rel->relids, cur_ec->ec_relids))
continue;
- /* Scan members, looking for a match to the indexable column */
+ /*
+ * Scan members, looking for a match to the indexable column. Note
+ * that child EC members are considered, but only when they belong to
+ * the target relation. (Unlike regular members, the same expression
+ * could be a child member of more than one EC. Therefore, it's
+ * potentially order-dependent which EC a child relation's index
+ * column gets matched to. This is annoying but it only happens in
+ * corner cases, so for now we live with just reporting the first
+ * match. See also get_eclass_for_sort_expr.)
+ */
cur_em = NULL;
foreach(lc2, cur_ec->ec_members)
{
......@@ -1933,6 +1966,9 @@ generate_implied_equalities_for_indexcol(PlannerInfo *root,
Oid eq_op;
RestrictInfo *rinfo;
if (other_em->em_is_child)
continue; /* ignore children here */
/* Make sure it'll be a join to a different rel */
if (other_em == cur_em ||
bms_overlap(other_em->em_relids, rel->relids))
......@@ -2187,8 +2223,10 @@ eclass_useful_for_merging(EquivalenceClass *eclass,
{
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);
- if (!cur_em->em_is_child &&
- !bms_overlap(cur_em->em_relids, rel->relids))
+ if (cur_em->em_is_child)
+ continue; /* ignore children here */
+ if (!bms_overlap(cur_em->em_relids, rel->relids))
return true;
}
......
......@@ -2157,7 +2157,14 @@ match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys,
if (pathkey->pk_eclass->ec_has_volatile)
return;
- /* Try to match eclass member expression(s) to index */
+ /*
+ * Try to match eclass member expression(s) to index. Note that child
+ * EC members are considered, but only when they belong to the target
+ * relation. (Unlike regular members, the same expression could be a
+ * child member of more than one EC. Therefore, the same index could
+ * be considered to match more than one pathkey list, which is OK
+ * here. See also get_eclass_for_sort_expr.)
+ */
foreach(lc2, pathkey->pk_eclass->ec_members)
{
EquivalenceMember *member = (EquivalenceMember *) lfirst(lc2);
......@@ -2580,15 +2587,6 @@ match_index_to_operand(Node *operand,
{
int indkey;
/*
* Ignore any PlaceHolderVar nodes above the operand. This is needed so
* that we can successfully use expression-index constraints pushed down
* through appendrels (UNION ALL). It's safe because a PlaceHolderVar
* appearing in a relation-scan-level expression is certainly a no-op.
*/
while (operand && IsA(operand, PlaceHolderVar))
operand = (Node *) ((PlaceHolderVar *) operand)->phexpr;
/*
* Ignore any RelabelType node above the operand. This is needed to be
* able to apply indexscanning in binary-compatible-operator cases. Note:
......
......@@ -221,6 +221,11 @@ canonicalize_pathkeys(PlannerInfo *root, List *pathkeys)
* If the PathKey is being generated from a SortGroupClause, sortref should be
* the SortGroupClause's SortGroupRef; otherwise zero.
*
* If rel is not NULL, it identifies a specific relation we're considering
* a path for, and indicates that child EC members for that relation can be
* considered. Otherwise child members are ignored. (See the comments for
* get_eclass_for_sort_expr.)
*
* create_it is TRUE if we should create any missing EquivalenceClass
* needed to represent the sort key. If it's FALSE, we return NULL if the
* sort key isn't already present in any EquivalenceClass.
......@@ -237,6 +242,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
bool reverse_sort,
bool nulls_first,
Index sortref,
Relids rel,
bool create_it,
bool canonicalize)
{
......@@ -268,7 +274,7 @@ make_pathkey_from_sortinfo(PlannerInfo *root,
/* Now find or (optionally) create a matching EquivalenceClass */
eclass = get_eclass_for_sort_expr(root, expr, opfamilies,
opcintype, collation,
- sortref, create_it);
+ sortref, rel, create_it);
/* Fail if no EC and !create_it */
if (!eclass)
......@@ -320,6 +326,7 @@ make_pathkey_from_sortop(PlannerInfo *root,
(strategy == BTGreaterStrategyNumber),
nulls_first,
sortref,
NULL,
create_it,
canonicalize);
}
......@@ -546,6 +553,7 @@ build_index_pathkeys(PlannerInfo *root,
reverse_sort,
nulls_first,
0,
index->rel->relids,
false,
true);
......@@ -636,6 +644,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
sub_member->em_datatype,
sub_eclass->ec_collation,
0,
rel->relids,
false);
/*
......@@ -680,6 +689,9 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
Oid sub_expr_coll = sub_eclass->ec_collation;
ListCell *k;
if (sub_member->em_is_child)
continue; /* ignore children here */
foreach(k, sub_tlist)
{
TargetEntry *tle = (TargetEntry *) lfirst(k);
......@@ -719,6 +731,7 @@ convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
sub_expr_type,
sub_expr_coll,
0,
rel->relids,
false);
/*
......@@ -910,6 +923,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
lefttype,
((OpExpr *) clause)->inputcollid,
0,
NULL,
true);
restrictinfo->right_ec =
get_eclass_for_sort_expr(root,
......@@ -918,6 +932,7 @@ initialize_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
righttype,
((OpExpr *) clause)->inputcollid,
0,
NULL,
true);
}
......
......@@ -152,12 +152,17 @@ static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
double limit_tuples);
static Plan *prepare_sort_from_pathkeys(PlannerInfo *root,
Plan *lefttree, List *pathkeys,
Relids relids,
const AttrNumber *reqColIdx,
bool adjust_tlist_in_place,
int *p_numsortkeys,
AttrNumber **p_sortColIdx,
Oid **p_sortOperators,
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
......@@ -706,6 +711,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path)
/* Compute sort column info, and adjust MergeAppend's tlist as needed */
(void) prepare_sort_from_pathkeys(root, plan, pathkeys,
NULL,
NULL,
true,
&node->numCols,
&node->sortColIdx,
......@@ -733,6 +740,8 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path)
/* Compute sort column info, and adjust subplan's tlist as needed */
subplan = prepare_sort_from_pathkeys(root, subplan, pathkeys,
subpath->parent->relids,
node->sortColIdx,
false,
&numsortkeys,
&sortColIdx,
......@@ -2695,11 +2704,8 @@ fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol)
ListCell *indexpr_item;
/*
- * Remove any PlaceHolderVars or binary-compatible relabeling of the
- * indexkey (this must match logic in match_index_to_operand()).
+ * Remove any binary-compatible relabeling of the indexkey
*/
- while (IsA(node, PlaceHolderVar))
- node = (Node *) ((PlaceHolderVar *) node)->phexpr;
if (IsA(node, RelabelType))
node = (Node *) ((RelabelType *) node)->arg;
......@@ -3515,55 +3521,6 @@ make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
return node;
}
/*
* add_sort_column --- utility subroutine for building sort info arrays
*
* We need this routine because the same column might be selected more than
* once as a sort key column; if so, the extra mentions are redundant.
*
* Caller is assumed to have allocated the arrays large enough for the
* max possible number of columns. Return value is the new column count.
*/
static int
add_sort_column(AttrNumber colIdx, Oid sortOp, Oid coll, bool nulls_first,
int numCols, AttrNumber *sortColIdx,
Oid *sortOperators, Oid *collations, bool *nullsFirst)
{
int i;
Assert(OidIsValid(sortOp));
for (i = 0; i < numCols; i++)
{
/*
* Note: we check sortOp because it's conceivable that "ORDER BY foo
* USING <, foo USING <<<" is not redundant, if <<< distinguishes
* values that < considers equal. We need not check nulls_first
* however because a lower-order column with the same sortop but
* opposite nulls direction is redundant.
*
* We could probably consider sort keys with the same sortop and
* different collations to be redundant too, but for the moment treat
* them as not redundant. This will be needed if we ever support
* collations with different notions of equality.
*/
if (sortColIdx[i] == colIdx &&
sortOperators[numCols] == sortOp &&
collations[numCols] == coll)
{
/* Already sorting by this col, so extra sort key is useless */
return numCols;
}
}
/* Add the column */
sortColIdx[numCols] = colIdx;
sortOperators[numCols] = sortOp;
collations[numCols] = coll;
nullsFirst[numCols] = nulls_first;
return numCols + 1;
}
/*
* prepare_sort_from_pathkeys
* Prepare to sort according to given pathkeys
......@@ -3573,8 +3530,10 @@ add_sort_column(AttrNumber colIdx, Oid sortOp, Oid coll, bool nulls_first,
* plan targetlist if needed to add resjunk sort columns.
*
* Input parameters:
- * 'lefttree' is the node which yields input tuples
+ * 'lefttree' is the plan node which yields input tuples
* 'pathkeys' is the list of pathkeys by which the result is to be sorted
* 'relids' identifies the child relation being sorted, if any
* 'reqColIdx' is NULL or an array of required sort key column numbers
* 'adjust_tlist_in_place' is TRUE if lefttree must be modified in-place
*
* We must convert the pathkey information into arrays of sort key column
......@@ -3582,6 +3541,14 @@ add_sort_column(AttrNumber colIdx, Oid sortOp, Oid coll, bool nulls_first,
* which is the representation the executor wants. These are returned into
* the output parameters *p_numsortkeys etc.
*
* When looking for matches to an EquivalenceClass's members, we will only
* consider child EC members if they match 'relids'. This protects against
* possible incorrect matches to child expressions that contain no Vars.
*
* If reqColIdx isn't NULL then it contains sort key column numbers that
* we should match. This is used when making child plans for a MergeAppend;
* it's an error if we can't match the columns.
*
* If the pathkeys include expressions that aren't simple Vars, we will
* usually need to add resjunk items to the input plan's targetlist to
* compute these expressions, since the Sort/MergeAppend node itself won't
......@@ -3596,6 +3563,8 @@ add_sort_column(AttrNumber colIdx, Oid sortOp, Oid coll, bool nulls_first,
*/
static Plan *
prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
Relids relids,
const AttrNumber *reqColIdx,
bool adjust_tlist_in_place,
int *p_numsortkeys,
AttrNumber **p_sortColIdx,
......@@ -3626,6 +3595,7 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
{
PathKey *pathkey = (PathKey *) lfirst(i);
EquivalenceClass *ec = pathkey->pk_eclass;
EquivalenceMember *em;
TargetEntry *tle = NULL;
Oid pk_datatype = InvalidOid;
Oid sortop;
......@@ -3645,16 +3615,41 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
Assert(list_length(ec->ec_members) == 1);
pk_datatype = ((EquivalenceMember *) linitial(ec->ec_members))->em_datatype;
}
else if (reqColIdx != NULL)
{
/*
* If we are given a sort column number to match, only consider
* the single TLE at that position. It's possible that there
* is no such TLE, in which case fall through and generate a
* resjunk targetentry (we assume this must have happened in the
* parent plan as well). If there is a TLE but it doesn't match
* the pathkey's EC, we do the same, which is probably the wrong
* thing but we'll leave it to caller to complain about the
* mismatch.
*/
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
pk_datatype = em->em_datatype;
}
else
tle = NULL;
}
}
else
{
/*
* Otherwise, we can sort by any non-constant expression listed in
- * the pathkey's EquivalenceClass. For now, we take the first one
- * that corresponds to an available item in the tlist. If there
- * isn't any, use the first one that is an expression in the
- * input's vars. (The non-const restriction only matters if the
- * EC is below_outer_join; but if it isn't, it won't contain
- * consts anyway, else we'd have discarded the pathkey as
+ * the pathkey's EquivalenceClass. For now, we take the first
+ * tlist item found in the EC. If there's no match, we'll generate
+ * a resjunk entry using the first EC member that is an expression
+ * in the input's vars. (The non-const restriction only matters
+ * if the EC is below_outer_join; but if it isn't, it won't
+ * contain consts anyway, else we'd have discarded the pathkey as
* redundant.)
*
* XXX if we have a choice, is there any way of figuring out which
......@@ -3663,9 +3658,36 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
* in the same equivalence class...) Not clear that we ever will
* have an interesting choice in practice, so it may not matter.
*/
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
pk_datatype = em->em_datatype;
break;
}
tle = NULL;
}
}
if (!tle)
{
/*
* No matching tlist item; look for a computable expression.
* Note that we treat Aggrefs as if they were variables; this
* is necessary when attempting to sort the output from an Agg
* node for use in a WindowFunc (since grouping_planner will
* have treated the Aggrefs as variables, too).
*/
Expr *sortexpr = NULL;
foreach(j, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
List *exprvars;
ListCell *k;
/*
* We shouldn't be trying to sort by an equivalence class that
......@@ -3675,91 +3697,56 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
if (em->em_is_const)
continue;
tle = tlist_member((Node *) em->em_expr, tlist);
if (tle)
{
pk_datatype = em->em_datatype;
break; /* found expr already in tlist */
}
/*
* We can also use it if the pathkey expression is a relabel
* of the tlist entry, or vice versa. This is needed for
* binary-compatible cases (cf. make_pathkey_from_sortinfo).
* We prefer an exact match, though, so we do the basic search
* first.
* Ignore child members unless they match the rel being sorted.
*/
tle = tlist_member_ignore_relabel((Node *) em->em_expr, tlist);
if (tle)
if (em->em_is_child &&
!bms_equal(em->em_relids, relids))
continue;
sortexpr = em->em_expr;
exprvars = pull_var_clause((Node *) sortexpr,
PVC_INCLUDE_AGGREGATES,
PVC_INCLUDE_PLACEHOLDERS);
foreach(k, exprvars)
{
if (!tlist_member_ignore_relabel(lfirst(k), tlist))
break;
}
list_free(exprvars);
if (!k)
{
pk_datatype = em->em_datatype;
break; /* found expr already in tlist */
break; /* found usable expression */
}
}
if (!j)
elog(ERROR, "could not find pathkey item to sort");
if (!tle)
/*
* Do we need to insert a Result node?
*/
if (!adjust_tlist_in_place &&
!is_projection_capable_plan(lefttree))
{
/*
* No matching tlist item; look for a computable expression.
* Note that we treat Aggrefs as if they were variables; this
* is necessary when attempting to sort the output from an Agg
* node for use in a WindowFunc (since grouping_planner will
* have treated the Aggrefs as variables, too).
*/
Expr *sortexpr = NULL;
foreach(j, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
List *exprvars;
ListCell *k;
if (em->em_is_const)
continue;
sortexpr = em->em_expr;
exprvars = pull_var_clause((Node *) sortexpr,
PVC_INCLUDE_AGGREGATES,
PVC_INCLUDE_PLACEHOLDERS);
foreach(k, exprvars)
{
if (!tlist_member_ignore_relabel(lfirst(k), tlist))
break;
}
list_free(exprvars);
if (!k)
{
pk_datatype = em->em_datatype;
break; /* found usable expression */
}
}
if (!j)
elog(ERROR, "could not find pathkey item to sort");
/*
* Do we need to insert a Result node?
*/
if (!adjust_tlist_in_place &&
!is_projection_capable_plan(lefttree))
{
/* copy needed so we don't modify input's tlist below */
tlist = copyObject(tlist);
lefttree = (Plan *) make_result(root, tlist, NULL,
lefttree);
}
/* copy needed so we don't modify input's tlist below */
tlist = copyObject(tlist);
lefttree = (Plan *) make_result(root, tlist, NULL,
lefttree);
}
/* Don't bother testing is_projection_capable_plan again */
adjust_tlist_in_place = true;
/* Don't bother testing is_projection_capable_plan again */
adjust_tlist_in_place = true;
/*
* Add resjunk entry to input's tlist
*/
tle = makeTargetEntry(sortexpr,
list_length(tlist) + 1,
NULL,
true);
tlist = lappend(tlist, tle);
lefttree->targetlist = tlist; /* just in case NIL before */
}
/*
* Add resjunk entry to input's tlist
*/
tle = makeTargetEntry(sortexpr,
list_length(tlist) + 1,
NULL,
true);
tlist = lappend(tlist, tle);
lefttree->targetlist = tlist; /* just in case NIL before */
}
/*
......@@ -3775,23 +3762,14 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
pathkey->pk_strategy, pk_datatype, pk_datatype,
pathkey->pk_opfamily);
- /*
- * The column might already be selected as a sort key, if the pathkeys
- * contain duplicate entries. (This can happen in scenarios where
- * multiple mergejoinable clauses mention the same var, for example.)
- * So enter it only once in the sort arrays.
- */
- numsortkeys = add_sort_column(tle->resno,
- sortop,
- pathkey->pk_eclass->ec_collation,
- pathkey->pk_nulls_first,
- numsortkeys,
- sortColIdx, sortOperators,
- collations, nullsFirst);
+ /* Add the column to the sort arrays */
+ sortColIdx[numsortkeys] = tle->resno;
+ sortOperators[numsortkeys] = sortop;
+ collations[numsortkeys] = ec->ec_collation;
+ nullsFirst[numsortkeys] = pathkey->pk_nulls_first;
+ numsortkeys++;
}
- Assert(numsortkeys > 0);
/* Return results */
*p_numsortkeys = numsortkeys;
*p_sortColIdx = sortColIdx;
......@@ -3802,6 +3780,57 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
return lefttree;
}
/*
* find_ec_member_for_tle
* Locate an EquivalenceClass member matching the given TLE, if any
*
* Child EC members are ignored unless they match 'relids'.
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
TargetEntry *tle,
Relids relids)
{
Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
foreach(lc, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc);
Expr *emexpr;
/*
* We shouldn't be trying to sort by an equivalence class that
* contains a constant, so no need to consider such cases any
* further.
*/
if (em->em_is_const)
continue;
/*
* Ignore child members unless they match the rel being sorted.
*/
if (em->em_is_child &&
!bms_equal(em->em_relids, relids))
continue;
/* Match if same expression (after stripping relabel) */
emexpr = em->em_expr;
while (emexpr && IsA(emexpr, RelabelType))
emexpr = ((RelabelType *) emexpr)->arg;
if (equal(emexpr, tlexpr))
return em;
}
return NULL;
}
/*
* make_sort_from_pathkeys
* Create sort plan to sort according to given pathkeys
......@@ -3823,6 +3852,8 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
/* Compute sort column info, and adjust lefttree as needed */
lefttree = prepare_sort_from_pathkeys(root, lefttree, pathkeys,
NULL,
NULL,
false,
&numsortkeys,
&sortColIdx,
......@@ -3854,9 +3885,7 @@ make_sort_from_sortclauses(PlannerInfo *root, List *sortcls, Plan *lefttree)
Oid *collations;
bool *nullsFirst;
- /*
- * We will need at most list_length(sortcls) sort columns; possibly less
- */
+ /* Convert list-ish representation to arrays wanted by executor */
numsortkeys = list_length(sortcls);
sortColIdx = (AttrNumber *) palloc(numsortkeys * sizeof(AttrNumber));
sortOperators = (Oid *) palloc(numsortkeys * sizeof(Oid));
......@@ -3864,27 +3893,18 @@ make_sort_from_sortclauses(PlannerInfo *root, List *sortcls, Plan *lefttree)
nullsFirst = (bool *) palloc(numsortkeys * sizeof(bool));
numsortkeys = 0;
foreach(l, sortcls)
{
SortGroupClause *sortcl = (SortGroupClause *) lfirst(l);
TargetEntry *tle = get_sortgroupclause_tle(sortcl, sub_tlist);
- /*
- * Check for the possibility of duplicate order-by clauses --- the
- * parser should have removed 'em, but no point in sorting
- * redundantly.
- */
- numsortkeys = add_sort_column(tle->resno, sortcl->sortop,
- exprCollation((Node *) tle->expr),
- sortcl->nulls_first,
- numsortkeys,
- sortColIdx, sortOperators,
- collations, nullsFirst);
+ sortColIdx[numsortkeys] = tle->resno;
+ sortOperators[numsortkeys] = sortcl->sortop;
+ collations[numsortkeys] = exprCollation((Node *) tle->expr);
+ nullsFirst[numsortkeys] = sortcl->nulls_first;
+ numsortkeys++;
}
- Assert(numsortkeys > 0);
return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
......@@ -3910,7 +3930,6 @@ make_sort_from_groupcols(PlannerInfo *root,
Plan *lefttree)
{
List *sub_tlist = lefttree->targetlist;
- int grpno = 0;
ListCell *l;
int numsortkeys;
AttrNumber *sortColIdx;
......@@ -3918,9 +3937,7 @@ make_sort_from_groupcols(PlannerInfo *root,
Oid *collations;
bool *nullsFirst;
- /*
- * We will need at most list_length(groupcls) sort columns; possibly less
- */
+ /* Convert list-ish representation to arrays wanted by executor */
numsortkeys = list_length(groupcls);
sortColIdx = (AttrNumber *) palloc(numsortkeys * sizeof(AttrNumber));
sortOperators = (Oid *) palloc(numsortkeys * sizeof(Oid));
......@@ -3928,28 +3945,18 @@ make_sort_from_groupcols(PlannerInfo *root,
nullsFirst = (bool *) palloc(numsortkeys * sizeof(bool));
numsortkeys = 0;
foreach(l, groupcls)
{
SortGroupClause *grpcl = (SortGroupClause *) lfirst(l);
- TargetEntry *tle = get_tle_by_resno(sub_tlist, grpColIdx[grpno]);
+ TargetEntry *tle = get_tle_by_resno(sub_tlist, grpColIdx[numsortkeys]);
- /*
- * Check for the possibility of duplicate group-by clauses --- the
- * parser should have removed 'em, but no point in sorting
- * redundantly.
- */
- numsortkeys = add_sort_column(tle->resno, grpcl->sortop,
- exprCollation((Node *) tle->expr),
- grpcl->nulls_first,
- numsortkeys,
- sortColIdx, sortOperators,
- collations, nullsFirst);
- grpno++;
+ sortColIdx[numsortkeys] = tle->resno;
+ sortOperators[numsortkeys] = grpcl->sortop;
+ collations[numsortkeys] = exprCollation((Node *) tle->expr);
+ nullsFirst[numsortkeys] = grpcl->nulls_first;
+ numsortkeys++;
}
- Assert(numsortkeys > 0);
return make_sort(root, lefttree, numsortkeys,
sortColIdx, sortOperators, collations,
nullsFirst, -1.0);
......
......@@ -99,9 +99,10 @@ preprocess_minmax_aggregates(PlannerInfo *root, List *tlist)
* We also restrict the query to reference exactly one table, since join
* conditions can't be handled reasonably. (We could perhaps handle a
* query containing cartesian-product joins, but it hardly seems worth the
- * trouble.) However, the single real table could be buried in several
- * levels of FromExpr due to subqueries. Note the single table could be
- * an inheritance parent, too.
+ * trouble.) However, the single table could be buried in several levels
+ * of FromExpr due to subqueries. Note the "single" table could be an
+ * inheritance parent, too, including the case of a UNION ALL subquery
+ * that's been flattened to an appendrel.
*/
jtnode = parse->jointree;
while (IsA(jtnode, FromExpr))
......@@ -114,7 +115,11 @@ preprocess_minmax_aggregates(PlannerInfo *root, List *tlist)
return;
rtr = (RangeTblRef *) jtnode;
rte = planner_rt_fetch(rtr->rtindex, root);
- if (rte->rtekind != RTE_RELATION)
+ if (rte->rtekind == RTE_RELATION)
+ /* ordinary relation, ok */ ;
+ else if (rte->rtekind == RTE_SUBQUERY && rte->inh)
+ /* flattened UNION ALL subquery, ok */ ;
+ else
return;
/*
......
......@@ -176,7 +176,7 @@ query_planner(PlannerInfo *root, List *tlist,
*/
build_base_rel_tlists(root, tlist);
- find_placeholders_in_query(root);
+ find_placeholders_in_jointree(root);
joinlist = deconstruct_jointree(root);
......
......@@ -866,15 +866,22 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
parse->havingQual = pullup_replace_vars(parse->havingQual, &rvcontext);
/*
- * Replace references in the translated_vars lists of appendrels, too.
- * We do it this way because we must preserve the AppendRelInfo structs.
+ * Replace references in the translated_vars lists of appendrels. When
+ * pulling up an appendrel member, we do not need PHVs in the list of the
+ * parent appendrel --- there isn't any outer join between. Elsewhere, use
+ * PHVs for safety. (This analysis could be made tighter but it seems
+ * unlikely to be worth much trouble.)
*/
foreach(lc, root->append_rel_list)
{
AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
bool save_need_phvs = rvcontext.need_phvs;
if (appinfo == containing_appendrel)
rvcontext.need_phvs = false;
appinfo->translated_vars = (List *)
pullup_replace_vars((Node *) appinfo->translated_vars, &rvcontext);
rvcontext.need_phvs = save_need_phvs;
}
/*
......@@ -1482,31 +1489,14 @@ pullup_replace_vars_callback(Var *var,
if (newnode && IsA(newnode, Var) &&
((Var *) newnode)->varlevelsup == 0)
{
- /*
- * Simple Vars normally escape being wrapped. However, in
- * wrap_non_vars mode (ie, we are dealing with an appendrel
- * member), we must ensure that each tlist entry expands to a
- * distinct expression, else we may have problems with
- * improperly placing identical entries into different
- * EquivalenceClasses. Therefore, we wrap a Var in a
- * PlaceHolderVar if it duplicates any earlier entry in the
- * tlist (ie, we've got "SELECT x, x, ..."). Since each PHV
- * is distinct, this fixes the ambiguity. We can use
- * tlist_member to detect whether there's an earlier
- * duplicate.
- */
- wrap = (rcon->wrap_non_vars &&
- tlist_member(newnode, rcon->targetlist) != tle);
+ /* Simple Vars always escape being wrapped */
+ wrap = false;
}
else if (newnode && IsA(newnode, PlaceHolderVar) &&
((PlaceHolderVar *) newnode)->phlevelsup == 0)
{
- /*
- * No need to directly wrap a PlaceHolderVar with another one,
- * either, unless we need to prevent duplication.
- */
- wrap = (rcon->wrap_non_vars &&
- tlist_member(newnode, rcon->targetlist) != tle);
+ /* No need to wrap a PlaceHolderVar with another one, either */
+ wrap = false;
}
else if (rcon->wrap_non_vars)
{
......
......@@ -104,41 +104,28 @@ find_placeholder_info(PlannerInfo *root, PlaceHolderVar *phv,
}
/*
- * find_placeholders_in_query
- * Search the query for PlaceHolderVars, and build PlaceHolderInfos
+ * find_placeholders_in_jointree
+ * Search the jointree for PlaceHolderVars, and build PlaceHolderInfos
*
- * We need to examine the jointree, but not the targetlist, because
- * build_base_rel_tlists() will already have made entries for any PHVs
- * in the targetlist.
- *
- * We also need to search for PHVs in AppendRelInfo translated_vars
- * lists. In most cases, translated_vars entries aren't directly referenced
- * elsewhere, but we need to create PlaceHolderInfo entries for them to
- * support set_rel_width() calculations for the appendrel child relations.
+ * We don't need to look at the targetlist because build_base_rel_tlists()
+ * will already have made entries for any PHVs in the tlist.
*/
void
find_placeholders_in_query(PlannerInfo *root)
find_placeholders_in_jointree(PlannerInfo *root)
{
/* We need do nothing if the query contains no PlaceHolderVars */
if (root->glob->lastPHId != 0)
{
/* Recursively search the jointree */
/* Start recursion at top of jointree */
Assert(root->parse->jointree != NULL &&
IsA(root->parse->jointree, FromExpr));
(void) find_placeholders_recurse(root, (Node *) root->parse->jointree);
/*
* Also search the append_rel_list for translated vars that are PHVs.
* Barring finding them elsewhere in the query, they do not need any
* ph_may_need bits, only to be present in the PlaceHolderInfo list.
*/
mark_placeholders_in_expr(root, (Node *) root->append_rel_list, NULL);
}
}
/*
* find_placeholders_recurse
* One recursion level of jointree search for find_placeholders_in_query.
* One recursion level of find_placeholders_in_jointree.
*
* jtnode is the current jointree node to examine.
*
@@ -572,12 +572,18 @@ typedef struct EquivalenceClass
* EquivalenceMember - one member expression of an EquivalenceClass
*
* em_is_child signifies that this element was built by transposing a member
* for an inheritance parent relation to represent the corresponding expression
* on an inheritance child. These elements are used for constructing
* inner-indexscan paths for the child relation (other types of join are
* driven from transposed joininfo-list entries) and for constructing
* MergeAppend paths for the whole inheritance tree. Note that the EC's
* ec_relids field does NOT include the child relation.
* for an appendrel parent relation to represent the corresponding expression
* for an appendrel child. These members are used for determining the
* pathkeys of scans on the child relation and for explicitly sorting the
* child when necessary to build a MergeAppend path for the whole appendrel
* tree. An em_is_child member has no impact on the properties of the EC as a
* whole; in particular the EC's ec_relids field does NOT include the child
* relation. An em_is_child member should never be marked em_is_const nor
* cause ec_has_const or ec_has_volatile to be set, either. Thus, em_is_child
* members are not really full-fledged members of the EC, but just reflections
* or doppelgangers of real members. Most operations on EquivalenceClasses
* should ignore em_is_child members, and those that don't should test
* em_relids to make sure they only consider relevant members.
*
* em_datatype is usually the same as exprType(em_expr), but can be
* different when dealing with a binary-compatible opfamily; in particular
@@ -110,6 +110,7 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
Oid opcintype,
Oid collation,
Index sortref,
Relids rel,
bool create_it);
extern void generate_base_implied_equalities(PlannerInfo *root);
extern List *generate_join_implied_equalities(PlannerInfo *root,
@@ -21,7 +21,7 @@ extern PlaceHolderVar *make_placeholder_expr(PlannerInfo *root, Expr *expr,
Relids phrels);
extern PlaceHolderInfo *find_placeholder_info(PlannerInfo *root,
PlaceHolderVar *phv, bool create_new_ph);
extern void find_placeholders_in_query(PlannerInfo *root);
extern void find_placeholders_in_jointree(PlannerInfo *root);
extern void mark_placeholder_maybe_needed(PlannerInfo *root,
PlaceHolderInfo *phinfo, Relids relids);
extern void update_placeholder_eval_levels(PlannerInfo *root,
@@ -1067,11 +1067,11 @@ drop cascades to table matest2
drop cascades to table matest3
--
-- Test merge-append for UNION ALL append relations
-- Check handling of duplicated, constant, or volatile targetlist items
--
set enable_seqscan = off;
set enable_indexscan = on;
set enable_bitmapscan = off;
-- Check handling of duplicated, constant, or volatile targetlist items
explain (costs off)
SELECT thousand, tenthous FROM tenk1
UNION ALL
@@ -1120,6 +1120,61 @@ ORDER BY thousand, tenthous;
-> Index Only Scan using tenk1_thous_tenthous on tenk1
(7 rows)
-- Check min/max aggregate optimization
explain (costs off)
SELECT min(x) FROM
(SELECT unique1 AS x FROM tenk1 a
UNION ALL
SELECT unique2 AS x FROM tenk1 b) s;
QUERY PLAN
--------------------------------------------------------------------
Result
InitPlan 1 (returns $0)
-> Limit
-> Merge Append
Sort Key: a.unique1
-> Index Only Scan using tenk1_unique1 on tenk1 a
Index Cond: (unique1 IS NOT NULL)
-> Index Only Scan using tenk1_unique2 on tenk1 b
Index Cond: (unique2 IS NOT NULL)
(9 rows)
explain (costs off)
SELECT min(y) FROM
(SELECT unique1 AS x, unique1 AS y FROM tenk1 a
UNION ALL
SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s;
QUERY PLAN
--------------------------------------------------------------------
Result
InitPlan 1 (returns $0)
-> Limit
-> Merge Append
Sort Key: a.unique1
-> Index Only Scan using tenk1_unique1 on tenk1 a
Index Cond: (unique1 IS NOT NULL)
-> Index Only Scan using tenk1_unique2 on tenk1 b
Index Cond: (unique2 IS NOT NULL)
(9 rows)
-- XXX planner doesn't recognize that index on unique2 is sufficiently sorted
explain (costs off)
SELECT x, y FROM
(SELECT thousand AS x, tenthous AS y FROM tenk1 a
UNION ALL
SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s
ORDER BY x, y;
QUERY PLAN
-------------------------------------------------------------------
Result
-> Merge Append
Sort Key: a.thousand, a.tenthous
-> Index Only Scan using tenk1_thous_tenthous on tenk1 a
-> Sort
Sort Key: b.unique2, b.unique2
-> Index Only Scan using tenk1_unique2 on tenk1 b
(7 rows)
reset enable_seqscan;
reset enable_indexscan;
reset enable_bitmapscan;
@@ -503,3 +503,17 @@ explain (costs off)
reset enable_seqscan;
reset enable_indexscan;
reset enable_bitmapscan;
-- Test constraint exclusion of UNION ALL subqueries
explain (costs off)
SELECT * FROM
(SELECT 1 AS t, * FROM tenk1 a
UNION ALL
SELECT 2 AS t, * FROM tenk1 b) c
WHERE t = 2;
QUERY PLAN
---------------------------------
Result
-> Append
-> Seq Scan on tenk1 b
(3 rows)
@@ -326,13 +326,13 @@ drop table matest0 cascade;
--
-- Test merge-append for UNION ALL append relations
-- Check handling of duplicated, constant, or volatile targetlist items
--
set enable_seqscan = off;
set enable_indexscan = on;
set enable_bitmapscan = off;
-- Check handling of duplicated, constant, or volatile targetlist items
explain (costs off)
SELECT thousand, tenthous FROM tenk1
UNION ALL
@@ -351,6 +351,27 @@ UNION ALL
SELECT thousand, random()::integer FROM tenk1
ORDER BY thousand, tenthous;
-- Check min/max aggregate optimization
explain (costs off)
SELECT min(x) FROM
(SELECT unique1 AS x FROM tenk1 a
UNION ALL
SELECT unique2 AS x FROM tenk1 b) s;
explain (costs off)
SELECT min(y) FROM
(SELECT unique1 AS x, unique1 AS y FROM tenk1 a
UNION ALL
SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s;
-- XXX planner doesn't recognize that index on unique2 is sufficiently sorted
explain (costs off)
SELECT x, y FROM
(SELECT thousand AS x, tenthous AS y FROM tenk1 a
UNION ALL
SELECT unique2 AS x, unique2 AS y FROM tenk1 b) s
ORDER BY x, y;
reset enable_seqscan;
reset enable_indexscan;
reset enable_bitmapscan;
@@ -199,3 +199,11 @@ explain (costs off)
reset enable_seqscan;
reset enable_indexscan;
reset enable_bitmapscan;
-- Test constraint exclusion of UNION ALL subqueries
explain (costs off)
SELECT * FROM
(SELECT 1 AS t, * FROM tenk1 a
UNION ALL
SELECT 2 AS t, * FROM tenk1 b) c
WHERE t = 2;
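
The revised EquivalenceMember comment in this commit says that most operations should ignore em_is_child members, and that those which do look at them should test em_relids so that only relevant members are considered. The new `Relids rel` argument to get_eclass_for_sort_expr suggests the same idea. The following is a minimal standalone sketch of that filtering rule, not the planner's actual code: `RelidMask`, `SketchMember`, `SketchClass`, and `sketch_find_member` are hypothetical stand-ins, and a plain bitmask replaces PostgreSQL's Relids sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Relids: one bit per relation index. */
typedef uint32_t RelidMask;

/* Simplified stand-in for an EquivalenceMember. */
typedef struct SketchMember
{
	int			expr_id;		/* stands in for em_expr */
	RelidMask	em_relids;		/* relations the expression references */
	bool		em_is_child;	/* derived for an appendrel child? */
} SketchMember;

/* Simplified stand-in for an EquivalenceClass. */
typedef struct SketchClass
{
	const SketchMember *members;
	size_t		nmembers;
} SketchClass;

/*
 * Look up a member by expression.
 *
 * Child members are skipped unless their em_relids exactly matches "rel".
 * Passing rel == 0 therefore ignores every child member, which is what
 * most callers are assumed to want.
 */
static const SketchMember *
sketch_find_member(const SketchClass *ec, int expr_id, RelidMask rel)
{
	assert(ec != NULL);

	for (size_t i = 0; i < ec->nmembers; i++)
	{
		const SketchMember *em = &ec->members[i];

		/* Ignore child members that belong to some other relation */
		if (em->em_is_child && em->em_relids != rel)
			continue;

		if (em->expr_id == expr_id)
			return em;
	}
	return NULL;
}
```

With this rule, two children of the same appendrel can carry identical expressions in one class without confusing lookups, because each lookup only sees the child members for the relation it asked about.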