Commit cd2a2ce9 authored by Tom Lane

Change have_join_order_restriction() so that we do not force a clauseless join
if either of the input relations can legally be joined to any other rels using
join clauses.  This avoids uselessly (and expensively) considering a lot of
really stupid join paths when there is a join restriction with a large
footprint, that is, lots of relations inside its LHS or RHS.  My patch of
15-Feb-2007 had been causing the code to consider joining *every* combination
of rels inside such a group, which is exponentially bad :-(.  With this
behavior, clauseless bushy joins will be done if necessary, but they'll be
put off as long as possible.  Per report from Jakub Ouhrabka.

Backpatch to 8.2.  We might someday want to backpatch to 8.1 as well, but 8.1
does not have the problem for OUTER JOIN nests, only for IN-clauses, so it's
not clear anyone's very likely to hit it in practice; and the current patch
doesn't apply cleanly to 8.1.
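
As a rough illustration of the rule described above (not the planner code itself), the sketch below models relation sets as bitmasks. The names ToyRelids, toy_has_joinclause and toy_force_clauseless_join are hypothetical, and the sketch assumes a single restriction group and skips the legality check that the real has_legal_joinclause() performs through join_is_legal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Relids: one bit per base relation. */
typedef unsigned int ToyRelids;

/*
 * True if some join clause links "rel" to a relation outside it.
 * Each clause is represented only by the set of relations it mentions.
 * Assumption: the toy does not test whether the resulting join is legal.
 */
bool
toy_has_joinclause(ToyRelids rel, const ToyRelids *clauses, size_t nclauses)
{
    for (size_t i = 0; i < nclauses; i++)
    {
        bool touches_rel = (clauses[i] & rel) != 0;
        bool touches_other = (clauses[i] & ~rel) != 0;

        if (touches_rel && touches_other)
            return true;
    }
    return false;
}

/*
 * Should the search force a clauseless join of rel1 and rel2?
 * "group" is one side (LHS or RHS) of a join order restriction.
 */
bool
toy_force_clauseless_join(ToyRelids rel1, ToyRelids rel2, ToyRelids group,
                          const ToyRelids *clauses, size_t nclauses)
{
    bool result;

    /* We should never try to join two overlapping sets of rels. */
    assert((rel1 & rel2) == 0);

    /* Both inputs lie inside the same restricted group. */
    result = (rel1 & group) != 0 && (rel2 & group) != 0;

    /* New behavior: back off if either input has a clause-driven option. */
    if (result &&
        (toy_has_joinclause(rel1, clauses, nclauses) ||
         toy_has_joinclause(rel2, clauses, nclauses)))
        result = false;

    return result;
}
```

With a group of three relations and one clause between the first two, a clauseless pairing of the first and third is no longer forced, because the first relation still has a clause-driven join available.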
parent 462227dc
@@ -174,7 +174,7 @@ than applying a sort to the cheapest other path).
 If the query contains one-sided outer joins (LEFT or RIGHT joins), or
 "IN (sub-select)" WHERE clauses that were converted to joins, then some of
 the possible join orders may be illegal. These are excluded by having
-make_join_rel consult side lists of outer joins and IN joins to see
+join_is_legal consult side lists of outer joins and IN joins to see
 whether a proposed join is illegal. (The same consultation allows it
 to see which join style should be applied for a valid join, ie,
 JOIN_INNER, JOIN_LEFT, etc.)
@@ -217,7 +217,7 @@ FULL JOIN ordering is enforced by not collapsing FULL JOIN nodes when
 translating the jointree to "joinlist" representation. LEFT and RIGHT
 JOIN nodes are normally collapsed so that they participate fully in the
 join order search. To avoid generating illegal join orders, the planner
-creates an OuterJoinInfo node for each outer join, and make_join_rel
+creates an OuterJoinInfo node for each outer join, and join_is_legal
 checks this list to decide if a proposed join is legal.

 What we store in OuterJoinInfo nodes are the minimum sets of Relids
@@ -226,7 +226,7 @@ these are minimums; there's no explicit maximum, since joining other
 rels to the OJ's syntactic rels may be legal. Per identities 1 and 2,
 non-FULL joins can be freely associated into the lefthand side of an
 OJ, but in general they can't be associated into the righthand side.
-So the restriction enforced by make_join_rel is that a proposed join
+So the restriction enforced by join_is_legal is that a proposed join
 can't join a rel within or partly within an RHS boundary to one outside
 the boundary, unless the join validly implements some outer join.
 (To support use of identity 3, we have to allow cases where an apparent
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.87 2007/09/26 18:51:50 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.88 2007/10/26 18:10:50 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -26,6 +26,7 @@ static List *make_rels_by_clauseless_joins(PlannerInfo *root,
                               RelOptInfo *old_rel,
                               ListCell *other_rels);
 static bool has_join_restriction(PlannerInfo *root, RelOptInfo *rel);
+static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);


 /*
@@ -223,11 +224,11 @@ join_search_one_level(PlannerInfo *root, int level, List **joinrels)
          *    y IN (SELECT ... FROM t4,t5 WHERE ...)
          *
          * We will flatten this query to a 5-way join problem, but there are
-         * no 4-way joins that make_join_rel() will consider legal. We have
+         * no 4-way joins that join_is_legal() will consider legal. We have
          * to accept failure at level 4 and go on to discover a workable
          * bushy plan at level 5.
          *
-         * However, if there are no such clauses then make_join_rel() should
+         * However, if there are no such clauses then join_is_legal() should
          * never fail, and so the following sanity check is useful.
          *----------
          */
@@ -326,32 +327,29 @@ make_rels_by_clauseless_joins(PlannerInfo *root,


 /*
- * make_join_rel
- *    Find or create a join RelOptInfo that represents the join of
- *    the two given rels, and add to it path information for paths
- *    created with the two rels as outer and inner rel.
- *    (The join rel may already contain paths generated from other
- *    pairs of rels that add up to the same set of base rels.)
+ * join_is_legal
+ *    Determine whether a proposed join is legal given the query's
+ *    join order constraints; and if it is, determine the join type.
  *
- * NB: will return NULL if attempted join is not valid. This can happen
- * when working with outer joins, or with IN clauses that have been turned
- * into joins.
+ * Caller must supply not only the two rels, but the union of their relids.
+ * (We could simplify the API by computing joinrelids locally, but this
+ * would be redundant work in the normal path through make_join_rel.)
+ *
+ * On success, *jointype_p is set to the required join type.
  */
-RelOptInfo *
-make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+static bool
+join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
+              Relids joinrelids, JoinType *jointype_p)
 {
-    Relids      joinrelids;
     JoinType    jointype;
     bool        is_valid_inner;
-    RelOptInfo *joinrel;
-    List       *restrictlist;
     ListCell   *l;

-    /* We should never try to join two overlapping sets of rels. */
-    Assert(!bms_overlap(rel1->relids, rel2->relids));
-
-    /* Construct Relids set that identifies the joinrel. */
-    joinrelids = bms_union(rel1->relids, rel2->relids);
+    /*
+     * Ensure *jointype_p is set on failure return. This is just to
+     * suppress uninitialized-variable warnings from overly anal compilers.
+     */
+    *jointype_p = JOIN_INNER;

     /*
      * If we have any outer joins, the proposed join might be illegal; and in
@@ -400,22 +398,14 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
             bms_is_subset(ojinfo->min_righthand, rel2->relids))
         {
             if (jointype != JOIN_INNER)
-            {
-                /* invalid join path */
-                bms_free(joinrelids);
-                return NULL;
-            }
+                return false;       /* invalid join path */
             jointype = ojinfo->is_full_join ? JOIN_FULL : JOIN_LEFT;
         }
         else if (bms_is_subset(ojinfo->min_lefthand, rel2->relids) &&
                  bms_is_subset(ojinfo->min_righthand, rel1->relids))
         {
             if (jointype != JOIN_INNER)
-            {
-                /* invalid join path */
-                bms_free(joinrelids);
-                return NULL;
-            }
+                return false;       /* invalid join path */
             jointype = ojinfo->is_full_join ? JOIN_FULL : JOIN_RIGHT;
         }
         else
@@ -458,11 +448,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)

     /* Fail if violated some OJ's RHS and didn't match to another OJ */
     if (jointype == JOIN_INNER && !is_valid_inner)
-    {
-        /* invalid join path */
-        bms_free(joinrelids);
-        return NULL;
-    }
+        return false;           /* invalid join path */

     /*
      * Similarly, if we are implementing IN clauses as joins, check for
@@ -494,10 +480,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
          * subselect.
          */
         if (!bms_is_subset(ininfo->righthand, joinrelids))
-        {
-            bms_free(joinrelids);
-            return NULL;
-        }
+            return false;

         /*
          * At this point we are considering a join of the IN's RHS to some
@@ -525,10 +508,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
          * that needs to trigger here.
          */
         if (jointype != JOIN_INNER)
-        {
-            bms_free(joinrelids);
-            return NULL;
-        }
+            return false;
         if (bms_is_subset(ininfo->lefthand, rel1->relids) &&
             bms_equal(ininfo->righthand, rel2->relids))
             jointype = JOIN_IN;
@@ -540,12 +520,48 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
         else if (bms_equal(ininfo->righthand, rel2->relids))
             jointype = JOIN_UNIQUE_INNER;
         else
+            return false;       /* invalid join path */
+    }
+
+    /* Join is valid */
+    *jointype_p = jointype;
+    return true;
+}
+
+
+/*
+ * make_join_rel
+ *    Find or create a join RelOptInfo that represents the join of
+ *    the two given rels, and add to it path information for paths
+ *    created with the two rels as outer and inner rel.
+ *    (The join rel may already contain paths generated from other
+ *    pairs of rels that add up to the same set of base rels.)
+ *
+ * NB: will return NULL if attempted join is not valid. This can happen
+ * when working with outer joins, or with IN clauses that have been turned
+ * into joins.
+ */
+RelOptInfo *
+make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
+{
+    Relids      joinrelids;
+    JoinType    jointype;
+    RelOptInfo *joinrel;
+    List       *restrictlist;
+
+    /* We should never try to join two overlapping sets of rels. */
+    Assert(!bms_overlap(rel1->relids, rel2->relids));
+
+    /* Construct Relids set that identifies the joinrel. */
+    joinrelids = bms_union(rel1->relids, rel2->relids);
+
+    /* Check validity and determine join type. */
+    if (!join_is_legal(root, rel1, rel2, joinrelids, &jointype))
     {
         /* invalid join path */
         bms_free(joinrelids);
         return NULL;
     }
-    }

     /*
      * Find or build the join RelOptInfo, and compute the restrictlist that
@@ -646,6 +662,7 @@ bool
 have_join_order_restriction(PlannerInfo *root,
                             RelOptInfo *rel1, RelOptInfo *rel2)
 {
+    bool        result = false;
     ListCell   *l;

     /*
@@ -667,10 +684,16 @@ have_join_order_restriction(PlannerInfo *root,
         /* Can we perform the OJ with these rels? */
         if (bms_is_subset(ojinfo->min_lefthand, rel1->relids) &&
             bms_is_subset(ojinfo->min_righthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }
         if (bms_is_subset(ojinfo->min_lefthand, rel2->relids) &&
             bms_is_subset(ojinfo->min_righthand, rel1->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }

         /*
          * Might we need to join these rels to complete the RHS? We have
@@ -679,12 +702,18 @@ have_join_order_restriction(PlannerInfo *root,
          */
         if (bms_overlap(ojinfo->min_righthand, rel1->relids) &&
             bms_overlap(ojinfo->min_righthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }

         /* Likewise for the LHS. */
         if (bms_overlap(ojinfo->min_lefthand, rel1->relids) &&
             bms_overlap(ojinfo->min_lefthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }
     }

     /*
@@ -698,10 +727,16 @@ have_join_order_restriction(PlannerInfo *root,
         /* Can we perform the IN with these rels? */
         if (bms_is_subset(ininfo->lefthand, rel1->relids) &&
             bms_is_subset(ininfo->righthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }
         if (bms_is_subset(ininfo->lefthand, rel2->relids) &&
             bms_is_subset(ininfo->righthand, rel1->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }

         /*
          * Might we need to join these rels to complete the RHS? It's
@@ -711,15 +746,37 @@ have_join_order_restriction(PlannerInfo *root,
          */
         if (bms_overlap(ininfo->righthand, rel1->relids) &&
             bms_overlap(ininfo->righthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }

         /* Likewise for the LHS. */
         if (bms_overlap(ininfo->lefthand, rel1->relids) &&
             bms_overlap(ininfo->lefthand, rel2->relids))
-            return true;
+        {
+            result = true;
+            break;
+        }
     }

-    return false;
+    /*
+     * We do not force the join to occur if either input rel can legally
+     * be joined to anything else using joinclauses. This essentially
+     * means that clauseless bushy joins are put off as long as possible.
+     * The reason is that when there is a join order restriction high up
+     * in the join tree (that is, with many rels inside the LHS or RHS),
+     * we would otherwise expend lots of effort considering very stupid
+     * join combinations within its LHS or RHS.
+     */
+    if (result)
+    {
+        if (has_legal_joinclause(root, rel1) ||
+            has_legal_joinclause(root, rel2))
+            result = false;
+    }
+
+    return result;
 }
@@ -729,7 +786,9 @@ have_join_order_restriction(PlannerInfo *root,
  *    due to being inside an outer join or an IN (sub-SELECT).
  *
  * Essentially, this tests whether have_join_order_restriction() could
- * succeed with this rel and some other one.
+ * succeed with this rel and some other one. It's OK if we sometimes
+ * say "true" incorrectly. (Therefore, we don't bother with the relatively
+ * expensive has_legal_joinclause test.)
  */
 static bool
 has_join_restriction(PlannerInfo *root, RelOptInfo *rel)
@@ -772,3 +831,61 @@ has_join_restriction(PlannerInfo *root, RelOptInfo *rel)

     return false;
 }
+
+
+/*
+ * has_legal_joinclause
+ *    Detect whether the specified relation can legally be joined
+ *    to any other rels using join clauses.
+ *
+ * We consider only joins to single other relations. This is sufficient
+ * to get a "true" result in most real queries, and an occasional erroneous
+ * "false" will only cost a bit more planning time. The reason for this
+ * limitation is that considering joins to other joins would require proving
+ * that the other join rel can legally be formed, which seems like too much
+ * trouble for something that's only a heuristic to save planning time.
+ */
+static bool
+has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel)
+{
+    Index       rti;
+
+    for (rti = 1; rti < root->simple_rel_array_size; rti++)
+    {
+        RelOptInfo *rel2 = root->simple_rel_array[rti];
+
+        /* there may be empty slots corresponding to non-baserel RTEs */
+        if (rel2 == NULL)
+            continue;
+
+        Assert(rel2->relid == rti);     /* sanity check on array */
+
+        /* ignore RTEs that are "other rels" */
+        if (rel2->reloptkind != RELOPT_BASEREL)
+            continue;
+
+        /* ignore RTEs that are already in "rel" */
+        if (bms_overlap(rel->relids, rel2->relids))
+            continue;
+
+        if (have_relevant_joinclause(root, rel, rel2))
+        {
+            Relids      joinrelids;
+            JoinType    jointype;
+
+            /* join_is_legal needs relids of the union */
+            joinrelids = bms_union(rel->relids, rel2->relids);
+
+            if (join_is_legal(root, rel, rel2, joinrelids, &jointype))
+            {
+                /* Yes, this will work */
+                bms_free(joinrelids);
+                return true;
+            }
+
+            bms_free(joinrelids);
+        }
+    }
+
+    return false;
+}
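
The refactoring in this diff splits the legality test out of make_join_rel so that a second caller can probe legality without building a join rel. A minimal sketch of that shape follows, assuming a single LEFT JOIN and bitmask relation sets. All toy_ names and types are hypothetical, the legality rule is deliberately simplified from the README wording, and the probe omits the have_relevant_joinclause() filter that the real has_legal_joinclause() applies.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int ToyRelids;     /* hypothetical: one bit per base rel */
typedef enum { TOY_JOIN_INNER, TOY_JOIN_LEFT, TOY_JOIN_RIGHT } ToyJoinType;

/* Hypothetical single outer join: min_lefthand LEFT JOIN min_righthand. */
typedef struct
{
    ToyRelids   min_lefthand;
    ToyRelids   min_righthand;
} ToyOuterJoin;

bool
toy_subset(ToyRelids a, ToyRelids b)
{
    return (a & ~b) == 0;
}

/* Predicate with an out-parameter, in the style of join_is_legal. */
bool
toy_join_is_legal(const ToyOuterJoin *oj, ToyRelids rel1, ToyRelids rel2,
                  ToyJoinType *jointype_p)
{
    ToyRelids   oj_all = oj->min_lefthand | oj->min_righthand;
    bool        in1 = (rel1 & oj->min_righthand) != 0;
    bool        in2 = (rel2 & oj->min_righthand) != 0;

    *jointype_p = TOY_JOIN_INNER;   /* defined value even on failure */

    /* Outer join already performed inside one input: nothing to enforce. */
    if (toy_subset(oj_all, rel1) || toy_subset(oj_all, rel2))
        return true;

    /* Does this join implement the outer join, in either orientation? */
    if (toy_subset(oj->min_lefthand, rel1) &&
        toy_subset(oj->min_righthand, rel2))
    {
        *jointype_p = TOY_JOIN_LEFT;
        return true;
    }
    if (toy_subset(oj->min_lefthand, rel2) &&
        toy_subset(oj->min_righthand, rel1))
    {
        *jointype_p = TOY_JOIN_RIGHT;
        return true;
    }

    /* Simplified rule: an inner join must not cross the RHS boundary. */
    return in1 == in2;
}

/* Builder: returns the joined set, or 0 if the join is not legal. */
ToyRelids
toy_make_join_rel(const ToyOuterJoin *oj, ToyRelids rel1, ToyRelids rel2,
                  ToyJoinType *jointype_p)
{
    assert((rel1 & rel2) == 0);
    if (!toy_join_is_legal(oj, rel1, rel2, jointype_p))
        return 0;
    return rel1 | rel2;
}

/* Probe: can "rel" legally join to any single base rel among nrels? */
bool
toy_can_join_any(const ToyOuterJoin *oj, ToyRelids rel, int nrels)
{
    for (int i = 0; i < nrels; i++)
    {
        ToyRelids   other = 1u << i;
        ToyJoinType jointype;

        if ((rel & other) != 0)
            continue;               /* already part of "rel" */
        if (toy_join_is_legal(oj, rel, other, &jointype))
            return true;
    }
    return false;
}
```

Like the real function, the probe only tries single base relations, so it can report "false" for a relation whose only legal partner is itself a join.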