Commit 6bef118b authored by Tom Lane's avatar Tom Lane

Restructure code that is responsible for ensuring that clauseless joins are

considered when it is necessary to do so because of a join-order restriction
(that is, an outer-join or IN-subselect construct).  The former coding was a
bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario
Weilguni's recent bug report.  His specific problem was that an IN could be
turned into a "clauseless" join due to constant-propagation removing the IN's
joinclause, and if the IN's subselect involved more than one relation and
there was more than one such IN linking to the same upper relation, then the
only valid join orders involve "bushy" plans but we would fail to consider the
specific paths needed to get there.  (See the example case added to the join
regression test.)  On examining the code I wonder if there weren't some other
problem cases too; in particular it seems that GEQO was defending against a
different set of corner cases than the main planner was.  There was also an
efficiency problem, in that when we did realize we needed a clauseless join
because of an IN, we'd consider clauseless joins against every other relation
whether this was sensible or not.  It seems a better design is to use the
outer-join and in-clause lists as a backup heuristic, just as the rule of
joining only where there are joinclauses is a heuristic: we'll join two
relations if they have a usable joinclause *or* this might be necessary to
satisfy an outer-join or IN-clause join order restriction.  I refactored the
code to have just one place considering this instead of three, and made sure
that it covered all the cases that any of them had been considering.

Backpatch as far as 8.1 (which has only the IN-clause form of the disease).
By rights 8.0 and 7.4 should have the bug too, but they accidentally fail
to fail, because the joininfo structure used in those releases preserves some
memory of there having once been a joinclause between the inner and outer
sides of an IN, and so it leads the code in the right direction anyway.
I'll be conservative and not touch them.
parent 18206509
......@@ -105,12 +105,16 @@ that are either base rels or joinrels constructed per sub-join-lists.
We can join these rels together in any order the planner sees fit.
The standard (non-GEQO) planner does this as follows:
Consider joining each RelOptInfo to each other RelOptInfo specified in its
RelOptInfo.joininfo, and generate a Path for each possible join method for
each such pair. (If we have a RelOptInfo with no join clauses, we have no
choice but to generate a clauseless Cartesian-product join; so we consider
joining that rel to each other available rel. But in the presence of join
clauses we will only consider joins that use available join clauses.)
Consider joining each RelOptInfo to each other RelOptInfo for which there
is a usable joinclause, and generate a Path for each possible join method
for each such pair. (If we have a RelOptInfo with no join clauses, we have
no choice but to generate a clauseless Cartesian-product join; so we
consider joining that rel to each other available rel. But in the presence
of join clauses we will only consider joins that use available join
clauses. Note that join-order restrictions induced by outer joins and
IN clauses are treated as if they were real join clauses, to ensure that
we find a workable join order in cases where those restrictions force a
clauseless join to be done.)
If we only had two relations in the list, we are done: we just pick
the cheapest path for the join RelOptInfo. If we had more than two, we now
......
......@@ -6,7 +6,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.84 2007/02/13 02:31:02 tgl Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.85 2007/02/16 00:14:01 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -253,52 +253,14 @@ static bool
desirable_join(PlannerInfo *root,
RelOptInfo *outer_rel, RelOptInfo *inner_rel)
{
ListCell *l;
/*
* Join if there is an applicable join clause.
* Join if there is an applicable join clause, or if there is a join
* order restriction forcing these rels to be joined.
*/
if (have_relevant_joinclause(root, outer_rel, inner_rel))
if (have_relevant_joinclause(root, outer_rel, inner_rel) ||
have_join_order_restriction(root, outer_rel, inner_rel))
return true;
/*
* Join if the rels overlap the same outer-join side and don't already
* implement the outer join. This is needed to ensure that we can find a
* valid solution in a case where an OJ contains a clauseless join.
*/
foreach(l, root->oj_info_list)
{
OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
/* ignore full joins --- other mechanisms preserve their ordering */
if (ojinfo->is_full_join)
continue;
if (bms_overlap(outer_rel->relids, ojinfo->min_righthand) &&
bms_overlap(inner_rel->relids, ojinfo->min_righthand) &&
!bms_overlap(outer_rel->relids, ojinfo->min_lefthand) &&
!bms_overlap(inner_rel->relids, ojinfo->min_lefthand))
return true;
if (bms_overlap(outer_rel->relids, ojinfo->min_lefthand) &&
bms_overlap(inner_rel->relids, ojinfo->min_lefthand) &&
!bms_overlap(outer_rel->relids, ojinfo->min_righthand) &&
!bms_overlap(inner_rel->relids, ojinfo->min_righthand))
return true;
}
/*
* Join if the rels are members of the same IN sub-select. This is needed
* to ensure that we can find a valid solution in a case where an IN
* sub-select has a clauseless join.
*/
foreach(l, root->in_info_list)
{
InClauseInfo *ininfo = (InClauseInfo *) lfirst(l);
if (bms_is_subset(outer_rel->relids, ininfo->righthand) &&
bms_is_subset(inner_rel->relids, ininfo->righthand))
return true;
}
/* Otherwise postpone the join till later. */
return false;
}
This diff is collapsed.
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.47 2007/01/20 20:45:39 tgl Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.48 2007/02/16 00:14:01 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -62,40 +62,6 @@ have_relevant_joinclause(PlannerInfo *root,
if (!result && rel1->has_eclass_joins && rel2->has_eclass_joins)
result = have_relevant_eclass_joinclause(root, rel1, rel2);
/*
* It's possible that the rels correspond to the left and right sides
* of a degenerate outer join, that is, one with no joinclause mentioning
* the non-nullable side. The above scan will then have failed to locate
* any joinclause indicating we should join, but nonetheless we must
* allow the join to occur.
*
* Note: we need no comparable check for IN-joins because we can handle
* sequential buildup of an IN-join to multiple outer-side rels; therefore
* the "last ditch" case in make_rels_by_joins() always succeeds. We
* could dispense with this hack if we were willing to try bushy plans
* in the "last ditch" case, but that seems too expensive.
*/
if (!result)
{
foreach(l, root->oj_info_list)
{
OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
/* ignore full joins --- other mechanisms handle them */
if (ojinfo->is_full_join)
continue;
if ((bms_is_subset(ojinfo->min_lefthand, rel1->relids) &&
bms_is_subset(ojinfo->min_righthand, rel2->relids)) ||
(bms_is_subset(ojinfo->min_lefthand, rel2->relids) &&
bms_is_subset(ojinfo->min_righthand, rel1->relids)))
{
result = true;
break;
}
}
}
bms_free(join_relids);
return result;
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.95 2007/01/20 20:45:40 tgl Exp $
* $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.96 2007/02/16 00:14:01 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -91,6 +91,8 @@ extern void add_paths_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
extern List *make_rels_by_joins(PlannerInfo *root, int level, List **joinrels);
extern RelOptInfo *make_join_rel(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2);
extern bool have_join_order_restriction(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2);
/*
* equivclass.c
......
......@@ -2137,6 +2137,19 @@ select count(*) from tenk1 a where unique1 in
1
(1 row)
--
-- regression test: check for failure to generate a plan with multiple
-- degenerate IN clauses
--
select count(*) from tenk1 x where
x.unique1 in (select a.f1 from int4_tbl a,float8_tbl b where a.f1=b.f1) and
x.unique1 = 0 and
x.unique1 in (select aa.f1 from int4_tbl aa,float8_tbl bb where aa.f1=bb.f1);
count
-------
1
(1 row)
--
-- Clean up
--
......
......@@ -2137,6 +2137,19 @@ select count(*) from tenk1 a where unique1 in
1
(1 row)
--
-- regression test: check for failure to generate a plan with multiple
-- degenerate IN clauses
--
select count(*) from tenk1 x where
x.unique1 in (select a.f1 from int4_tbl a,float8_tbl b where a.f1=b.f1) and
x.unique1 = 0 and
x.unique1 in (select aa.f1 from int4_tbl aa,float8_tbl bb where aa.f1=bb.f1);
count
-------
1
(1 row)
--
-- Clean up
--
......
......@@ -334,6 +334,15 @@ select count(*) from tenk1 a where unique1 in
(select unique1 from tenk1 b join tenk1 c using (unique1)
where b.unique2 = 42);
--
-- regression test: check for failure to generate a plan with multiple
-- degenerate IN clauses
--
select count(*) from tenk1 x where
x.unique1 in (select a.f1 from int4_tbl a,float8_tbl b where a.f1=b.f1) and
x.unique1 = 0 and
x.unique1 in (select aa.f1 from int4_tbl aa,float8_tbl bb where aa.f1=bb.f1);
--
-- Clean up
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment