Commit e3b98527 authored by Tom Lane's avatar Tom Lane

Teach planner how to rearrange join order for some classes of OUTER JOIN.

Per my recent proposal.  I ended up basing the implementation on the
existing mechanism for enforcing valid join orders of IN joins --- the
rules for valid outer-join orders are somewhat similar.
parent 1a6aaaa6
<!--
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.38 2005/12/09 15:51:13 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.39 2005/12/20 02:30:35 tgl Exp $
-->
<chapter Id="runtime-config">
<title>Server Configuration</title>
......@@ -2028,6 +2028,7 @@ SELECT * FROM parent WHERE key = 2400;
this many items. Smaller values reduce planning time but may
yield inferior query plans. The default is 8. It is usually
wise to keep this less than <xref linkend="guc-geqo-threshold">.
For more information see <xref linkend="explicit-joins">.
</para>
</listitem>
</varlistentry>
......@@ -2039,48 +2040,24 @@ SELECT * FROM parent WHERE key = 2400;
</indexterm>
<listitem>
<para>
The planner will rewrite explicit inner <literal>JOIN</>
constructs into lists of <literal>FROM</> items whenever a
list of no more than this many items in total would
result. Prior to <productname>PostgreSQL</> 7.4, joins
specified via the <literal>JOIN</literal> construct would
never be reordered by the query planner. The query planner has
subsequently been improved so that inner joins written in this
form can be reordered; this configuration parameter controls
the extent to which this reordering is performed.
<note>
<para>
At present, the order of outer joins specified via the
<literal>JOIN</> construct is never adjusted by the query
planner; therefore, <varname>join_collapse_limit</> has no
effect on this behavior. The planner may be improved to
reorder some classes of outer joins in a future release of
<productname>PostgreSQL</productname>.
</para>
</note>
The planner will rewrite explicit <literal>JOIN</>
constructs (except <literal>FULL JOIN</>s) into lists of
<literal>FROM</> items whenever a list of no more than this many items
would result. Smaller values reduce planning time but may
yield inferior query plans.
</para>
<para>
By default, this variable is set the same as
<varname>from_collapse_limit</varname>, which is appropriate
for most uses. Setting it to 1 prevents any reordering of
inner <literal>JOIN</>s. Thus, the explicit join order
explicit <literal>JOIN</>s. Thus, the explicit join order
specified in the query will be the actual order in which the
relations are joined. The query planner does not always choose
the optimal join order; advanced users may elect to
temporarily set this variable to 1, and then specify the join
order they desire explicitly. Another consequence of setting
this variable to 1 is that the query planner will behave more
like the <productname>PostgreSQL</productname> 7.3 query
planner, which some users might find useful for backward
compatibility reasons.
</para>
<para>
Setting this variable to a value between 1 and
<varname>from_collapse_limit</varname> might be useful to
trade off planning time against the quality of the chosen plan
(higher values produce better plans).
order they desire explicitly.
For more information see <xref linkend="explicit-joins">.
</para>
</listitem>
</varlistentry>
......
<!--
$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.54 2005/11/04 23:14:00 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.55 2005/12/20 02:30:35 tgl Exp $
-->
<chapter id="performance-tips">
......@@ -627,7 +627,7 @@ SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
</para>
<para>
When the query involves outer joins, the planner has much less freedom
When the query involves outer joins, the planner has less freedom
than it does for plain (inner) joins. For example, consider
<programlisting>
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
......@@ -637,16 +637,30 @@ SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
emitted for each row of A that has no matching row in the join of B and C.
Therefore the planner has no choice of join order here: it must join
B to C and then join A to that result. Accordingly, this query takes
less time to plan than the previous query.
less time to plan than the previous query. In other cases, the planner
may be able to determine that more than one join order is safe.
For example, given
<programlisting>
SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
</programlisting>
it is valid to join A to either B or C first. Currently, only
<literal>FULL JOIN</> completely constrains the join order. Most
practical cases involving <literal>LEFT JOIN</> or <literal>RIGHT JOIN</>
can be rearranged to some extent.
</para>
<para>
Explicit inner join syntax (<literal>INNER JOIN</>, <literal>CROSS
JOIN</>, or unadorned <literal>JOIN</>) is semantically the same as
listing the input relations in <literal>FROM</>, so it does not need to
constrain the join order. But it is possible to instruct the
<productname>PostgreSQL</productname> query planner to treat
explicit inner <literal>JOIN</>s as constraining the join order anyway.
listing the input relations in <literal>FROM</>, so it does not
constrain the join order.
</para>
<para>
Even though most kinds of <literal>JOIN</> don't completely constrain
the join order, it is possible to instruct the
<productname>PostgreSQL</productname> query planner to treat all
<literal>JOIN</> clauses as constraining the join order anyway.
For example, these three queries are logically equivalent:
<programlisting>
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
......@@ -660,7 +674,8 @@ SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</para>
<para>
To force the planner to follow the <literal>JOIN</> order for inner joins,
To force the planner to follow the join order laid out by explicit
<literal>JOIN</>s,
set the <xref linkend="guc-join-collapse-limit"> run-time parameter to 1.
(Other possible values are discussed below.)
</para>
......@@ -697,9 +712,9 @@ FROM x, y,
WHERE somethingelse;
</programlisting>
This situation might arise from use of a view that contains a join;
the view's <literal>SELECT</> rule will be inserted in place of the view reference,
yielding a query much like the above. Normally, the planner will try
to collapse the subquery into the parent, yielding
the view's <literal>SELECT</> rule will be inserted in place of the view
reference, yielding a query much like the above. Normally, the planner
will try to collapse the subquery into the parent, yielding
<programlisting>
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</programlisting>
......@@ -722,12 +737,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
linkend="guc-join-collapse-limit">
are similarly named because they do almost the same thing: one controls
when the planner will <quote>flatten out</> subselects, and the
other controls when it will flatten out explicit inner joins. Typically
other controls when it will flatten out explicit joins. Typically
you would either set <varname>join_collapse_limit</> equal to
<varname>from_collapse_limit</> (so that explicit joins and subselects
act similarly) or set <varname>join_collapse_limit</> to 1 (if you want
to control join order with explicit joins). But you might set them
differently if you are trying to fine-tune the trade off between planning
differently if you are trying to fine-tune the trade-off between planning
time and run time.
</para>
</sect1>
......
......@@ -15,7 +15,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.322 2005/11/26 22:14:56 tgl Exp $
* $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.323 2005/12/20 02:30:35 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -1277,6 +1277,22 @@ _copyRestrictInfo(RestrictInfo *from)
return newnode;
}
/*
* _copyOuterJoinInfo
*/
static OuterJoinInfo *
_copyOuterJoinInfo(OuterJoinInfo *from)
{
OuterJoinInfo *newnode = makeNode(OuterJoinInfo);
COPY_BITMAPSET_FIELD(min_lefthand);
COPY_BITMAPSET_FIELD(min_righthand);
COPY_SCALAR_FIELD(is_full_join);
COPY_SCALAR_FIELD(lhs_strict);
return newnode;
}
/*
* _copyInClauseInfo
*/
......@@ -2906,6 +2922,9 @@ copyObject(void *from)
case T_RestrictInfo:
retval = _copyRestrictInfo(from);
break;
case T_OuterJoinInfo:
retval = _copyOuterJoinInfo(from);
break;
case T_InClauseInfo:
retval = _copyInClauseInfo(from);
break;
......
......@@ -18,7 +18,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.258 2005/11/22 18:17:11 momjian Exp $
* $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.259 2005/12/20 02:30:35 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -613,6 +613,17 @@ _equalRestrictInfo(RestrictInfo *a, RestrictInfo *b)
return true;
}
static bool
_equalOuterJoinInfo(OuterJoinInfo *a, OuterJoinInfo *b)
{
COMPARE_BITMAPSET_FIELD(min_lefthand);
COMPARE_BITMAPSET_FIELD(min_righthand);
COMPARE_SCALAR_FIELD(is_full_join);
COMPARE_SCALAR_FIELD(lhs_strict);
return true;
}
static bool
_equalInClauseInfo(InClauseInfo *a, InClauseInfo *b)
{
......@@ -1954,6 +1965,9 @@ equal(void *a, void *b)
case T_RestrictInfo:
retval = _equalRestrictInfo(a, b);
break;
case T_OuterJoinInfo:
retval = _equalOuterJoinInfo(a, b);
break;
case T_InClauseInfo:
retval = _equalInClauseInfo(a, b);
break;
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.264 2005/11/28 04:35:30 tgl Exp $
* $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.265 2005/12/20 02:30:35 tgl Exp $
*
* NOTES
* Every node type that can appear in stored rules' parsetrees *must*
......@@ -1167,6 +1167,7 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
WRITE_NODE_FIELD(left_join_clauses);
WRITE_NODE_FIELD(right_join_clauses);
WRITE_NODE_FIELD(full_join_clauses);
WRITE_NODE_FIELD(oj_info_list);
WRITE_NODE_FIELD(in_info_list);
WRITE_NODE_FIELD(query_pathkeys);
WRITE_NODE_FIELD(group_pathkeys);
......@@ -1201,7 +1202,6 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
WRITE_FLOAT_FIELD(tuples, "%.0f");
WRITE_NODE_FIELD(subplan);
WRITE_NODE_FIELD(baserestrictinfo);
WRITE_BITMAPSET_FIELD(outerjoinset);
WRITE_NODE_FIELD(joininfo);
WRITE_BITMAPSET_FIELD(index_outer_relids);
WRITE_NODE_FIELD(index_inner_paths);
......@@ -1265,6 +1265,17 @@ _outInnerIndexscanInfo(StringInfo str, InnerIndexscanInfo *node)
WRITE_NODE_FIELD(best_innerpath);
}
static void
_outOuterJoinInfo(StringInfo str, OuterJoinInfo *node)
{
WRITE_NODE_TYPE("OUTERJOININFO");
WRITE_BITMAPSET_FIELD(min_lefthand);
WRITE_BITMAPSET_FIELD(min_righthand);
WRITE_BOOL_FIELD(is_full_join);
WRITE_BOOL_FIELD(lhs_strict);
}
static void
_outInClauseInfo(StringInfo str, InClauseInfo *node)
{
......@@ -2019,6 +2030,9 @@ _outNode(StringInfo str, void *obj)
case T_InnerIndexscanInfo:
_outInnerIndexscanInfo(str, obj);
break;
case T_OuterJoinInfo:
_outOuterJoinInfo(str, obj);
break;
case T_InClauseInfo:
_outInClauseInfo(str, obj);
break;
......
......@@ -40,10 +40,11 @@ is derived from the cheapest Path for the RelOptInfo that includes all the
base rels of the query.
Possible Paths for a primitive table relation include plain old sequential
scan, plus index scans for any indexes that exist on the table. A subquery
base relation just has one Path, a "SubqueryScan" path (which links to the
subplan that was built by a recursive invocation of the planner). Likewise
a function-RTE base relation has only one possible Path.
scan, plus index scans for any indexes that exist on the table, plus bitmap
index scans using one or more indexes. A subquery base relation just has
one Path, a "SubqueryScan" path (which links to the subplan that was built
by a recursive invocation of the planner). Likewise a function-RTE base
relation has only one possible Path.
Joins always occur using two RelOptInfos. One is outer, the other inner.
Outers drive lookups of values in the inner. In a nested loop, lookups of
......@@ -84,20 +85,26 @@ If we have only a single base relation in the query, we are done.
Otherwise we have to figure out how to join the base relations into a
single join relation.
2) If the query's FROM clause contains explicit JOIN clauses, we join
those pairs of relations in exactly the tree structure indicated by the
JOIN clauses. (This is absolutely necessary when dealing with outer JOINs.
For inner JOINs we have more flexibility in theory, but don't currently
exploit it in practice.) For each such join pair, we generate a Path
for each feasible join method, and select the cheapest Path. Note that
the JOIN clause structure determines the join Path structure, but it
doesn't constrain the join implementation method at each join (nestloop,
merge, hash), nor does it say which rel is considered outer or inner at
each join. We consider all these possibilities in building Paths.
2) Normally, any explicit JOIN clauses are "flattened" so that we just
have a list of relations to join. However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the
flattening process is stopped by join_collapse_limit or from_collapse_limit
restrictions. Therefore, we end up with a planning problem that contains
both lists of relations to be joined in any order, and JOIN nodes that
force a particular join order. For each un-flattened JOIN node, we join
exactly that pair of relations (after recursively planning their inputs,
if the inputs aren't single base relations). We generate a Path for each
feasible join method, and select the cheapest Path. Note that the JOIN
clause structure determines the join Path structure, but it doesn't
constrain the join implementation method at each join (nestloop, merge,
hash), nor does it say which rel is considered outer or inner at each
join. We consider all these possibilities in building Paths.
3) At the top level of the FROM clause we will have a list of relations
that are either base rels or joinrels constructed per JOIN directives.
We can join these rels together in any order the planner sees fit.
that are either base rels or joinrels constructed per un-flattened JOIN
directives. (This is also the situation, recursively, when we can flatten
sub-joins underneath an un-flattenable JOIN into a list of relations to
join.) We can join these rels together in any order the planner sees fit.
The standard (non-GEQO) planner does this as follows:
Consider joining each RelOptInfo to each other RelOptInfo specified in its
......@@ -156,12 +163,76 @@ joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and
scanning code produces these potential join combinations one at a time,
all the ways to produce the same set of joined base rels will share the
same RelOptInfo, so the paths produced from different join combinations
that produce equivalent joinrels will compete in add_path.
that produce equivalent joinrels will compete in add_path().
Once we have built the final join rel, we use either the cheapest path
for it or the cheapest path with the desired ordering (if that's cheaper
than applying a sort to the cheapest other path).
If the query contains one-sided outer joins (LEFT or RIGHT joins), or
"IN (sub-select)" WHERE clauses that were converted to joins, then some of
the possible join orders may be illegal. These are excluded by having
make_join_rel consult side lists of outer joins and IN joins to see
whether a proposed join is illegal. (The same consultation allows it
to see which join style should be applied for a valid join, ie,
JOIN_INNER, JOIN_LEFT, etc.)
Valid OUTER JOIN optimizations
------------------------------
The planner's treatment of outer join reordering is based on the following
identities:
1. (A leftjoin B on (Pab)) innerjoin C on (Pac)
= (A innerjoin C on (Pac)) leftjoin B on (Pab)
where Pac is a predicate referencing A and C, etc (in this case, clearly
Pac cannot reference B, or the transformation is nonsensical).
2. (A leftjoin B on (Pab)) leftjoin C on (Pac)
= (A leftjoin C on (Pac)) leftjoin B on (Pab)
3. (A leftjoin B on (Pab)) leftjoin C on (Pbc)
= A leftjoin (B leftjoin C on (Pbc)) on (Pab)
Identity 3 only holds if predicate Pbc must fail for all-null B rows
(that is, Pbc is strict for at least one column of B). If Pbc is not
strict, the first form might produce some rows with nonnull C columns
where the second form would make those entries null.
RIGHT JOIN is equivalent to LEFT JOIN after switching the two input
tables, so the same identities work for right joins. Only FULL JOIN
cannot be re-ordered at all.
An example of a case that does *not* work is moving an innerjoin into or
out of the nullable side of an outer join:
A leftjoin (B join C on (Pbc)) on (Pab)
!= (A leftjoin B on (Pab)) join C on (Pbc)
FULL JOIN ordering is enforced by not collapsing FULL JOIN nodes when
translating the jointree to "joinlist" representation. LEFT and RIGHT
JOIN nodes are normally collapsed so that they participate fully in the
join order search. To avoid generating illegal join orders, the planner
creates an OuterJoinInfo node for each outer join, and make_join_rel
checks this list to decide if a proposed join is legal.
What we store in OuterJoinInfo nodes are the minimum sets of Relids
required on each side of the join to form the outer join. Note that
these are minimums; there's no explicit maximum, since joining other
rels to the OJ's syntactic rels may be legal. Per identities 1 and 2,
non-FULL joins can be freely associated into the lefthand side of an
OJ, but in general they can't be associated into the righthand side.
So the restriction enforced by make_join_rel is that a proposed join
can't join across a RHS boundary (ie, join anything inside the RHS
to anything else) unless the join validly implements some outer join.
(To support use of identity 3, we have to allow cases where an apparent
violation of a lower OJ's RHS is committed while forming an upper OJ.
If this wouldn't in fact be legal, the upper OJ's minimum LHS or RHS
set must be expanded to include the whole of the lower OJ, thereby
preventing it from being formed before the lower OJ is.)
Pulling up subqueries
---------------------
......@@ -180,13 +251,13 @@ of the join tree. Each FROM-list is planned using the dynamic-programming
search method described above.
If pulling up a subquery produces a FROM-list as a direct child of another
FROM-list (with no explicit JOIN directives between), then we can merge the
two FROM-lists together. Once that's done, the subquery is an absolutely
integral part of the outer query and will not constrain the join tree
search space at all. However, that could result in unpleasant growth of
planning time, since the dynamic-programming search has runtime exponential
in the number of FROM-items considered. Therefore, we don't merge
FROM-lists if the result would have too many FROM-items in one list.
FROM-list, then we can merge the two FROM-lists together. Once that's
done, the subquery is an absolutely integral part of the outer query and
will not constrain the join tree search space at all. However, that could
result in unpleasant growth of planning time, since the dynamic-programming
search has runtime exponential in the number of FROM-items considered.
Therefore, we don't merge FROM-lists if the result would have too many
FROM-items in one list.
Optimizer Functions
......
......@@ -6,7 +6,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.78 2005/11/22 18:17:11 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/geqo/geqo_eval.c,v 1.79 2005/12/20 02:30:35 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -216,12 +216,11 @@ gimme_tree(Gene *tour, int num_gene, GeqoEvalData *evaldata)
/*
* Construct a RelOptInfo representing the join of these two input
* relations. These are always inner joins. Note that we expect
* the joinrel not to exist in root->join_rel_list yet, and so the
* paths constructed for it will only include the ones we want.
* relations. Note that we expect the joinrel not to exist in
* root->join_rel_list yet, and so the paths constructed for it
* will only include the ones we want.
*/
joinrel = make_join_rel(evaldata->root, outer_rel, inner_rel,
JOIN_INNER);
joinrel = make_join_rel(evaldata->root, outer_rel, inner_rel);
/* Can't pop stack here if join order is not valid */
if (!joinrel)
......@@ -262,6 +261,20 @@ desirable_join(PlannerInfo *root,
if (have_relevant_joinclause(outer_rel, inner_rel))
return true;
/*
* Join if the rels are members of the same outer-join RHS. This is needed
* to improve the odds that we will find a valid solution in a case where
* an OJ RHS has a clauseless join.
*/
foreach(l, root->oj_info_list)
{
OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
if (bms_is_subset(outer_rel->relids, ojinfo->min_righthand) &&
bms_is_subset(inner_rel->relids, ojinfo->min_righthand))
return true;
}
/*
* Join if the rels are members of the same IN sub-select. This is needed
* to improve the odds that we will find a valid solution in a case where
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.138 2005/11/22 18:17:12 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.139 2005/12/20 02:30:35 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -51,6 +51,7 @@ static void set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
static void set_function_pathlist(PlannerInfo *root, RelOptInfo *rel,
RangeTblEntry *rte);
static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root, List *joinlist);
static RelOptInfo *make_one_rel_by_joins(PlannerInfo *root, int levels_needed,
List *initial_rels);
static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery,
......@@ -73,7 +74,7 @@ static void recurse_push_qual(Node *setOp, Query *topquery,
* single rel that represents the join of all base rels in the query.
*/
RelOptInfo *
make_one_rel(PlannerInfo *root)
make_one_rel(PlannerInfo *root, List *joinlist)
{
RelOptInfo *rel;
......@@ -85,10 +86,7 @@ make_one_rel(PlannerInfo *root)
/*
* Generate access paths for the entire join tree.
*/
Assert(root->parse->jointree != NULL &&
IsA(root->parse->jointree, FromExpr));
rel = make_fromexpr_rel(root, root->parse->jointree);
rel = make_rel_from_joinlist(root, joinlist);
/*
* The result should join all and only the query's base rels.
......@@ -528,43 +526,65 @@ set_function_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
}
/*
* make_fromexpr_rel
* Build access paths for a FromExpr jointree node.
* make_rel_from_joinlist
* Build access paths using a "joinlist" to guide the join path search.
*
* See comments for deconstruct_jointree() for definition of the joinlist
* data structure.
*/
RelOptInfo *
make_fromexpr_rel(PlannerInfo *root, FromExpr *from)
static RelOptInfo *
make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
{
int levels_needed;
List *initial_rels = NIL;
ListCell *jt;
List *initial_rels;
ListCell *jl;
/*
* Count the number of child jointree nodes. This is the depth of the
* Count the number of child joinlist nodes. This is the depth of the
* dynamic-programming algorithm we must employ to consider all ways of
* joining the child nodes.
*/
levels_needed = list_length(from->fromlist);
levels_needed = list_length(joinlist);
if (levels_needed <= 0)
return NULL; /* nothing to do? */
/*
* Construct a list of rels corresponding to the child jointree nodes.
* Construct a list of rels corresponding to the child joinlist nodes.
* This may contain both base rels and rels constructed according to
* explicit JOIN directives.
* sub-joinlists.
*/
foreach(jt, from->fromlist)
initial_rels = NIL;
foreach(jl, joinlist)
{
Node *jtnode = (Node *) lfirst(jt);
Node *jlnode = (Node *) lfirst(jl);
RelOptInfo *thisrel;
if (IsA(jlnode, RangeTblRef))
{
int varno = ((RangeTblRef *) jlnode)->rtindex;
thisrel = find_base_rel(root, varno);
}
else if (IsA(jlnode, List))
{
/* Recurse to handle subproblem */
thisrel = make_rel_from_joinlist(root, (List *) jlnode);
}
else
{
elog(ERROR, "unrecognized joinlist node type: %d",
(int) nodeTag(jlnode));
thisrel = NULL; /* keep compiler quiet */
}
initial_rels = lappend(initial_rels,
make_jointree_rel(root, jtnode));
initial_rels = lappend(initial_rels, thisrel);
}
if (levels_needed == 1)
{
/*
* Single jointree node, so we're done.
* Single joinlist node, so we're done.
*/
return (RelOptInfo *) linitial(initial_rels);
}
......
This diff is collapsed.
This diff is collapsed.
......@@ -14,7 +14,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.90 2005/11/22 18:17:13 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.91 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -83,6 +83,7 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
{
Query *parse = root->parse;
List *constant_quals;
List *joinlist;
RelOptInfo *final_rel;
Path *cheapestpath;
Path *sortedpath;
......@@ -134,6 +135,7 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
root->left_join_clauses = NIL;
root->right_join_clauses = NIL;
root->full_join_clauses = NIL;
root->oj_info_list = NIL;
/*
* Construct RelOptInfo nodes for all base relations in query.
......@@ -144,7 +146,8 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
* Examine the targetlist and qualifications, adding entries to baserel
* targetlists for all referenced Vars. Restrict and join clauses are
* added to appropriate lists belonging to the mentioned relations. We
* also build lists of equijoined keys for pathkey construction.
* also build lists of equijoined keys for pathkey construction, and
* form a target joinlist for make_one_rel() to work from.
*
* Note: all subplan nodes will have "flat" (var-only) tlists. This
* implies that all expression evaluations are done at the root of the
......@@ -154,7 +157,7 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
*/
build_base_rel_tlists(root, tlist);
(void) distribute_quals_to_rels(root, (Node *) parse->jointree, false);
joinlist = deconstruct_jointree(root);
/*
* Use the completed lists of equijoined keys to deduce any implied but
......@@ -175,7 +178,7 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
/*
* Ready to do the primary planning.
*/
final_rel = make_one_rel(root);
final_rel = make_one_rel(root, joinlist);
if (!final_rel || !final_rel->cheapest_total_path)
elog(ERROR, "failed to construct the join relation");
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.195 2005/11/22 18:17:13 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.196 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -352,17 +352,6 @@ subquery_planner(Query *parse, double tuple_fraction,
if (root->hasOuterJoins)
reduce_outer_joins(root);
/*
* See if we can simplify the jointree; opportunities for this may come
* from having pulled up subqueries, or from flattening explicit JOIN
* syntax. We must do this after flattening JOIN alias variables, since
* eliminating explicit JOIN nodes from the jointree will cause
* get_relids_for_join() to fail. But it should happen after
* reduce_outer_joins, anyway.
*/
parse->jointree = (FromExpr *)
simplify_jointree(root, (Node *) parse->jointree);
/*
* Do the main planning. If we have an inherited target relation, that
* needs special processing, else go straight to grouping_planner.
......@@ -567,6 +556,8 @@ inheritance_planner(PlannerInfo *root, List *inheritlist)
adjust_inherited_attrs((Node *) root->in_info_list,
parentRTindex, parentOID,
childRTindex, childOID);
/* There shouldn't be any OJ info to translate, though */
Assert(subroot.oj_info_list == NIL);
/* Generate plan */
subplan = grouping_planner(&subroot, 0.0 /* retrieve all tuples */ );
......
This diff is collapsed.
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/clauses.c,v 1.203 2005/11/22 18:17:14 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/util/clauses.c,v 1.204 2005/12/20 02:30:36 tgl Exp $
*
* HISTORY
* AUTHOR DATE MAJOR EVENT
......@@ -69,6 +69,7 @@ static bool contain_subplans_walker(Node *node, void *context);
static bool contain_mutable_functions_walker(Node *node, void *context);
static bool contain_volatile_functions_walker(Node *node, void *context);
static bool contain_nonstrict_functions_walker(Node *node, void *context);
static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
static bool set_coercionform_dontcare_walker(Node *node, void *context);
static Node *eval_const_expressions_mutator(Node *node,
eval_const_expressions_context *context);
......@@ -861,6 +862,131 @@ contain_nonstrict_functions_walker(Node *node, void *context)
}
/*
* find_nonnullable_rels
* Determine which base rels are forced nonnullable by given clause.
*
* Returns the set of all Relids that are referenced in the clause in such
* a way that the clause cannot possibly return TRUE if any of these Relids
* is an all-NULL row. (It is OK to err on the side of conservatism; hence
* the analysis here is simplistic.)
*
* The semantics here are subtly different from contain_nonstrict_functions:
* that function is concerned with NULL results from arbitrary expressions,
* but here we assume that the input is a Boolean expression, and wish to
* see if NULL inputs will provably cause a FALSE-or-NULL result. We expect
* the expression to have been AND/OR flattened and converted to implicit-AND
* format.
*
* We don't use expression_tree_walker here because we don't want to
* descend through very many kinds of nodes; only the ones we can be sure
* are strict. We can descend through the top level of implicit AND'ing,
* but not through any explicit ANDs (or ORs) below that, since those are not
* strict constructs. The List case handles the top-level implicit AND list
* as well as lists of arguments to strict operators/functions.
*/
Relids
find_nonnullable_rels(Node *clause)
{
return find_nonnullable_rels_walker(clause, true);
}
static Relids
find_nonnullable_rels_walker(Node *node, bool top_level)
{
Relids result = NULL;
if (node == NULL)
return NULL;
if (IsA(node, Var))
{
Var *var = (Var *) node;
if (var->varlevelsup == 0)
result = bms_make_singleton(var->varno);
}
else if (IsA(node, List))
{
ListCell *l;
foreach(l, (List *) node)
{
result = bms_join(result,
find_nonnullable_rels_walker(lfirst(l),
top_level));
}
}
else if (IsA(node, FuncExpr))
{
FuncExpr *expr = (FuncExpr *) node;
if (func_strict(expr->funcid))
result = find_nonnullable_rels_walker((Node *) expr->args, false);
}
else if (IsA(node, OpExpr))
{
OpExpr *expr = (OpExpr *) node;
if (op_strict(expr->opno))
result = find_nonnullable_rels_walker((Node *) expr->args, false);
}
else if (IsA(node, ScalarArrayOpExpr))
{
/* Strict if it's "foo op ANY array" and op is strict */
ScalarArrayOpExpr *expr = (ScalarArrayOpExpr *) node;
if (expr->useOr && op_strict(expr->opno))
result = find_nonnullable_rels_walker((Node *) expr->args, false);
}
else if (IsA(node, BoolExpr))
{
BoolExpr *expr = (BoolExpr *) node;
/* NOT is strict, others are not */
if (expr->boolop == NOT_EXPR)
result = find_nonnullable_rels_walker((Node *) expr->args, false);
}
else if (IsA(node, RelabelType))
{
RelabelType *expr = (RelabelType *) node;
result = find_nonnullable_rels_walker((Node *) expr->arg, top_level);
}
else if (IsA(node, ConvertRowtypeExpr))
{
/* not clear this is useful, but it can't hurt */
ConvertRowtypeExpr *expr = (ConvertRowtypeExpr *) node;
result = find_nonnullable_rels_walker((Node *) expr->arg, top_level);
}
else if (IsA(node, NullTest))
{
NullTest *expr = (NullTest *) node;
/*
* IS NOT NULL can be considered strict, but only at top level; else
* we might have something like NOT (x IS NOT NULL).
*/
if (top_level && expr->nulltesttype == IS_NOT_NULL)
result = find_nonnullable_rels_walker((Node *) expr->arg, false);
}
else if (IsA(node, BooleanTest))
{
BooleanTest *expr = (BooleanTest *) node;
/*
* Appropriate boolean tests are strict at top level.
*/
if (top_level &&
(expr->booltesttype == IS_TRUE ||
expr->booltesttype == IS_FALSE ||
expr->booltesttype == IS_NOT_UNKNOWN))
result = find_nonnullable_rels_walker((Node *) expr->arg, false);
}
return result;
}
/*****************************************************************************
* Check for "pseudo-constant" clauses
*****************************************************************************/
......@@ -2794,7 +2920,8 @@ expression_tree_walker(Node *node,
case T_CaseTestExpr:
case T_SetToDefault:
case T_RangeTblRef:
/* primitive node types with no subnodes */
case T_OuterJoinInfo:
/* primitive node types with no expression subnodes */
break;
case T_Aggref:
return walker(((Aggref *) node)->target, context);
......@@ -3191,7 +3318,8 @@ expression_tree_mutator(Node *node,
case T_CaseTestExpr:
case T_SetToDefault:
case T_RangeTblRef:
/* primitive node types with no subnodes */
case T_OuterJoinInfo:
/* primitive node types with no expression subnodes */
return (Node *) copyObject(node);
case T_Aggref:
{
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.73 2005/11/22 18:17:15 momjian Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.74 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -133,7 +133,6 @@ make_reloptinfo(PlannerInfo *root, int relid, RelOptKind reloptkind)
rel->baserestrictinfo = NIL;
rel->baserestrictcost.startup = 0;
rel->baserestrictcost.per_tuple = 0;
rel->outerjoinset = NULL;
rel->joininfo = NIL;
rel->index_outer_relids = NULL;
rel->index_inner_paths = NIL;
......@@ -369,7 +368,6 @@ build_join_rel(PlannerInfo *root,
joinrel->baserestrictinfo = NIL;
joinrel->baserestrictcost.startup = 0;
joinrel->baserestrictcost.per_tuple = 0;
joinrel->outerjoinset = NULL;
joinrel->joininfo = NIL;
joinrel->index_outer_relids = NULL;
joinrel->index_inner_paths = NIL;
......
......@@ -10,7 +10,7 @@
* Written by Peter Eisentraut <peter_e@gmx.net>.
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.301 2005/11/22 18:17:26 momjian Exp $
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.302 2005/12/20 02:30:36 tgl Exp $
*
*--------------------------------------------------------------------
*/
......@@ -45,7 +45,7 @@
#include "optimizer/cost.h"
#include "optimizer/geqo.h"
#include "optimizer/paths.h"
#include "optimizer/prep.h"
#include "optimizer/planmain.h"
#include "parser/parse_expr.h"
#include "parser/parse_relation.h"
#include "postmaster/autovacuum.h"
......@@ -1010,7 +1010,7 @@ static struct config_int ConfigureNamesInt[] =
{"join_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
gettext_noop("Sets the FROM-list size beyond which JOIN constructs are not "
"flattened."),
gettext_noop("The planner will flatten explicit inner JOIN "
gettext_noop("The planner will flatten explicit JOIN "
"constructs into lists of FROM items whenever a list of no more "
"than this many items would result.")
},
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.178 2005/11/22 18:17:30 momjian Exp $
* $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.179 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -186,6 +186,7 @@ typedef enum NodeTag
T_PathKeyItem,
T_RestrictInfo,
T_InnerIndexscanInfo,
T_OuterJoinInfo,
T_InClauseInfo,
/*
......
......@@ -10,7 +10,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/nodes/primnodes.h,v 1.109 2005/10/15 02:49:45 momjian Exp $
* $PostgreSQL: pgsql/src/include/nodes/primnodes.h,v 1.110 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -845,12 +845,7 @@ typedef struct TargetEntry
* or qualified join. Also, FromExpr nodes can appear to denote an
* ordinary cross-product join ("FROM foo, bar, baz WHERE ...").
* FromExpr is like a JoinExpr of jointype JOIN_INNER, except that it
* may have any number of child nodes, not just two. Also, there is an
* implementation-defined difference: the planner is allowed to join the
* children of a FromExpr using whatever join order seems good to it.
* At present, JoinExpr nodes are always joined in exactly the order
* implied by the jointree structure (except the planner may choose to
* swap inner and outer members of a join pair).
* may have any number of child nodes, not just two.
*
* NOTE: the top level of a Query's jointree is always a FromExpr.
* Even if the jointree contains no rels, there will be a FromExpr.
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.121 2005/11/26 22:14:57 tgl Exp $
* $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.122 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -97,6 +97,8 @@ typedef struct PlannerInfo
List *full_join_clauses; /* list of RestrictInfos for full
* outer join clauses */
List *oj_info_list; /* list of OuterJoinInfos */
List *in_info_list; /* list of InClauseInfos */
List *query_pathkeys; /* desired pathkeys for query_planner(), and
......@@ -201,10 +203,6 @@ typedef struct PlannerInfo
* participates (only used for base rels)
* baserestrictcost - Estimated cost of evaluating the baserestrictinfo
* clauses at a single tuple (only used for base rels)
* outerjoinset - For a base rel: if the rel appears within the nullable
* side of an outer join, the set of all relids
* participating in the highest such outer join; else NULL.
* Otherwise, unused.
* joininfo - List of RestrictInfo nodes, containing info about each
* join clause in which this relation participates
* index_outer_relids - only used for base rels; set of outer relids
......@@ -228,10 +226,6 @@ typedef struct PlannerInfo
* We store baserestrictcost in the RelOptInfo (for base relations) because
* we know we will need it at least once (to price the sequential scan)
* and may need it multiple times to price index scans.
*
* outerjoinset is used to ensure correct placement of WHERE clauses that
* apply to outer-joined relations; we must not apply such WHERE clauses
* until after the outer join is performed.
*----------
*/
typedef enum RelOptKind
......@@ -277,7 +271,6 @@ typedef struct RelOptInfo
List *baserestrictinfo; /* RestrictInfo structures (if base
* rel) */
QualCost baserestrictcost; /* cost of evaluating the above */
Relids outerjoinset; /* set of base relids */
List *joininfo; /* RestrictInfo structures for join clauses
* involving this rel */
......@@ -830,6 +823,40 @@ typedef struct InnerIndexscanInfo
Path *best_innerpath; /* best inner indexscan, or NULL if none */
} InnerIndexscanInfo;
/*
* Outer join info.
*
* One-sided outer joins constrain the order of joining partially but not
* completely. We flatten such joins into the planner's top-level list of
* relations to join, but record information about each outer join in an
* OuterJoinInfo struct. These structs are kept in the PlannerInfo node's
* oj_info_list.
*
* min_lefthand and min_righthand are the sets of base relids that must be
* available on each side when performing the outer join. lhs_strict is
* true if the outer join's condition cannot succeed when the LHS variables
* are all NULL (this means that the outer join can commute with upper-level
* outer joins even if it appears in their RHS). We don't bother to set
* lhs_strict for FULL JOINs, however.
*
* It is not valid for either min_lefthand or min_righthand to be empty sets;
* if they were, this would break the logic that enforces join order.
*
* Note: OuterJoinInfo directly represents only LEFT JOIN and FULL JOIN;
* RIGHT JOIN is handled by switching the inputs to make it a LEFT JOIN.
* We make an OuterJoinInfo for FULL JOINs even though there is no flexibility
* of planning for them, because this simplifies make_join_rel()'s API.
*/
typedef struct OuterJoinInfo
{
NodeTag type;
Relids min_lefthand; /* base relids in minimum LHS for join */
Relids min_righthand; /* base relids in minimum RHS for join */
bool is_full_join; /* it's a FULL OUTER JOIN */
bool lhs_strict; /* joinclause is strict for some LHS rel */
} OuterJoinInfo;
/*
* IN clause info.
*
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/optimizer/clauses.h,v 1.80 2005/10/15 02:49:45 momjian Exp $
* $PostgreSQL: pgsql/src/include/optimizer/clauses.h,v 1.81 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -57,6 +57,7 @@ extern bool contain_subplans(Node *clause);
extern bool contain_mutable_functions(Node *clause);
extern bool contain_volatile_functions(Node *clause);
extern bool contain_nonstrict_functions(Node *clause);
extern Relids find_nonnullable_rels(Node *clause);
extern bool is_pseudo_constant_clause(Node *clause);
extern bool is_pseudo_constant_clause_relids(Node *clause, Relids relids);
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.89 2005/11/25 19:47:50 tgl Exp $
* $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.90 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -23,8 +23,7 @@
extern bool enable_geqo;
extern int geqo_threshold;
extern RelOptInfo *make_one_rel(PlannerInfo *root);
extern RelOptInfo *make_fromexpr_rel(PlannerInfo *root, FromExpr *from);
extern RelOptInfo *make_one_rel(PlannerInfo *root, List *joinlist);
#ifdef OPTIMIZER_DEBUG
extern void debug_print_rel(PlannerInfo *root, RelOptInfo *rel);
......@@ -88,10 +87,8 @@ extern void add_paths_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
* routines to determine which relations to join
*/
extern List *make_rels_by_joins(PlannerInfo *root, int level, List **joinrels);
extern RelOptInfo *make_jointree_rel(PlannerInfo *root, Node *jtnode);
extern RelOptInfo *make_join_rel(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2,
JoinType jointype);
RelOptInfo *rel1, RelOptInfo *rel2);
/*
* pathkeys.c
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.90 2005/10/15 02:49:45 momjian Exp $
* $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.91 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -65,10 +65,12 @@ extern bool is_projection_capable_plan(Plan *plan);
/*
* prototypes for plan/initsplan.c
*/
extern int from_collapse_limit;
extern int join_collapse_limit;
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern Relids distribute_quals_to_rels(PlannerInfo *root, Node *jtnode,
bool below_outer_join);
extern List *deconstruct_jointree(PlannerInfo *root);
extern void process_implied_equality(PlannerInfo *root,
Node *item1, Node *item2,
Oid sortop1, Oid sortop2,
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/optimizer/prep.h,v 1.52 2005/10/15 02:49:45 momjian Exp $
* $PostgreSQL: pgsql/src/include/optimizer/prep.h,v 1.53 2005/12/20 02:30:36 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -21,14 +21,10 @@
/*
* prototypes for prepjointree.c
*/
extern int from_collapse_limit;
extern int join_collapse_limit;
extern Node *pull_up_IN_clauses(PlannerInfo *root, Node *node);
extern Node *pull_up_subqueries(PlannerInfo *root, Node *jtnode,
bool below_outer_join);
extern void reduce_outer_joins(PlannerInfo *root);
extern Node *simplify_jointree(PlannerInfo *root, Node *jtnode);
extern Relids get_relids_in_jointree(Node *jtnode);
extern Relids get_relids_for_join(PlannerInfo *root, int joinrelid);
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment