Commit e2c2c2e8 authored by Tom Lane

Improve planner's handling of duplicated index column expressions.

It's potentially useful for an index to repeat the same indexable column
or expression in multiple index columns, if the columns have different
opclasses.  (If they share opclasses too, the duplicate column is pretty
useless, but nonetheless we've allowed such cases since 9.0.)  However,
the planner failed to cope with this, because createplan.c was relying on
simple equal() matching to figure out which index column each index qual
is intended for.  We do have that information available upstream in
indxpath.c, though, so the fix is to not flatten the multi-level indexquals
list when putting it into an IndexPath.  Then we can rely on the sublist
structure to identify target index columns in createplan.c.  There's a
similar issue for index ORDER BYs (the KNNGIST feature), so introduce a
multi-level-list representation for that too.  This adds a bit more
representational overhead, but we might more or less buy that back by not
having to search for matching index columns anymore in createplan.c;
likewise btcostestimate saves some cycles.

Per bug #6351 from Christian Rudolph.  Likely symptoms include the "btree
index keys must be ordered by attribute" failure shown there, as well as
"operator MMMM is not a member of opfamily NNNN".

Although this is a pre-existing problem that can be demonstrated in 9.0 and
9.1, I'm not going to back-patch it, because the API changes in the planner
seem likely to break things such as index plugins.  The corner cases where
this matters seem too narrow to justify possibly breaking things in a minor
release.
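
The core idea of the commit message can be shown with a small, self-contained sketch. It is not the planner code: `DemoQual`, `DemoGroup`, `column_by_expr_match` and `column_by_group` are hypothetical names invented for this illustration, and expressions are reduced to plain strings. The first function mimics matching a qual to an index column by expression equality alone, which always lands on the first of two duplicated columns; the second relies on one sub-list per index column, so the column is simply the position of the sub-list that holds the qual.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an index qual: expression text plus operator name. */
typedef struct DemoQual
{
    const char *expr;           /* indexed expression, e.g. "f1" */
    const char *op;             /* operator name, e.g. ">" or "~<~" */
} DemoQual;

/* Hypothetical stand-in for the sub-list of quals belonging to one index column. */
typedef struct DemoGroup
{
    const DemoQual *quals;
    size_t      nquals;
} DemoGroup;

/*
 * Flat-list style: pick the column by expression equality only.
 * With duplicated index expressions the first match always wins.
 */
int
column_by_expr_match(const char *const *index_exprs, size_t ncols,
                     const DemoQual *q)
{
    for (size_t i = 0; i < ncols; i++)
    {
        if (strcmp(index_exprs[i], q->expr) == 0)
            return (int) i;
    }
    return -1;
}

/*
 * Sub-list style: the column number is the position of the group
 * that contains the qual, so no expression matching is needed.
 */
int
column_by_group(const DemoGroup *groups, size_t ngroups, const DemoQual *q)
{
    for (size_t col = 0; col < ngroups; col++)
    {
        for (size_t j = 0; j < groups[col].nquals; j++)
        {
            if (&groups[col].quals[j] == q)
                return (int) col;
        }
    }
    return -1;
}
```

For an index shaped like `(f1, id, f1 text_pattern_ops)`, a qual on `f1` using `~<~` is intended for the third column. Expression matching alone would report the first column, whereas the grouped form reports the third.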
parent d5448c7d
......@@ -209,8 +209,10 @@ cost_seqscan(Path *path, PlannerInfo *root,
* Determines and returns the cost of scanning a relation using an index.
*
* 'index' is the index to be used
- * 'indexQuals' is the list of applicable qual clauses (implicit AND semantics)
- * 'indexOrderBys' is the list of ORDER BY operators for amcanorderbyop indexes
+ * 'indexQuals' is a list of lists of applicable qual clauses (implicit AND
+ * semantics, one sub-list per index column)
+ * 'indexOrderBys' is a list of lists of lists of ORDER BY expressions for
+ * amcanorderbyop indexes (lists per pathkey and index column)
* 'indexonly' is true if it's an index-only scan
* 'outer_rel' is the outer relation when we are considering using the index
* scan as the inside of a nestloop join (hence, some of the indexQuals
......@@ -221,8 +223,8 @@ cost_seqscan(Path *path, PlannerInfo *root,
* additional fields of the IndexPath besides startup_cost and total_cost.
* These fields are needed if the IndexPath is used in a BitmapIndexScan.
*
- * indexQuals is a list of RestrictInfo nodes, but indexOrderBys is a list of
- * bare expressions.
+ * indexQuals is a list of lists of RestrictInfo nodes, but indexOrderBys
+ * is a list of lists of lists of bare expressions.
*
* NOTE: 'indexQuals' must contain only clauses usable as index restrictions.
* Any additional quals evaluated as qpquals may reduce the number of returned
......
......@@ -412,8 +412,8 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel)
* 'index' is a usable index.
* 'clause_groups' is a list of lists of RestrictInfo nodes
* to be used as index qual conditions in the scan.
- * 'indexorderbys' is a list of bare expressions (no RestrictInfos)
- * to be used as index ordering operators in the scan.
+ * 'indexorderbys' is a list of lists of lists of bare expressions (not
+ * RestrictInfos) to be used as index ordering operators.
* 'pathkeys' describes the ordering of the path.
* 'indexscandir' is ForwardScanDirection or BackwardScanDirection
* for an ordered index, or NoMovementScanDirection for
......
......@@ -630,7 +630,7 @@ extract_actual_join_clauses(List *restrictinfo_list,
* being used in an inner indexscan need not be checked again at the join.
*
* "Redundant" means either equal() or derived from the same EquivalenceClass.
- * We have to check the latter because indxqual.c may select different derived
+ * We have to check the latter because indxpath.c may select different derived
* clauses than were selected by generate_join_implied_equalities().
*
* Note that we are *not* checking for local redundancies within the given
......
......@@ -5991,6 +5991,14 @@ genericcostestimate(PlannerInfo *root,
List *selectivityQuals;
ListCell *l;
+ /*
+ * For our purposes here, it doesn't matter which index columns the
+ * individual quals and order-by expressions go with, so flatten the
+ * lists for convenience.
+ */
+ indexQuals = flatten_clausegroups_list(indexQuals);
+ indexOrderBys = flatten_indexorderbys_list(indexOrderBys);
/*----------
* If the index is partial, AND the index predicate with the explicitly
* given indexquals to produce a more accurate idea of the index
......@@ -6022,7 +6030,7 @@ genericcostestimate(PlannerInfo *root,
if (!predicate_implied_by(oneQual, indexQuals))
predExtraQuals = list_concat(predExtraQuals, oneQual);
}
- /* list_concat avoids modifying the passed-in indexQuals list */
+ /* list_concat avoids modifying the indexQuals list */
selectivityQuals = list_concat(predExtraQuals, indexQuals);
}
else
......@@ -6250,7 +6258,7 @@ btcostestimate(PG_FUNCTION_ARGS)
bool found_saop;
bool found_is_null_op;
double num_sa_scans;
- ListCell *l;
+ ListCell *lc1;
/*
* For a btree scan, only leading '=' quals plus inequality quals for the
......@@ -6259,8 +6267,7 @@ btcostestimate(PG_FUNCTION_ARGS)
* the index scan). Additional quals can suppress visits to the heap, so
* it's OK to count them in indexSelectivity, but they should not count
* for estimating numIndexTuples. So we must examine the given indexQuals
- * to find out which ones count as boundary quals. We rely on the
- * knowledge that they are given in index column order.
+ * to find out which ones count as boundary quals.
*
* For a RowCompareExpr, we consider only the first column, just as
* rowcomparesel() does.
......@@ -6270,14 +6277,25 @@ btcostestimate(PG_FUNCTION_ARGS)
* considered to act the same as it normally does.
*/
indexBoundQuals = NIL;
- indexcol = 0;
- eqQualHere = false;
found_saop = false;
found_is_null_op = false;
num_sa_scans = 1;
- foreach(l, indexQuals)
+ /* clausegroups must correspond to index columns */
+ Assert(list_length(indexQuals) <= index->ncolumns);
+ indexcol = 0;
+ foreach(lc1, indexQuals)
{
- RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
+ List *clausegroup = (List *) lfirst(lc1);
+ ListCell *lc2;
+ eqQualHere = false;
+ foreach(lc2, clausegroup)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc2);
Expr *clause;
Node *leftop,
*rightop;
......@@ -6329,37 +6347,18 @@ btcostestimate(PG_FUNCTION_ARGS)
(int) nodeTag(clause));
continue; /* keep compiler quiet */
}
if (match_index_to_operand(leftop, indexcol, index))
{
/* clause_op is correct */
}
- else if (match_index_to_operand(rightop, indexcol, index))
- {
- /* Must flip operator to get the opfamily member */
- clause_op = get_commutator(clause_op);
- }
else
{
- /* Must be past the end of quals for indexcol, try next */
- if (!eqQualHere)
- break; /* done if no '=' qual for indexcol */
- indexcol++;
- eqQualHere = false;
- if (match_index_to_operand(leftop, indexcol, index))
- {
- /* clause_op is correct */
- }
- else if (match_index_to_operand(rightop, indexcol, index))
- {
+ Assert(match_index_to_operand(rightop, indexcol, index));
/* Must flip operator to get the opfamily member */
clause_op = get_commutator(clause_op);
}
- else
- {
- /* No quals for new indexcol, so we are done */
- break;
- }
- }
/* check for equality operator */
if (OidIsValid(clause_op))
{
......@@ -6371,7 +6370,7 @@ btcostestimate(PG_FUNCTION_ARGS)
}
else if (is_null_op)
{
- /* IS NULL is like = for purposes of selectivity determination */
+ /* IS NULL is like = for selectivity determination */
eqQualHere = true;
}
/* count up number of SA scans induced by indexBoundQuals only */
......@@ -6386,6 +6385,13 @@ btcostestimate(PG_FUNCTION_ARGS)
indexBoundQuals = lappend(indexBoundQuals, rinfo);
}
+ /* Done with this indexcol, continue to next only if it had = qual */
+ if (!eqQualHere)
+ break;
+ indexcol++;
+ }
/*
* If index is unique and we found an '=' clause for each column, we can
* just assume numIndexTuples = 1 and skip the expensive
......@@ -6393,7 +6399,7 @@ btcostestimate(PG_FUNCTION_ARGS)
* NullTest invalidates that theory, even though it sets eqQualHere.
*/
if (index->unique &&
- indexcol == index->ncolumns - 1 &&
+ indexcol == index->ncolumns &&
eqQualHere &&
!found_saop &&
!found_is_null_op)
......@@ -6924,6 +6930,14 @@ gincostestimate(PG_FUNCTION_ARGS)
Relation indexRel;
GinStatsData ginStats;
+ /*
+ * For our purposes here, it doesn't matter which index columns the
+ * individual quals and order-by expressions go with, so flatten the
+ * lists for convenience.
+ */
+ indexQuals = flatten_clausegroups_list(indexQuals);
+ indexOrderBys = flatten_indexorderbys_list(indexOrderBys);
/*
* Obtain statistic information from the meta page
*/
......@@ -6980,7 +6994,7 @@ gincostestimate(PG_FUNCTION_ARGS)
if (!predicate_implied_by(oneQual, indexQuals))
predExtraQuals = list_concat(predExtraQuals, oneQual);
}
- /* list_concat avoids modifying the passed-in indexQuals list */
+ /* list_concat avoids modifying the indexQuals list */
selectivityQuals = list_concat(predExtraQuals, indexQuals);
}
else
......
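
The reworked boundary-qual loop in btcostestimate can be summarized with a reduced sketch. This is an illustration under simplifying assumptions, not the actual function: `DemoColumnQuals` and `count_boundary_quals` are hypothetical names, and each per-column sub-list is collapsed to a qual count plus a flag saying whether it contains an '=' (or IS NULL) qual. The walk consumes sub-lists in index column order and stops after the first column that lacks an equality qual, which is why the unique-index test afterwards compares indexcol against ncolumns rather than ncolumns - 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one index column's sub-list of quals. */
typedef struct DemoColumnQuals
{
    int         nquals;         /* number of quals in this column's sub-list */
    bool        has_eq;         /* true if any of them acts like '=' */
} DemoColumnQuals;

/*
 * Count the quals that would be treated as boundary quals.
 * On return, *indexcol_out is the number of leading columns that had
 * an equality qual, mirroring how indexcol advances in the new loop.
 */
int
count_boundary_quals(const DemoColumnQuals *groups, int ngroups,
                     int ncolumns, int *indexcol_out)
{
    int         nbound = 0;
    int         indexcol = 0;

    /* sub-lists must correspond to index columns */
    assert(ngroups <= ncolumns);
    assert(indexcol_out != NULL);

    for (int i = 0; i < ngroups; i++)
    {
        /* every qual in the current column's sub-list counts */
        nbound += groups[i].nquals;

        /* continue to the next column only if this one had an '=' qual */
        if (!groups[i].has_eq)
            break;
        indexcol++;
    }

    *indexcol_out = indexcol;
    return nbound;
}
```

With sub-lists summarized as `{1, true}, {2, false}, {1, true}`, the sketch counts three boundary quals and leaves the column counter at 1, since the third column is never reached.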
......@@ -659,18 +659,25 @@ typedef struct Path
* AND semantics across the list. Each clause is a RestrictInfo node from
* the query's WHERE or JOIN conditions.
*
- * 'indexquals' has the same structure as 'indexclauses', but it contains
- * the actual indexqual conditions that can be used with the index.
- * In simple cases this is identical to 'indexclauses', but when special
- * indexable operators appear in 'indexclauses', they are replaced by the
- * derived indexscannable conditions in 'indexquals'.
- *
- * 'indexorderbys', if not NIL, is a list of ORDER BY expressions that have
- * been found to be usable as ordering operators for an amcanorderbyop index.
- * Note that these are not RestrictInfos, just bare expressions, since they
- * generally won't yield booleans. The list will match the path's pathkeys.
- * Also, unlike the case for quals, it's guaranteed that each expression has
- * the index key on the left side of the operator.
+ * 'indexquals' is a list of sub-lists of the actual index qual conditions
+ * that can be used with the index. There is one possibly-empty sub-list
+ * for each index column (but empty sub-lists for trailing columns can be
+ * omitted). The qual conditions are RestrictInfos, and in simple cases
+ * are the same RestrictInfos that appear in the flat indexclauses list.
+ * But when special indexable operators appear in 'indexclauses', they are
+ * replaced by their derived indexscannable conditions in 'indexquals'.
+ * Note that an entirely empty indexquals list denotes a full-index scan.
+ *
+ * 'indexorderbys', if not NIL, is a list of lists of lists of ORDER BY
+ * expressions that have been found to be usable as ordering operators for an
+ * amcanorderbyop index. These are not RestrictInfos, just bare expressions,
+ * since they generally won't yield booleans. Also, unlike the case for
+ * quals, it's guaranteed that each expression has the index key on the left
+ * side of the operator. The top list has one entry per pathkey in the
+ * path's pathkeys, and the sub-lists have one sub-sublist per index column.
+ * This representation is a bit of overkill, since there will be only one
+ * actual expression per pathkey, but it's convenient because each sub-list
+ * has the same structure as the indexquals list.
*
* 'isjoininner' is TRUE if the path is a nestloop inner scan (that is,
* some of the index conditions are join rather than restriction clauses).
......
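
The three-level indexorderbys layout described in the comment above (pathkey, then index column, then expression) can be pictured with a small sketch that flattens it back to a single list, as the cost estimators do when column identity does not matter. The types and the function `demo_flatten_orderbys` are hypothetical and use plain arrays of strings; the real flatten_indexorderbys_list operates on planner List nodes and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one index column's sub-sublist of expressions. */
typedef struct DemoOrderByColumn
{
    const char *const *exprs;
    size_t      nexprs;
} DemoOrderByColumn;

/* Hypothetical stand-in for one pathkey's sub-list (one entry per index column). */
typedef struct DemoOrderByPathkey
{
    const DemoOrderByColumn *cols;
    size_t      ncols;
} DemoOrderByPathkey;

/*
 * Copy every expression into 'out' in pathkey order, then column order.
 * Returns the total number of expressions found; at most 'cap' are stored.
 */
size_t
demo_flatten_orderbys(const DemoOrderByPathkey *keys, size_t nkeys,
                      const char **out, size_t cap)
{
    size_t      n = 0;

    assert(out != NULL || cap == 0);

    for (size_t k = 0; k < nkeys; k++)
    {
        for (size_t c = 0; c < keys[k].ncols; c++)
        {
            for (size_t e = 0; e < keys[k].cols[c].nexprs; e++)
            {
                if (n < cap)
                    out[n] = keys[k].cols[c].exprs[e];
                n++;
            }
        }
    }
    return n;
}
```

Since each pathkey is expected to carry only one actual expression, the flattened result normally has one entry per pathkey; the nesting only records which index column that expression belongs to.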
......@@ -61,6 +61,12 @@ extern List *expand_indexqual_conditions(IndexOptInfo *index,
List *clausegroups);
extern void check_partial_indexes(PlannerInfo *root, RelOptInfo *rel);
extern List *flatten_clausegroups_list(List *clausegroups);
+ extern List *flatten_indexorderbys_list(List *indexorderbys);
+ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
+ IndexOptInfo *index,
+ int indexcol,
+ List **indexcolnos,
+ bool *var_on_left_p);
/*
* orindxpath.c
......
......@@ -2459,3 +2459,27 @@ RESET enable_seqscan;
RESET enable_indexscan;
RESET enable_bitmapscan;
DROP TABLE onek_with_null;
+ --
+ -- Check behavior with duplicate index column contents
+ --
+ CREATE TABLE dupindexcols AS
+ SELECT unique1 as id, stringu2::text as f1 FROM tenk1;
+ CREATE INDEX dupindexcols_i ON dupindexcols (f1, id, f1 text_pattern_ops);
+ VACUUM ANALYZE dupindexcols;
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM dupindexcols
+ WHERE f1 > 'LX' and id < 1000 and f1 ~<~ 'YX';
+ QUERY PLAN
+ ---------------------------------------------------------------------------------
+ Aggregate
+ -> Index Only Scan using dupindexcols_i on dupindexcols
+ Index Cond: ((f1 > 'LX'::text) AND (id < 1000) AND (f1 ~<~ 'YX'::text))
+ (3 rows)
+ SELECT count(*) FROM dupindexcols
+ WHERE f1 > 'LX' and id < 1000 and f1 ~<~ 'YX';
+ count
+ -------
+ 500
+ (1 row)
......@@ -39,6 +39,7 @@ SELECT relname, relhasindex
default_tbl | f
defaultexpr_tbl | f
dept | f
+ dupindexcols | t
e_star | f
emp | f
equipment_r | f
......@@ -164,7 +165,7 @@ SELECT relname, relhasindex
timetz_tbl | f
tinterval_tbl | f
varchar_tbl | f
- (153 rows)
+ (154 rows)
--
-- another sanity check: every system catalog that has OIDs should have
......
......@@ -610,6 +610,7 @@ SELECT user_relns() AS user_relns
default_tbl
defaultexpr_tbl
dept
+ dupindexcols
e_star
emp
equipment_r
......@@ -685,7 +686,7 @@ SELECT user_relns() AS user_relns
toyemp
varchar_tbl
xacttest
- (107 rows)
+ (108 rows)
SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer')));
name
......
......@@ -804,3 +804,18 @@ RESET enable_indexscan;
RESET enable_bitmapscan;
DROP TABLE onek_with_null;
+ --
+ -- Check behavior with duplicate index column contents
+ --
+ CREATE TABLE dupindexcols AS
+ SELECT unique1 as id, stringu2::text as f1 FROM tenk1;
+ CREATE INDEX dupindexcols_i ON dupindexcols (f1, id, f1 text_pattern_ops);
+ VACUUM ANALYZE dupindexcols;
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM dupindexcols
+ WHERE f1 > 'LX' and id < 1000 and f1 ~<~ 'YX';
+ SELECT count(*) FROM dupindexcols
+ WHERE f1 > 'LX' and id < 1000 and f1 ~<~ 'YX';