    Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans. · 3f59be83
    Tom Lane authored
    When the inner side of a nestloop SEMI or ANTI join is an indexscan that
    uses all the join clauses as indexquals, it can be presumed that both
    matched and unmatched outer rows will be processed very quickly: for
    matched rows, we'll stop after fetching one row from the indexscan, while
    for unmatched rows we'll have an indexscan that finds no matching index
    entries, which should also be quick.  The planner already knew about this,
    but it was nonetheless charging for at least one full run of the inner
    indexscan, as a consequence of concerns about the behavior of materialized
    inner scans --- but those concerns don't apply in the fast case.  If the
    inner side has low cardinality (few distinct values, so many matching rows
    per outer row), this could make an indexscan plan look far more expensive
    than it actually is.  To fix,
    rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
    don't add the inner scan cost until we've inspected the indexquals, and
    then we can add either the full-run cost or just the first tuple's cost as
    appropriate.
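
As a rough illustration of the "full-run cost or just the first tuple's cost" choice described above, here is a minimal sketch in C. It is not the code in initial_cost_nestloop/final_cost_nestloop: the struct `SketchPath`, the function `sketch_inner_charge`, and the flag name `indexquals_cover_joinclauses` are hypothetical, and the sketch ignores rescan costs and the matched/unmatched row fractions that the real planner accounts for.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a path's cost fields. */
typedef struct {
    double startup_cost;  /* cost before the first tuple can be returned */
    double total_cost;    /* cost to return all tuples */
    double rows;          /* estimated number of tuples returned */
} SketchPath;

/*
 * Cost charged for one probe of the inner side of a SEMI/ANTI nestloop.
 * In the fast case (all join clauses are used as indexquals) we assume
 * the scan stops after the first tuple, so only the startup cost plus one
 * tuple's share of the run cost is charged.  Otherwise the full run is
 * charged.  This is an assumption-laden sketch, not the planner's formula.
 */
double sketch_inner_charge(const SketchPath *inner,
                           bool indexquals_cover_joinclauses)
{
    double run_cost = inner->total_cost - inner->startup_cost;
    double rows = inner->rows > 1.0 ? inner->rows : 1.0;  /* avoid divide by zero */

    if (indexquals_cover_joinclauses)
        return inner->startup_cost + run_cost / rows;      /* first tuple only */

    return inner->total_cost;                              /* full run */
}
```

With an inner path of startup cost 0.5, total cost 100.5, and 100 rows, the sketch charges 1.5 in the fast case and 100.5 otherwise, which is the kind of gap that made the indexscan plan look too expensive.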
    
    Experimentation with this fix uncovered another problem: add_path and
    friends were coded to disregard cheap startup cost when considering
    parameterized paths.  That's usually okay (and desirable, because it thins
    the path herd faster); but in this fast case for SEMI/ANTI joins, it could
    result in throwing away the desired plain indexscan path in favor of a
    bitmap scan path before we ever get to the join costing logic.  In the
    many-matching-rows cases of interest here, a bitmap scan will do a lot more
    work than required, so this is a problem.  To fix, add a per-relation flag
    consider_param_startup that works like the existing consider_startup flag,
    but applies to parameterized paths, and set it for relations that are the
    inside of a SEMI or ANTI join.
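
The effect of the new flag on path pruning can be sketched as follows. This is a simplified illustration, not add_path itself: the real comparison is fuzzy and also weighs pathkeys, row counts, and parameterization. The types `SketchRel` and `SketchCandidate` and the function `sketch_dominates` are hypothetical; only the flag names consider_startup and consider_param_startup come from the text above.

```c
#include <stdbool.h>

/* Hypothetical per-relation flags, named after the ones in the text. */
typedef struct {
    bool consider_startup;        /* keep cheap-startup unparameterized paths */
    bool consider_param_startup;  /* same, but for parameterized paths */
} SketchRel;

/* Hypothetical, simplified candidate path. */
typedef struct {
    double startup_cost;
    double total_cost;
    bool   parameterized;
} SketchCandidate;

/* Does startup cost count when deciding whether to keep path p? */
static bool sketch_startup_matters(const SketchRel *rel,
                                   const SketchCandidate *p)
{
    return p->parameterized ? rel->consider_param_startup
                            : rel->consider_startup;
}

/*
 * Returns true if path a makes path b redundant.  Path b survives if it
 * has a cheaper total cost, or if startup cost matters for it and its
 * startup cost is cheaper.
 */
bool sketch_dominates(const SketchRel *rel,
                      const SketchCandidate *a,
                      const SketchCandidate *b)
{
    if (a->total_cost > b->total_cost)
        return false;
    if (sketch_startup_matters(rel, b) && a->startup_cost > b->startup_cost)
        return false;
    return true;
}
```

In this sketch, a parameterized bitmap scan with a lower total cost eliminates a plain indexscan with a much lower startup cost unless consider_param_startup is set for the relation, which is the pruning behavior the flag is meant to prevent for the inside of a SEMI or ANTI join.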
    
    To make this patch reasonably safe to back-patch, care has been taken to
    avoid changing the planner's behavior except in the very narrow case of
    SEMI/ANTI joins with inner indexscans.  There are places in
    compare_path_costs_fuzzily and add_path_precheck that are not terribly
    consistent with the new approach, but changing them will affect planner
    decisions at the margins in other cases, so we'll leave that for a
    HEAD-only fix.
    
    Back-patch to 9.3; before that, the consider_startup flag didn't exist,
    meaning that the second aspect of the patch would be too invasive.
    
    Per a complaint from Peter Holzer and analysis by Tomas Vondra.