• Tom Lane's avatar
    Fix planner failures with overlapping mergejoin clauses in an outer join. · 9afd513d
    Tom Lane authored
    Given overlapping or partially redundant join clauses, for example
    	t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
    the planner's EquivalenceClass machinery will ordinarily refactor the
    clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
    see multiple references to the same EquivalenceClass in a list of join
    equality clauses.  However, if the join is outer, it's incorrect to derive
    a restriction clause on the outer side from the join conditions, so the
    clause refactoring does not happen and we end up with overlapping join
    conditions.  The code that attempted to deal with such cases had several
    subtle bugs, which could result in "left and right pathkeys do not match in
    mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
    if the selected join plan type was a mergejoin.  (It does not appear that
    any actually incorrect plan could have been emitted.)
    
    The core of the problem really was failure to recognize that the outer and
    inner relations' pathkeys have different relationships to the mergeclause
    list.  A join's mergeclause list is constructed by reference to the outer
    pathkeys, so it will always be ordered the same as the outer pathkeys, but
    this cannot be presumed true for the inner pathkeys.  If the inner sides of
    the mergeclauses contain multiple references to the same EquivalenceClass
    ({t2.x} in the above example) then a simplistic rendering of the required
    inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
    recognizes that the second sort column is redundant and throws it away.
    The mergejoin planning code failed to account for that behavior properly.
    One error was to try to generate cut-down versions of the mergeclause list
    from cut-down versions of the inner pathkeys in the same way as the initial
    construction of the mergeclause list from the outer pathkeys was done; this
    could lead to choosing a mergeclause list that fails to match the outer
    pathkeys.  The other problem was that the pathkey cross-checking code in
    create_mergejoin_plan treated the inner and outer pathkey lists
    identically, whereas actually the expectations for them must be different.
    That led to false "pathkeys do not match" failures in some cases, and in
    principle could have led to failure to detect bogus plans in other cases,
    though there is no indication that such bogus plans could be generated.
    
    Reported by Alexander Kuzmenkov, who also reviewed this patch.  This has
    been broken for years (back to around 8.3 according to my testing), so
    back-patch to all supported branches.
    
    Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
    9afd513d
createplan.c 196 KB