    Avoid regressions in foreign-key-based selectivity estimates. · d8e6b84b
    Tom Lane authored
    David Rowley found that the "use the smallest per-column selectivity"
    heuristic applied in some cases by get_foreign_key_join_selectivity()
    was badly off if the FK columns are independent, producing estimates
    much worse than we got before that code was added in 9.6.
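
To make the size of the error concrete, here is a minimal sketch comparing the two approaches for a two-column foreign key. It is not the planner's code: the distinct-value counts are invented for illustration, and each equality clause is assumed to have selectivity 1/ndistinct, with the pre-9.6 behaviour approximated as the product of the per-clause selectivities.

```python
# Hypothetical two-column foreign key; the distinct counts are made up.
ndistinct = [100, 50]  # distinct values in each referenced column

# Assumed per-clause selectivity for an equality join clause: 1/ndistinct.
per_column_sel = [1.0 / n for n in ndistinct]

# "Smallest per-column selectivity" heuristic: keep only the minimum.
heuristic_sel = min(per_column_sel)

# Independence assumption: multiply the per-clause selectivities.
independent_sel = 1.0
for s in per_column_sel:
    independent_sel *= s

# How many times larger the heuristic's estimate is than the product.
overestimate = heuristic_sel / independent_sel
```

With these made-up numbers the heuristic yields a selectivity about 50 times larger than the independent-columns product, and the gap widens with every additional independent column.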
    
    One case where that heuristic was used was for LEFT and FULL outer joins
    with the referenced rel on the outside of the join.  But we should not
    really need to special-case those here.  eqjoinsel() never has had such a
    special case; the correction is applied by calc_joinrel_size_estimate()
    instead.  Let's just estimate such cases like inner joins and rely on that
    later adjustment.  (I think there was something of a thinko here, in that
    the comments seem to be thinking about the selectivity as defined for
    semi/anti joins; but that shouldn't apply to left/full joins.)  Add a
    regression test exercising such a case to show that this is sane in
    at least some cases.
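
The "later adjustment" can be pictured with the simplified sketch below. It assumes the outer-join correction amounts to clamping the row estimate so that it is never below the row count of a preserved side; `joinrel_rows` is a hypothetical helper, not the signature of calc_joinrel_size_estimate(), and it ignores details such as pushed-down quals.

```python
def joinrel_rows(jointype, outer_rows, inner_rows, selectivity):
    """Hypothetical, simplified join size estimate with outer-join clamping."""
    # Start from the plain inner-join estimate.
    nrows = outer_rows * inner_rows * selectivity
    # A LEFT or FULL join emits every outer row at least once.
    if jointype in ("left", "full"):
        nrows = max(nrows, outer_rows)
    # A FULL join also emits every inner row at least once.
    if jointype == "full":
        nrows = max(nrows, inner_rows)
    return nrows
```

Because the clamp is applied after the selectivity is computed, the selectivity itself can be estimated exactly as for an inner join, which is the simplification this commit relies on.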
    
    The other case where we used that heuristic was for SEMI/ANTI joins,
    either if the referenced rel was on the outside, or if it was on the inside
    but was part of a join within the RHS.  In either case, the FK doesn't give
    us a lot of traction towards estimating the selectivity.  To ensure that
    we don't have regressions from what happened before 9.6, let's punt by
    ignoring the FK in such cases and applying the traditional selectivity
    calculation.  (We might be able to improve on that later, but for now
    I just want to be sure it's not worse than 9.5.)
    
    Report and patch by David Rowley, simplified a bit by me.  Back-patch
    to 9.6 where this code was added.
    
    Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
join.sql 53.9 KB