• Tom Lane's avatar
    Reconsider the representation of join alias Vars. · 9ce77d75
    Tom Lane authored
    The core idea of this patch is to make the parser generate join alias
    Vars (that is, ones with varno pointing to a JOIN RTE) only when the
    alias Var is actually different from any raw join input, that is a type
    coercion and/or COALESCE is necessary to generate the join output value.
    Otherwise just generate varno/varattno pointing to the relevant join
    input column.
    
    In effect, this means that the planner's flatten_join_alias_vars()
    transformation is already done in the parser, for all cases except
    (a) columns that are merged by JOIN USING and are transformed in the
    process, and (b) whole-row join Vars.  In principle that would allow
    us to skip doing flatten_join_alias_vars() in many more queries than
    we do now, but we don't have quite enough infrastructure to know that
    we can do so --- in particular there's no cheap way to know whether
    there are any whole-row join Vars.  I'm not sure if it's worth the
    trouble to add a Query-level flag for that, and in any case it seems
    like fit material for a separate patch.  But even without skipping the
    work entirely, this should make flatten_join_alias_vars() faster,
    particularly where there are nested joins that it previously had to
    flatten recursively.
    
    An essential part of this change is to replace Var nodes'
    varnoold/varoattno fields with varnosyn/varattnosyn, which have
    considerably more tightly-defined meanings than the old fields: when
    they differ from varno/varattno, they identify the Var's position in
    an aliased JOIN RTE, and the join alias is what ruleutils.c should
    print for the Var.  This is necessary because the varno change
    destroyed ruleutils.c's ability to find the JOIN RTE from the Var's
    varno.
    
    Another way in which this change broke ruleutils.c is that it's no
    longer feasible to determine, from a JOIN RTE's joinaliasvars list,
    which join columns correspond to which columns of the join's immediate
    input relations.  (If those are sub-joins, the joinaliasvars entries
    may point to columns of their base relations, not the sub-joins.)
    But that was a horrid mess requiring a lot of fragile assumptions
    already, so let's just bite the bullet and add some more JOIN RTE
    fields to make it more straightforward to figure that out.  I added
    two integer-List fields containing the relevant column numbers from
    the left and right input rels, plus a count of how many merged columns
    there are.
    
    This patch depends on the ParseNamespaceColumn infrastructure that
    I added in commit 5815696b.  The biggest bit of code change is
    restructuring transformFromClauseItem's handling of JOINs so that
    the ParseNamespaceColumn data is propagated upward correctly.
    
    Other than that and the ruleutils fixes, everything pretty much
    just works, though some processing is now inessential.  I grabbed
    two pieces of low-hanging fruit in that line:
    
    1. In find_expr_references, we don't need to recurse into join alias
    Vars anymore.  There aren't any except for references to merged USING
    columns, which are more properly handled when we scan the join's RTE.
    This change actually fixes an edge-case issue: we will now record a
    dependency on any type-coercion function present in a USING column's
    joinaliasvar, even if that join column has no references in the query
    text.  The odds of the missing dependency causing a problem seem quite
    small: you'd have to posit somebody dropping an implicit cast between
    two data types, without removing the types themselves, and then having
    a stored rule containing a whole-row Var for a join whose USING merge
    depends on that cast.  So I don't feel a great need to change this in
    the back branches.  But in theory this way is more correct.
    
    2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse
    into join alias Vars either, because the cases they care about don't
    apply to alias Vars for USING columns that are semantically distinct
    from the underlying columns.  This removes the only case in which
    markVarForSelectPriv could be called with NULL for the RTE, so adjust
    the comments to describe that hack as being strictly internal to
    markRTEForSelectPriv.
    
    catversion bump required due to changes in stored rules.
    
    Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
    9ce77d75
rewriteManip.c 43.6 KB