• Andres Freund's avatar
    Fix representation of hash keys in Hash/HashJoin nodes. · 2abd7ae9
    Andres Freund authored
    In 5f32b29c I changed the creation of HashState.hashkeys to
    actually use HashState as the parent (instead of HashJoinState, which
    was incorrect, as they were executed below HashState), to fix the
    problem of hashkeys expressions otherwise relying on slot types
    appropriate for HashJoinState, rather than HashState as would be
    correct. That reliance was only introduced in 12, which is why it
    previously worked to use HashJoinState as the parent (although I'd be
    unsurprised if there were problematic cases).
    
    Unfortunately that's not a sufficient solution, because before this
    commit, the to-be-hashed expressions referenced inner/outer as
    appropriate for the HashJoin, not Hash. That didn't have obvious bad
    consequences, because the slots containing the tuples were put into
    ecxt_innertuple when hashing a tuple for HashState (even though Hash
    doesn't have an inner plan).
    
    There are less common cases where this can cause visible problems
    however (rather than just confusion when inspecting such executor
    trees). E.g. "ERROR: bogus varno: 65000", when explaining queries
    containing a HashJoin where the subsidiary Hash node's hash keys
    reference a subplan. While normally hashkeys aren't displayed by
    EXPLAIN, if one of those expressions references a subplan, that
    subplan may be printed as part of the Hash node - which then failed
    because an inner plan was referenced, and Hash doesn't have that.
    
    It seems quite possible that there's other broken cases, too.
    
    Fix the problem by properly splitting the expression for the HashJoin
    and Hash nodes at plan time, and have them reference the proper
    subsidiary node. While other workarounds are possible, fixing this
    correctly seems easy enough. It was a pretty ugly hack to have
    ExecInitHashJoin put the expression into the already initialized
    HashState, in the first place.
    
    I decided to not just split inner/outer hashkeys inside
    make_hashjoin(), but also to separate out hashoperators and
    hashcollations at plan time. Otherwise we would have ended up having
    two very similar loops, one at plan time and the other during executor
    startup. The work seems to more appropriately belong to plan time,
    anyway.
    
    Reported-By: Nikita Glukhov, Alexander Korotkov
    Author: Andres Freund
    Reviewed-By: Tom Lane, in an earlier version
    Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com
    Backpatch: 12-
    2abd7ae9
nodeHashjoin.c 45.9 KB