    Move targetlist SRF handling from expression evaluation to new executor node. · 69f4b9c8
    Andres Freund authored
    Evaluation of set returning functions (SRFs) in the targetlist (like SELECT
    generate_series(1,5)) so far was done in the expression evaluation (i.e.
    ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code.
    
    This meant that most executor nodes performing projection, and most
    expression evaluation functions, had to deal with the possibility that an
    evaluated expression could return a set of return values.
    
    That's bad because it leads to repeated code in a lot of places. It also,
    and that's my (Andres's) motivation, made it a lot harder to implement a
    more efficient way of doing expression evaluation.
    
    To fix this, introduce a new executor node (ProjectSet) that can evaluate
    targetlists containing one or more SRFs. To avoid the complexity of the old
    way of handling nested expressions returning sets (e.g. having to pass up
    ExprDoneCond, and dealing with arguments to functions returning sets etc.),
    those SRFs can only be at the top level of the node's targetlist.  The
    planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is
    only necessary in ProjectSet nodes and that SRFs are only present at the
    top level of the node's targetlist. If there are nested SRFs the planner
    creates multiple stacked ProjectSet nodes.  The ProjectSet nodes always get
    input from an underlying node.
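
    The stacking described above can be illustrated with a toy model. This
    is only a sketch in Python, not the executor code: project_set() and the
    one-SRF-per-node simplification are assumptions made for illustration,
    and the single empty input row stands in for the underlying node. It
    models SELECT generate_series(1, generate_series(1, 3)), where the inner
    SRF is evaluated by a lower node and the outer SRF by the node above it.

```python
def generate_series(start, stop):
    """Toy stand-in for the SQL function: integers start..stop inclusive."""
    return range(start, stop + 1)


def project_set(input_rows, srf_of_row):
    """Hypothetical model of one ProjectSet node with a single top-level SRF.

    For every row from the underlying node, emit one output row per value
    the SRF returns, appending that value as a new column.
    """
    for row in input_rows:
        for value in srf_of_row(row):
            yield row + (value,)


# The underlying node supplies one empty row (like a trivial Result node).
underlying = [()]

# Lower node: evaluates the inner SRF, generate_series(1, 3).
lower = project_set(underlying, lambda row: generate_series(1, 3))

# Upper node: evaluates the outer SRF, whose argument is now a plain
# column reference to the lower node's output rather than a nested SRF.
upper = project_set(lower, lambda row: generate_series(1, row[-1]))

result = [row[-1] for row in upper]
print(result)
```

    Each node only ever sees SRFs at the top level of its own targetlist;
    the nesting is expressed purely by feeding one node's output into the
    next.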
    
    We also discussed and prototyped evaluating targetlist SRFs using ROWS
    FROM(), but that turned out to be more complicated than we'd hoped.
    
    While moving SRF evaluation to ProjectSet would allow retaining the old
    "least common multiple" behavior when multiple SRFs are present in one
    targetlist (i.e.  continue returning rows until all SRFs are at the end of
    their input at the same time), we decided to instead only return rows till
    all SRFs are exhausted, returning NULL for already exhausted ones.  We
    deemed the previous behavior to be too confusing, unexpected and actually
    not particularly useful.
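
    The difference between the two behaviors can be sketched as follows.
    This is a Python model of the semantics only, with hypothetical helper
    names (old_lcm_rows, new_rows); it assumes every SRF returns at least
    one row and represents SQL NULL as None.

```python
from functools import reduce
from itertools import zip_longest
from math import gcd


def old_lcm_rows(*srf_results):
    """Old behavior: restart each SRF when it runs out, and stop only
    when all of them reach their end at the same time.

    Assumes every result list is non-empty.
    """
    lengths = [len(r) for r in srf_results]
    total = reduce(lambda a, b: a * b // gcd(a, b), lengths)
    return [tuple(r[i % len(r)] for r in srf_results) for i in range(total)]


def new_rows(*srf_results):
    """New behavior: run until the longest SRF is exhausted, returning
    None (NULL) for SRFs that have already finished."""
    return list(zip_longest(*srf_results, fillvalue=None))


# Models SELECT generate_series(1, 2), generate_series(1, 3)
a = [1, 2]
b = [1, 2, 3]

print(old_lcm_rows(a, b))  # 6 rows, the least common multiple of 2 and 3
print(new_rows(a, b))      # 3 rows, the length of the longest SRF
```

    In this model the old behavior yields six rows for SRFs of length two
    and three, while the new behavior yields three rows, the last of which
    carries NULL in the first column.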
    
    As a side effect, the previously prohibited case of multiple set returning
    arguments to a function is now allowed. Not because it's particularly
    desirable, but because it ends up working and there seems to be no argument
    for adding code to prohibit it.
    
    Currently the behavior for COALESCE and CASE containing SRFs has changed,
    returning multiple rows from the expression, even when the SRF containing
    "arm" of the expression is not evaluated. That's because the SRFs are
    evaluated in a separate ProjectSet node.  As that's quite confusing, we're
    likely to instead prohibit SRFs in those places.  But that's still being
    discussed, and the code would reside in places not touched here, so that's
    a task for later.
    
    There's a lot of now-superfluous code dealing with set returning
    expressions around. But as the changes to get rid of it are verbose and
    largely boring, it seems better for readability to keep the cleanup as a
    separate commit.
    
    Author: Tom Lane and Andres Freund
    Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de