• Tom Lane's avatar
    Avoid using a cursor in plpgsql's RETURN QUERY statement. · 2f48ede0
    Tom Lane authored
    plpgsql has always executed the query given in a RETURN QUERY command
    by opening it as a cursor and then fetching a few rows at a time,
    which it turns around and dumps into the function's result tuplestore.
    The point of this was to keep from blowing out memory with an oversized
    SPITupleTable result (note that while a tuplestore can spill tuples
    to disk, SPITupleTable cannot).  However, it's rather inefficient, both
    because of extra data copying and because of executor entry/exit
    overhead.  In recent versions, a new performance problem has emerged:
    use of a cursor prevents use of a parallel plan for the executed query.
    
    We can improve matters by skipping use of a cursor and having the
    executor push result tuples directly into the function's result
    tuplestore.  However, a moderate amount of new infrastructure is needed
    to make that idea work:
    
    * We can use the existing tstoreReceiver.c DestReceiver code to funnel
    executor output to the tuplestore, but it has to be extended to support
    plpgsql's requirement for possibly applying a tuple conversion map.
    
    * SPI needs to be extended to allow use of a caller-supplied
    DestReceiver instead of its usual receiver that puts tuples into
    a SPITupleTable.  Two new API calls are needed to handle both the
    RETURN QUERY and RETURN QUERY EXECUTE cases.
    
    I also felt that I didn't want these new API calls to use the legacy
    method of specifying query parameter values with "char" null flags
    (the old ' '/'n' convention); rather they should accept ParamListInfo
    objects containing the parameter type and value info.  This required
    a bit of additional new infrastructure since we didn't yet have any
    parse analysis callback that would interpret $N parameter symbols
    according to type data supplied in a ParamListInfo.  There seems to be
    no harm in letting makeParamList install that callback by default,
    rather than leaving a new ParamListInfo's parserSetup hook as NULL.
    (Indeed, as of HEAD, I couldn't find anyplace that was using the
    parserSetup field at all; plpgsql was using parserSetupArg for its
    own purposes, but parserSetup seemed to be write-only.)
    
    We can actually get plpgsql out of the business of using legacy null
    flags altogether, and using ParamListInfo instead of its ad-hoc
    PreparedParamsData structure; but this requires inventing one more
    SPI API call that can replace SPI_cursor_open_with_args.  That seems
    worth doing, though.
    
    SPI_execute_with_args and SPI_cursor_open_with_args are now unused
    anywhere in the core PG distribution.  Perhaps someday we could
    deprecate/remove them.  But cleaning up the crufty bits of the SPI
    API is a task for a different patch.
    
    Per bug #16040 from Jeremy Smith.  This is unfortunately too invasive to
    consider back-patching.  Patch by me; thanks to Hamid Akhtar for review.
    
    Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
    2f48ede0
spi.sgml 140 KB