• Tom Lane's avatar
    When appropriate, postpone SELECT output expressions till after ORDER BY. · 9118d03a
    Tom Lane authored
    It is frequently useful for volatile, set-returning, or expensive functions
    in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are
    done.  Otherwise, the functions might be executed for every row of the
    table despite the presence of LIMIT, and/or be executed in an unexpected
    order.  For example, in
    	SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10;
    it's probably desirable that the nextval() values are ordered the same
    as x, and that nextval() is not run more than 10 times.
    
    In the past, Postgres was inconsistent in this area: you would get the
    desirable behavior if the ordering were performed via an indexscan, but
    not if it had to be done by an explicit sort step.  Getting the desired
    behavior reliably required contortions like
    	SELECT x, nextval('seq')
    	  FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10;
    
    This patch conditionally postpones evaluation of pure-output target
    expressions (that is, those that are not used as DISTINCT, ORDER BY, or
    GROUP BY columns) so that they effectively occur after sorting, even if an
    explicit sort step is necessary.  Volatile expressions and set-returning
    expressions are always postponed, so as to provide consistent semantics.
    Expensive expressions (costing more than 10 times typical operator cost,
    which by default would include any user-defined function) are postponed
    if there is a LIMIT or if there are expressions that must be postponed.
    
    We could be more aggressive and postpone any nontrivial expression, but
    there are costs associated with doing so: it requires an extra Result plan
    node which adds some overhead, and postponement changes the volume of data
    going through the sort step, perhaps for the worse.  Since we tend not to
    have very good estimates of the output width of nontrivial expressions,
    it's hard to have much confidence in our ability to predict whether
    postponement would increase or decrease the cost of the sort; therefore
    this patch doesn't attempt to make decisions conditionally on that.
    Between these factors and a general desire not to change query behavior
    when there's not a demonstrable benefit, it seems best to be conservative
    about applying postponement.  We might tweak the decision rules in the
    future, though.
    
    Konstantin Knizhnik, heavily rewritten by me
    9118d03a
limit.sql 2.3 KB