Commits · 2a64931c4bc76c5e800c3fb1eb5a18dc256899c9 · Abuhujair Javed / Postgres FD Implementation

17 Oct, 2008 1 commit

Salvage a little bit of work from a failed patch: simplify and speed up · 2a64931c

Tom Lane authored 16 years ago

set_rel_width().  The code had been catering for the possibility of different
varnos in the relation targetlist, but this is impossible for a base relation
(and if it were possible, putting all the widths in the same RelOptInfo would
be wrong anyway).

2a64931c

04 Oct, 2008 1 commit

Implement SQL-standard WITH clauses, including WITH RECURSIVE. · 44d5be0e

Tom Lane authored 16 years ago

There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.

There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly.  But let's land
the patch now so we can get on with other development.

Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane

44d5be0e

05 Sep, 2008 1 commit

Fix an oversight in the 8.2 patch that improved mergejoin performance by · e540b972

Tom Lane authored 16 years ago

inserting a materialize node above an inner-side sort node, when the sort is
expected to spill to disk. (The materialize protects the sort from having
to support mark/restore, allowing it to do its final merge pass on-the-fly.)
We neglected to teach cost_mergejoin about that hack, so it was failing to
include the materialize's costs in the estimated cost of the mergejoin.
The materialize's costs are generally going to be pretty negligible in
comparison to the sort's, so this is only a small error and probably not
worth back-patching; but it's still wrong.

In the similar case where a materialize is inserted to protect an inner-side
node that can't do mark/restore at all, it's still true that the materialize
should not spill to disk, and so we should cost it cheaply rather than
expensively.

Noted while thinking about a question from Tom Raney.

e540b972

25 Aug, 2008 1 commit

Move exprType(), exprTypmod(), expression_tree_walker(), and related routines · e5536e77

Tom Lane authored 16 years ago

into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend.  There's probably more that should be done along this line,
but this is a start anyway.

e5536e77

22 Aug, 2008 1 commit

Arrange to convert EXISTS subqueries that are equivalent to hashable IN · bd3dadda

Tom Lane authored 16 years ago

subqueries into the same thing you'd have gotten from IN (except always with
unknownEqFalse = true, so as to get the proper semantics for an EXISTS).
I believe this fixes the last case within CVS HEAD in which an EXISTS could
give worse performance than an equivalent IN subquery.

The tricky part of this is that if the upper query probes the EXISTS for only
a few rows, the hashing implementation can actually be worse than the default,
and therefore we need to make a cost-based decision about which way to use.
But at the time when the planner generates plans for subqueries, it doesn't
really know how many times the subquery will be executed. The least invasive
solution seems to be to generate both plans and postpone the choice until
execution. Therefore, in a query that has been optimized this way, EXPLAIN
will show two subplans for the EXISTS, of which only one will actually get
executed.

There is a lot more that could be done based on this infrastructure: in
particular it's interesting to consider switching to the hash plan if we start
out using the non-hashed plan but find a lot more upper rows going by than we
expected. I have therefore left some minor inefficiencies in place, such as
initializing both subplans even though we will currently only use one.

bd3dadda

16 Aug, 2008 1 commit

Clean up the loose ends in selectivity estimation left by my patch for semi · d4af2a64

Tom Lane authored 16 years ago

and anti joins.  To do this, pass the SpecialJoinInfo struct for the current
join as an additional optional argument to operator join selectivity
estimation functions.  This allows the estimator to tell not only what kind
of join is being formed, but which variable is on which side of the join;
a requirement long recognized but not dealt with till now.  This also leaves
the door open for future improvements in the estimators, such as accounting
for the null-insertion effects of lower outer joins.  I didn't do anything
about that in the current patch but the information is in principle deducible
from what's passed.

The patch also clarifies the definition of join selectivity for semi/anti
joins: it's the fraction of the left input that has (at least one) match
in the right input.  This allows getting rid of some very fuzzy thinking
that I had committed in the original 7.4-era IN-optimization patch.
There's probably room to estimate this better than the present patch does,
but at least we know what to estimate.

Since I had to touch CREATE OPERATOR anyway to allow a variant signature
for join estimator functions, I took the opportunity to add a couple of
additional checks that were missing, per my recent message to -hackers:
* Check that estimator functions return float8;
* Require execute permission at the time of CREATE OPERATOR on the
operator's function as well as the estimator functions;
* Require ownership of any pre-existing operator that's modified by
the command.
I also moved the lookup of the functions out of OperatorCreate() and
into operatorcmds.c, since that seemed more consistent with most of
the other catalog object creation processes, eg CREATE TYPE.

d4af2a64

14 Aug, 2008 1 commit

Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace · e006a24a

Tom Lane authored 16 years ago

the old JOIN_IN code, but antijoins are new functionality.) Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins. Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation. That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch. So for the moment the output size estimates for semi and
especially anti joins are quite bogus.

e006a24a

24 Mar, 2008 1 commit

When a relation has been proven empty by constraint exclusion, propagate that · fd791e7b

Tom Lane authored 16 years ago

knowledge up through any joins it participates in. We were doing that already
in some special cases but not in the general case. Also, defend against zero
row estimates for the input relations in cost_mergejoin --- this fix may have
eliminated the only scenario in which that can happen, but be safe. Per
report from Alex Solovey.

fd791e7b

01 Jan, 2008 1 commit
- Update copyrights in source tree to 2008. · 9098ab9e
  Bruce Momjian authored 17 years ago
  
  9098ab9e
08 Dec, 2007 1 commit

Fix mergejoin cost estimation so that we consider the statistical ranges of · 9fd88436

Tom Lane authored 17 years ago

the two join variables at both ends: not only trailing rows that need not be
scanned because there cannot be a match on the other side, but initial rows
that will be scanned without possibly having a match. This allows a more
realistic estimate of startup cost to be made, per recent pgsql-performance
discussion. In passing, fix a couple of bugs that had crept into
mergejoinscansel: it was not quite up to speed for the task of estimating
descending-order scans, which is a new requirement in 8.3.

9fd88436

15 Nov, 2007 2 commits
- Re-run pgindent with updated list of typedefs. (Updated README should · f6e8730d
  Bruce Momjian authored 17 years ago
```
avoid this problem in the future.)
```
  f6e8730d
- pgindent run for 8.3. · fdf5a5ef
  Bruce Momjian authored 17 years ago
  
  fdf5a5ef
24 Oct, 2007 1 commit

Fix UPDATE/DELETE WHERE CURRENT OF to support repeated update and update- · c29a9c37

Tom Lane authored 17 years ago

then-delete on the current cursor row.  The basic fix is that nodeTidscan.c
has to apply heap_get_latest_tid() to the current-scan-TID obtained from the
cursor query; this ensures we get the latest row version to work with.
However, since that only works if the query plan is a TID scan, we also have
to hack the planner to make sure only that type of plan will be selected.
(Formerly, the planner might decide to apply a seqscan if the table is very
small.  This change is probably a Good Thing anyway, since it's hard to see
how a seqscan could really win.)  That means the execQual.c code to support
CurrentOfExpr as a regular expression type is dead code, so replace it with
just an elog().  Also, add regression tests covering these cases.  Note
that the added tests expose the fact that re-fetching an updated row
misbehaves if the cursor used FOR UPDATE.  That's an independent bug that
should be fixed later.  Per report from Dharmendra Goyal.

c29a9c37

22 Sep, 2007 1 commit

Fix cost estimates for EXISTS subqueries that are evaluated as initPlans · 71256875

Tom Lane authored 17 years ago

(because they are uncorrelated with the immediate parent query).  We were
charging the full run cost to the parent node, disregarding the fact that
only one row need be fetched for EXISTS.  While this would only be a
cosmetic issue in most cases, it might possibly affect planning outcomes
if the parent query were itself a subquery to some upper query.
Per recent discussion with Steve Crawford.

71256875

11 Jun, 2007 1 commit

Support UPDATE/DELETE WHERE CURRENT OF cursor_name, per SQL standard. · 6808f1b1

Tom Lane authored 17 years ago

Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once
have been a reason to disallow that, but it seems to work now, and it's
really rather necessary if you want to select a row via a cursor and then
update it in a concurrent-safe fashion.

Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.

6808f1b1

05 Jun, 2007 1 commit

Downgrade implicit casts to text to be assignment-only, except for the ones · 31edbadf

Tom Lane authored 17 years ago

from the other string-category types; this eliminates a lot of surprising
interpretations that the parser could formerly make when there was no directly
applicable operator.

Create a general mechanism that supports casts to and from the standard string
types (text,varchar,bpchar) for *every* datatype, by invoking the datatype's
I/O functions.  These new casts are assignment-only in the to-string direction,
explicit-only in the other, and therefore should create no surprising behavior.
Remove a bunch of thereby-obsoleted datatype-specific casting functions.

The "general mechanism" is a new expression node type CoerceViaIO that can
actually convert between *any* two datatypes if their external text
representations are compatible.  This is more general than needed for the
immediate feature, but might be useful in plpgsql or other places in future.

This commit does nothing about the issue that applying the concatenation
operator || to non-text types will now fail, often with strange error messages
due to misinterpreting the operator as array concatenation.  Since it often
(not always) worked before, we should either make it succeed or at least give
a more user-friendly error; but details are still under debate.

Peter Eisentraut and Tom Lane

31edbadf

21 May, 2007 1 commit

Teach tuplestore.c to throw away data before the "mark" point when the caller · 2415ad98

Tom Lane authored 17 years ago

is using mark/restore but not rewind or backward-scan capability. Insert a
materialize plan node between a mergejoin and its inner child if the inner
child is a sort that is expected to spill to disk. The materialize shields
the sort from the need to do mark/restore and thereby allows it to perform
its final merge pass on-the-fly; while the materialize itself is normally
cheap since it won't spill to disk unless the number of tuples with equal
key values exceeds work_mem.

Greg Stark, with some kibitzing from Tom Lane.

2415ad98

04 May, 2007 1 commit

Teach tuplesort.c about "top N" sorting, in which only the first N tuples · d26559db

Tom Lane authored 17 years ago

need be returned. We keep a heap of the current best N tuples and sift-up
new tuples into it as we scan the input. For M input tuples this means
only about M*log(N) comparisons instead of M*log(M), not to mention a lot
less workspace when N is small --- avoiding spill-to-disk for large M
is actually the most attractive thing about it. Patch includes planner
and executor support for invoking this facility in ORDER BY ... LIMIT
queries. Greg Stark, with some editorialization by moi.

d26559db

21 Apr, 2007 2 commits

Some further performance tweaks for planning large inheritance trees that · afcf09dd

Tom Lane authored 17 years ago

are mostly excluded by constraints: do the CE test a bit earlier to save
some adjust_appendrel_attrs() work on excluded children, and arrange to
use array indexing rather than rt_fetch() to fetch RTEs in the main body
of the planner. The latter is something I'd wanted to do for awhile anyway,
but seeing list_nth_cell() as 35% of the runtime gets one's attention.

afcf09dd

Tweak set_rel_width() to avoid redundant executions of getrelid(). · ca3d14f2
Tom Lane authored 17 years ago
```
In very large queries this accounts for a noticeable fraction of
planning time.  Per an example from Greg Stark.
```
ca3d14f2

27 Mar, 2007 1 commit

Fix array coercion expressions to ensure that the correct volatility is · bf940763

Tom Lane authored 17 years ago

seen by code inspecting the expression. The best way to do this seems
to be to drop the original representation as a function invocation, and
instead make a special expression node type that represents applying
the element-type coercion function to each array element. In this way
the element function is exposed and will be checked for volatility.
Per report from Guillaume Smet.

bf940763

22 Feb, 2007 1 commit

Turn the rangetable used by the executor into a flat list, and avoid storing · eab6b8b2

Tom Lane authored 18 years ago

useless substructure for its RangeTblEntry nodes. (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)

Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now. It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals. That's been a constant source of bugs, and it's finally gone.

There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.

eab6b8b2

22 Jan, 2007 2 commits

Put back planner's ability to cache the results of mergejoinscansel(), · 4f06c688

Tom Lane authored 18 years ago

which I had removed in the first cut of the EquivalenceClass rewrite to
simplify that patch a little. But it's still important --- in a four-way
join problem mergejoinscansel() was eating about 40% of the planning time
according to gprof. Also, improve the EquivalenceClass code to re-use
join RestrictInfos rather than generating fresh ones for each join
considered. This saves some memory space but more importantly improves
the effectiveness of caching planning info in RestrictInfos.

4f06c688

Add COST and ROWS options to CREATE/ALTER FUNCTION, plus underlying pg_proc · 5a7471c3

Tom Lane authored 18 years ago

columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.

5a7471c3

20 Jan, 2007 1 commit

Refactor planner's pathkeys data structure to create a separate, explicit · f41803bb

Tom Lane authored 18 years ago

representation of equivalence classes of variables.  This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.

f41803bb

10 Jan, 2007 1 commit

Change the planner-to-executor API so that the planner tells the executor · a191a169

Tom Lane authored 18 years ago

which comparison operators to use for plan nodes involving tuple comparison
(Agg, Group, Unique, SetOp).  Formerly the executor looked up the default
equality operator for the datatype, which was really pretty shaky, since it's
possible that the data being fed to the node is sorted according to some
nondefault operator class that could have an incompatible idea of equality.
The planner knows what it has sorted by and therefore can provide the right
equality operator to use.  Also, this change moves a couple of catalog lookups
out of the executor and into the planner, which should help startup time for
pre-planned queries by some small amount.  Modify the planner to remove some
other cavalier assumptions about always being able to use the default
operators.  Also add "nulls first/last" info to the Plan node for a mergejoin
--- neither the executor nor the planner can cope yet, but at least the API is
in place.

a191a169

08 Jan, 2007 1 commit

Remove cost_hashjoin's very ancient hack to discourage (once, entirely forbid) · 9a9a143a

Tom Lane authored 18 years ago

hash joins with the estimated-larger relation on the inside. There are
several cases where doing that makes perfect sense, and in cases where it
doesn't, the regular cost computation really ought to be able to figure that
out. Make some marginal tweaks in said computation to try to get results
approximating reality a bit better. Per an example from Shane Ambler.

Also, fix an oversight in the original patch to add seq_page_cost: the costs
of spilling a hash join to disk should be scaled by seq_page_cost.

9a9a143a

05 Jan, 2007 1 commit
- Update CVS HEAD for 2007 copyright. Back branches are typically not · 29dccf5f
  Bruce Momjian authored 18 years ago
```
back-stamped for this.
```
  29dccf5f
23 Dec, 2006 1 commit

Restructure operator classes to allow improved handling of cross-data-type · a78fcfb5

Tom Lane authored 18 years ago

cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.

This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.

a78fcfb5

15 Dec, 2006 1 commit

Fix some planner bugs exposed by reports from Arjen van der Meijden. These · 281f4018

Tom Lane authored 18 years ago

are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr
(IN-clauses) or amortization of indexscan costs across repeated indexscans
on the inside of a nestloop.  In particular:

Fix some logic errors in the estimation for multiple scans induced by a
ScalarArrayOpExpr indexqual.

Include a small cost component in bitmap index scans to reflect the costs of
manipulating the bitmap itself; this is mainly to prevent a bitmap scan from
appearing to have the same cost as a plain indexscan for fetching a single
tuple.

Also add a per-index-scan-startup CPU cost component; while prior releases
were clearly too pessimistic about the cost of repeated indexscans, the
original 8.2 coding allowed the cost of an indexscan to effectively go to zero
if repeated often enough, which is overly optimistic.

Pay some attention to index correlation when estimating costs for a nestloop
inner indexscan: this is significant when the plan fetches multiple heap
tuples per iteration, since high correlation means those tuples are probably
on the same or adjacent heap pages.

281f4018

11 Nov, 2006 1 commit
- Suppress a few 'uninitialized variable' warnings that gcc emits only at · a46ca619
  Tom Lane authored 18 years ago
```
-O3 or higher (presumably because it inlines more things).  Per gripe
from Mark Mielke.
```
  a46ca619
10 Nov, 2006 1 commit

Fix set_joinrel_size_estimates() to estimate outer-join sizes more · d19798e5

Tom Lane authored 18 years ago

accurately: we have to distinguish the effects of the join's own ON
clauses from the effects of pushed-down clauses.  Failing to do so
was a quick hack long ago, but it's time to be smarter.  Per example
from Thomas H.

d19798e5

04 Oct, 2006 1 commit
- pgindent run for 8.2. · f99a569a
  Bruce Momjian authored 18 years ago
  
  f99a569a
19 Sep, 2006 1 commit

Improve usage of effective_cache_size parameter by assuming that all the · b74c5436

Tom Lane authored 18 years ago

tables in the query compete for cache space, not just the one we are
currently costing an indexscan for. This seems more realistic, and it
definitely will help in examples recently exhibited by Stefan
Kaltenbrunner. To get the total size of all the tables involved, we must
tweak the handling of 'append relations' a bit --- formerly we looked up
information about the child tables on-the-fly during set_append_rel_pathlist,
but it needs to be done before we start doing any cost estimation, so
push it into the add_base_rels_to_query scan.

b74c5436

02 Aug, 2006 1 commit

Add support for multi-row VALUES clauses as part of INSERT statements · 9caafda5

Joe Conway authored 18 years ago

(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
Joe Conway and Tom Lane.

9caafda5

26 Jul, 2006 1 commit
- Convert effective_cache_size to an integer, for better integration with · 79bc99a4
  Peter Eisentraut authored 18 years ago
```
upcoming units feature.
```
  79bc99a4
22 Jul, 2006 1 commit

In the recent changes to make the planner account better for cache · 98359c3e

Tom Lane authored 18 years ago

effects in a nestloop inner indexscan, I had only dealt with plain index
scans and the index portion of bitmap scans.  But there will be cache
benefits for the heap accesses of bitmap scans too, so fix
cost_bitmap_heap_scan() to account for that.

98359c3e

14 Jul, 2006 1 commit
- Remove 576 references of include files that were not needed. · e0522505
  Bruce Momjian authored 18 years ago
  
  e0522505
11 Jul, 2006 1 commit
- Sort reference of include files, "A" - "F". · fa601357
  Bruce Momjian authored 18 years ago
  
  fa601357
01 Jul, 2006 1 commit

Fix oversight in planning for multiple indexscans driven by · 08ccdf02

Tom Lane authored 18 years ago

ScalarArrayOpExpr index quals: we were estimating the right total
number of rows returned, but treating the index-access part of the
cost as if a single scan were fetching that many consecutive index
tuples. Actually we should treat it as a multiple indexscan, and
if there are enough of 'em the Mackert-Lohman discount should kick in.

08ccdf02