Abuhujair Javed / Postgres FD Implementation
Commit ddb93cac, authored Jul 21, 2007 by Tom Lane
Provide a bit more high-level documentation for the GEQO planner.
Per request from Luca Ferrari.
parent 7abe764f
Showing 2 changed files with 85 additions and 21 deletions
doc/src/sgml/arch-dev.sgml +33 -15
doc/src/sgml/geqo.sgml +52 -6
doc/src/sgml/arch-dev.sgml
<!-- $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.29 2007/01/31 20:56:16 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.30 2007/07/21 04:02:41 tgl Exp $ -->
<chapter id="overview">
<title>Overview of PostgreSQL Internals</title>
...
...
@@ -345,9 +345,10 @@
can be executed would take an excessive amount of time and memory
space. In particular, this occurs when executing queries
involving large numbers of join operations. In order to determine
a reasonable (not optimal) query plan in a reasonable amount of
time, <productname>PostgreSQL</productname> uses a <xref
linkend="geqo" endterm="geqo-title">.
a reasonable (not necessarily optimal) query plan in a reasonable amount
of time, <productname>PostgreSQL</productname> uses a <xref
linkend="geqo" endterm="geqo-title"> when the number of joins
exceeds a threshold (see <xref linkend="guc-geqo-threshold">).
</para>
</note>
...
...
@@ -380,20 +381,17 @@
the index's <firstterm>operator class</>, another plan is created using
the B-tree index to scan the relation. If there are further indexes
present and the restrictions in the query happen to match a key of an
index further plans will be considered.
index, further plans will be considered. Index scan plans are also
generated for indexes that have a sort ordering that can match the
query's <literal>ORDER BY</> clause (if any), or a sort ordering that
might be useful for merge joining (see below).
</para>
<para>
After all feasible plans have been found for scanning single relations,
plans for joining relations are created. The planner/optimizer
preferentially considers joins between any two relations for which there
exist a corresponding join clause in the <literal>WHERE</literal> qualification (i.e. for
which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
exists). Join pairs with no join clause are considered only when there
is no other choice, that is, a particular relation has no available
join clauses to any other relation. All possible plans are generated for
every join pair considered
by the planner/optimizer. The three possible join strategies are:
If the query requires joining two or more relations,
plans for joining relations are considered
after all feasible plans have been found for scanning single relations.
The three available join strategies are:
<itemizedlist>
<listitem>
...
...
@@ -439,6 +437,26 @@
cheapest one.
</para>
<para>
If the query uses fewer than <xref linkend="guc-geqo-threshold">
relations, a near-exhaustive search is conducted to find the best
join sequence. The planner preferentially considers joins between any
two relations for which there exist a corresponding join clause in the
<literal>WHERE</literal> qualification (i.e. for
which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
exists). Join pairs with no join clause are considered only when there
is no other choice, that is, a particular relation has no available
join clauses to any other relation. All possible plans are generated for
every join pair considered by the planner, and the one that is
(estimated to be) the cheapest is chosen.
</para>
<para>
When <varname>geqo_threshold</varname> is exceeded, the join
sequences considered are determined by heuristics, as described
in <xref linkend="geqo">. Otherwise the process is the same.
</para>
<para>
The finished plan tree consists of sequential or index scans of
the base relations, plus nested-loop, merge, or hash join nodes as
...
...
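Reviewer note on the hunk above: the new paragraphs describe two decisions, namely whether the heuristic search is used at all, and which join pairs the near-exhaustive search considers. The following is a minimal Python sketch of both, not the planner's code. The function names and the representation of join clauses as pairs of relation names are assumptions made for illustration, and only pairs of base relations are covered, not joins of already-joined results.

```python
from itertools import combinations


def uses_heuristic_search(n_relations, geqo_threshold):
    """True when the join order would be chosen heuristically.

    Assumption: follows the "fewer than geqo_threshold relations"
    wording, so the near-exhaustive search applies below the threshold.
    """
    return n_relations >= geqo_threshold


def candidate_join_pairs(relations, join_clauses):
    """List the pairs of base relations worth considering.

    join_clauses holds one (a, b) pair per WHERE restriction such as
    a.x = b.y. A pair without a join clause is kept only when one of
    its relations has no join clause to any other relation.
    """
    clauses = {frozenset(c) for c in join_clauses}
    connected = {rel for clause in clauses for rel in clause}
    pairs = []
    for a, b in combinations(relations, 2):
        if frozenset((a, b)) in clauses:
            pairs.append((a, b))  # a join clause links a and b
        elif a not in connected or b not in connected:
            pairs.append((a, b))  # no other choice for this relation
    return pairs
```

With relations `a`, `b`, `c`, `d` and clauses linking `a`-`b` and `b`-`c`, the pair `a`-`c` is skipped because both relations already have join clauses elsewhere, while every pair involving `d` is kept because `d` has none.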
doc/src/sgml/geqo.sgml
<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.39 2007/02/16 03:50:29 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.40 2007/07/21 04:02:41 tgl Exp $ -->
<chapter id="geqo">
<chapterinfo>
...
...
@@ -186,11 +186,6 @@
<productname>PostgreSQL</productname> optimizer.
</para>
<para>
Parts of the <acronym>GEQO</acronym> module are adapted from D. Whitley's Genitor
algorithm.
</para>
<para>
Specific characteristics of the <acronym>GEQO</acronym>
implementation in <productname>PostgreSQL</productname>
...
...
@@ -224,6 +219,11 @@
</itemizedlist>
</para>
<para>
Parts of the <acronym>GEQO</acronym> module are adapted from D. Whitley's
Genitor algorithm.
</para>
<para>
The <acronym>GEQO</acronym> module allows
the <productname>PostgreSQL</productname> query optimizer to
...
...
@@ -231,6 +231,42 @@
non-exhaustive search.
</para>
<sect2>
<title>Generating Possible Plans with <acronym>GEQO</acronym></title>
<para>
The <acronym>GEQO</acronym> planning process uses the standard planner
code to generate plans for scans of individual relations. Then join
plans are developed using the genetic approach. As shown above, each
candidate join plan is represented by a sequence in which to join
the base relations. In the initial stage, the <acronym>GEQO</acronym>
code simply generates some possible join sequences at random. For each
join sequence considered, the standard planner code is invoked to
estimate the cost of performing the query using that join sequence.
(For each step of the join sequence, all three possible join strategies
are considered; and all the initially-determined relation scan plans
are available. The estimated cost is the cheapest of these
possibilities.) Join sequences with lower estimated cost are considered
<quote>more fit</> than those with higher cost. The genetic algorithm
discards the least fit candidates. Then new candidates are generated
by combining genes of more-fit candidates — that is, by using
randomly-chosen portions of known low-cost join sequences to create
new sequences for consideration. This process is repeated until a
preset number of join sequences have been considered; then the best
one found at any time during the search is used to generate the finished
plan.
</para>
<para>
This process is inherently nondeterministic, because of the randomized
choices made during both the initial population selection and subsequent
<quote>mutation</> of the best candidates. Hence different plans may
be selected from one run to the next, resulting in varying run time
and varying output row order.
</para>
</sect2>
<sect2 id="geqo-future">
<title>Future Implementation Tasks for
<productname>PostgreSQL</> <acronym>GEQO</acronym></title>
...
...
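Reviewer note on the hunk above: the search loop described in the new section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GEQO module itself. The cost model (`toy_cost`, driven by a made-up `SIZES` table) is hypothetical and stands in for the standard planner's cost estimate. Order crossover is used as a simple way to recombine permutations and is not necessarily the operator the module uses. The pool size and budget are arbitrary.

```python
import random

# Hypothetical relation sizes; a stand-in for real planner statistics.
SIZES = {"a": 10, "b": 1000, "c": 5, "d": 200}


def toy_cost(seq):
    """Toy cost: sum of intermediate result sizes of a left-deep join."""
    rows = SIZES[seq[0]]
    total = 0
    for rel in seq[1:]:
        rows *= SIZES[rel]
        total += rows
    return total


def order_crossover(p1, p2, rng):
    """Keep a random slice of p1 in place; fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    kept = p1[i:j]
    rest = [rel for rel in p2 if rel not in kept]
    return rest[:i] + kept + rest[i:]


def geqo_sketch(relations, cost, pool_size=8, budget=200, seed=None):
    """Return the cheapest join sequence found within the budget."""
    rng = random.Random(seed)
    # Initial stage: random join sequences, ordered by fitness.
    pool = [rng.sample(relations, len(relations)) for _ in range(pool_size)]
    pool.sort(key=cost)
    considered = pool_size
    while considered < budget:
        # Recombine two of the more-fit candidates.
        p1, p2 = rng.sample(pool[: max(2, pool_size // 2)], 2)
        child = order_crossover(p1, p2, rng)
        considered += 1
        pool[-1] = child  # discard the least fit candidate
        pool.sort(key=cost)
    # The best candidate is never discarded, so pool[0] is the best seen.
    return pool[0]


best = geqo_sketch(list(SIZES), toy_cost, seed=1)
```

Calling `geqo_sketch` with `seed=None` gives a different random stream on each run, which mirrors the nondeterminism the section warns about. A fixed seed makes the sketch repeatable.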
@@ -257,6 +293,16 @@
</itemizedlist>
</para>
<para>
In the current implementation, the fitness of each candidate join
sequence is estimated by running the standard planner's join selection
and cost estimation code from scratch. To the extent that different
candidates use similar sub-sequences of joins, a great deal of work
will be repeated. This could be made significantly faster by retaining
cost estimates for sub-joins. The problem is to avoid expending
unreasonable amounts of memory on retaining that state.
</para>
<para>
At a more basic level, it is not clear that solving query optimization
with a GA algorithm designed for TSP is appropriate. In the TSP case,
...
...
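Reviewer note on the hunk above: the added paragraph suggests retaining cost estimates for sub-joins while keeping memory bounded. One way to picture that trade-off is a size-limited cache keyed by the join prefix. The sketch below is hypothetical: `make_cached_cost`, the toy left-deep cost model, and the `maxsize` value are all assumptions, and the real planner's state is far richer than a row count and a cost.

```python
from functools import lru_cache


def make_cached_cost(sizes, maxsize=1024):
    """Build a cost function that reuses estimates for shared prefixes.

    sizes maps relation name to a made-up row count. maxsize caps the
    number of retained sub-join estimates, bounding memory use.
    """

    @lru_cache(maxsize=maxsize)
    def prefix(seq):
        # Returns (rows, cost) for joining seq left to right.
        if len(seq) == 1:
            return sizes[seq[0]], 0
        rows, cost = prefix(seq[:-1])  # shared sub-join, possibly cached
        rows *= sizes[seq[-1]]
        return rows, cost + rows

    def cost(seq):
        return prefix(tuple(seq))[1]

    cost.cache_info = prefix.cache_info  # expose hit/miss counters
    return cost


cached_cost = make_cached_cost({"a": 10, "b": 1000, "c": 5, "d": 200})
first = cached_cost(["a", "b", "c"])
second = cached_cost(["a", "b", "d"])  # reuses the estimate for (a, b)
```

When the cache is full, the least recently used estimates are evicted and recomputed on demand, so a smaller `maxsize` trades repeated work for memory.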