Abuhujair Javed / Postgres FD Implementation
Commit 81c41e3d authored Jan 05, 2005 by Tom Lane
More minor updates and copy-editing.
parent b4b984bc
Showing 5 changed files with 265 additions and 174 deletions
doc/src/sgml/arch-dev.sgml +49 -32
doc/src/sgml/bki.sgml +69 -57
doc/src/sgml/catalogs.sgml +117 -65
doc/src/sgml/geqo.sgml +25 -15
doc/src/sgml/plhandler.sgml +5 -5
doc/src/sgml/arch-dev.sgml
<!--
$PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql Exp $
$PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.25 2005/01/05 23:42:02 tgl Exp $
-->
<chapter id="overview">
...
...
@@ -63,11 +63,11 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
<firstterm>system catalogs</firstterm>) to apply to
the query tree. It performs the
transformations given in the <firstterm>rule bodies</firstterm>.
One application of the rewrite system is in the realization of
<firstterm>views</firstterm>.
</para>
<para>
One application of the rewrite system is in the realization of
<firstterm>views</firstterm>.
Whenever a query against a view
(i.e. a <firstterm>virtual table</firstterm>) is made,
the rewrite system rewrites the user's query to
...
...
@@ -90,8 +90,8 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
relation to be scanned, there are two paths for the
scan. One possibility is a simple sequential scan and the other
possibility is to use the index. Next the cost for the execution of
each plan is estimated and the
cheapest plan is chosen and handed back.
each path is estimated and the cheapest path is chosen. The cheapest
path is expanded into a complete plan that the executor can use.
</para>
</step>
...
...
@@ -142,7 +142,8 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
<productname>PostgreSQL</productname> protocol described in
<xref linkend="protocol">. Many clients are based on the
C-language library <application>libpq</>, but several independent
implementations exist, such as the Java <application>JDBC</> driver.
implementations of the protocol exist, such as the Java
<application>JDBC</> driver.
</para>
<para>
...
...
@@ -339,7 +340,7 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
different ways, each of which will produce the same set of
results. If it is computationally feasible, the query optimizer
will examine each of these possible execution plans, ultimately
selecting the execution plan that will run the fastest.
selecting the execution plan that is expected to run the fastest.
</para>
<note>
...
...
@@ -355,20 +356,26 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
</note>
<para>
After the cheapest path is determined, a <firstterm>plan tree</>
is built to pass to the executor. This represents the desired
execution plan in sufficient detail for the executor to run it.
The planner's search procedure actually works with data structures
called <firstterm>paths</>, which are simply cut-down representations of
plans containing only as much information as the planner needs to make
its decisions. After the cheapest path is determined, a full-fledged
<firstterm>plan tree</> is built to pass to the executor. This represents
the desired execution plan in sufficient detail for the executor to run it.
In the rest of this section we'll ignore the distinction between paths
and plans.
</para>
<sect2>
<title>Generating Possible Plans</title>
<para>
The planner/optimizer decides which plans should be generated
based upon the types of indexes defined on the relations appearing in
a query. There is always the possibility of performing a
sequential scan on a relation, so a plan using only
sequential scans is always created. Assume an index is defined on a
The planner/optimizer starts by generating plans for scanning each
individual relation (table) used in the query. The possible plans
are determined by the available indexes on each relation.
There is always the possibility of performing a
sequential scan on a relation, so a sequential scan plan is always
created. Assume an index is defined on a
relation (for example a B-tree index) and a query contains the
restriction
<literal>relation.attribute OPR constant</literal>. If
...
...
@@ -395,37 +402,47 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
<itemizedlist>
<listitem>
<para>
<firstterm>nested loop join</firstterm>: The right relation is scanned
once for every row found in the left relation. This strategy
is easy to implement but can be very time consuming. (However,
if the right relation can be scanned with an index scan, this can
be a good strategy. It is possible to use values from the current
row of the left relation as keys for the index scan of the right.)
<firstterm>nested loop join</firstterm>: The right relation is scanned
once for every row found in the left relation. This strategy
is easy to implement but can be very time consuming. (However,
if the right relation can be scanned with an index scan, this can
be a good strategy. It is possible to use values from the current
row of the left relation as keys for the index scan of the right.)
</para>
</listitem>
<listitem>
<para>
<firstterm>merge sort join</firstterm>: Each relation is sorted on the join
attributes before the join starts. Then the two relations are
merged together taking into account that both relations are
ordered on the join attributes. This kind of join is more
attractive because each relation has to be scanned only once.
<firstterm>merge sort join</firstterm>: Each relation is sorted on the join
attributes before the join starts. Then the two relations are
scanned in parallel, and matching rows are combined to form
join rows. This kind of join is more
attractive because each relation has to be scanned only once.
The required sorting may be achieved either by an explicit sort
step, or by scanning the relation in the proper order using an
index on the join key.
</para>
</listitem>
<listitem>
<para>
<firstterm>hash join</firstterm>: the right relation is first scanned
and loaded into a hash table, using its join attributes as hash keys.
Next the left relation is scanned and the
appropriate values of every row found are used as hash keys to
locate the matching rows in the table.
<firstterm>hash join</firstterm>: the right relation is first scanned
and loaded into a hash table, using its join attributes as hash keys.
Next the left relation is scanned and the
appropriate values of every row found are used as hash keys to
locate the matching rows in the table.
</para>
</listitem>
</itemizedlist>
</para>
<para>
When the query involves more than two relations, the final result
must be built up by a tree of join steps, each with two inputs.
The planner examines different possible join sequences to find the
cheapest one.
</para>
<para>
The finished plan tree consists of sequential or index scans of
the base relations, plus nested-loop, merge, or hash join nodes as
...
...
@@ -512,7 +529,7 @@ $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.24 2003/11/29 19:51:36 pgsql E
the executor top level uses this information to create a new updated row
and mark the old row deleted. For <command>DELETE</>, the only column
that is actually returned by the plan is the TID, and the executor top
level simply uses the TID to visit the target rows and mark them deleted.
level simply uses the TID to visit each target row and mark it deleted.
</para>
</sect1>
...
...
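Note on the join strategies described in the arch-dev.sgml hunk above: the following is a minimal Python sketch, not PostgreSQL code, of the three strategies the text names (nested loop, merge, hash). Rows are modeled as plain dicts and the function names and the `key` parameter are assumptions made for illustration. The merge variant always uses an explicit sort step; the text notes that an ordered index scan could supply the ordering instead.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    # Scan the right relation once for every row found in the left relation.
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def merge_join(left, right, key):
    # Sort both inputs on the join attribute (explicit sort step),
    # then scan the two sorted inputs in parallel.
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                out.extend((left[i], right[k]) for k in range(j, j_end))
                i += 1
            j = j_end
    return out


def hash_join(left, right, key):
    # Load the right relation into a hash table keyed on the join attribute,
    # then probe the table once per left row.
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [(l, r) for l in left for r in table.get(l[key], [])]
```

All three functions return the same set of joined row pairs for the same inputs; they differ only in how much work they do to find them, which is what the planner's cost estimates are meant to capture.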
doc/src/sgml/bki.sgml
<!--
$PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
$PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.13 2005/01/05 23:42:03 tgl Exp $
-->
<chapter id="bki">
...
...
@@ -7,10 +7,11 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
<para>
Backend Interface (<acronym>BKI</acronym>) files are scripts in a
special language that are input to the
<productname>PostgreSQL</productname> backend running in the special
<quote>bootstrap</quote> mode that allows it to perform database
functions without a database system already existing.
special language that is understood by the
<productname>PostgreSQL</productname> backend when running in the
<quote>bootstrap</quote> mode. The bootstrap mode allows system catalogs
to be created and filled from scratch, whereas ordinary SQL commands
require the catalogs to exist already.
<acronym>BKI</acronym> files can therefore be used to create the
database system in the first place. (And they are probably not
useful for anything else.)
...
...
@@ -21,8 +22,9 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
to do part of its job when creating a new database cluster. The
input file used by <application>initdb</application> is created as
part of building and installing <productname>PostgreSQL</productname>
by a program named <filename>genbki.sh</filename> from some
specially formatted C header files in the source tree. The created
by a program named <filename>genbki.sh</filename>, which reads some
specially formatted C header files in the <filename>src/include/catalog/</>
directory of the source tree. The created
<acronym>BKI</acronym> file is called <filename>postgres.bki</filename> and is
normally installed in the
<filename>share</filename> subdirectory of the installation tree.
...
...
@@ -40,9 +42,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
This section describes how the <productname>PostgreSQL</productname>
backend interprets <acronym>BKI</acronym> files. This description
will be easier to understand if the <filename>postgres.bki</filename>
file is at hand as an example. You should also study the source
code of <application>initdb</application> to get an idea of how the
backend is invoked.
file is at hand as an example.
</para>
<para>
...
...
@@ -67,6 +67,61 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
<title><acronym>BKI</acronym> Commands</title>
<variablelist>
<varlistentry>
<term>
create
<optional>bootstrap</optional>
<optional>shared_relation</optional>
<optional>without_oids</optional>
<replaceable class="parameter">tablename</replaceable>
(<replaceable class="parameter">name1</replaceable> =
<replaceable class="parameter">type1</replaceable> <optional>,
<replaceable class="parameter">name2</replaceable> = <replaceable
class="parameter">type2</replaceable>, ...</optional>)
</term>
<listitem>
<para>
Create a table named <replaceable
class="parameter">tablename</replaceable> with the columns given
in parentheses.
</para>
<para>
The following column types are supported directly by
<filename>bootstrap.c</>: <type>bool</type>,
<type>bytea</type>, <type>char</type> (1 byte),
<type>name</type>, <type>int2</type>,
<type>int4</type>, <type>regproc</type>, <type>regclass</type>,
<type>regtype</type>, <type>text</type>,
<type>oid</type>, <type>tid</type>, <type>xid</type>,
<type>cid</type>, <type>int2vector</type>, <type>oidvector</type>,
<type>_int4</type> (array), <type>_text</type> (array),
<type>_aclitem</type> (array). Although it is possible to create
tables containing columns of other types, this cannot be done until
after <structname>pg_type</> has been created and filled with
appropriate entries.
</para>
<para>
When <literal>bootstrap</> is specified,
the table will only be created on disk; nothing is entered into
<structname>pg_class</structname>,
<structname>pg_attribute</structname>, etc, for it. Thus the
table will not be accessible by ordinary SQL operations until
such entries are made the hard way (with <literal>insert</>
commands). This option is used for creating
<structname>pg_class</structname> etc themselves.
</para>
<para>
The table is created as shared if <literal>shared_relation</> is
specified.
It will have OIDs unless <literal>without_oids</> is specified.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
open <replaceable class="parameter">tablename</replaceable>
...
...
@@ -98,51 +153,6 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
</listitem>
</varlistentry>
<varlistentry>
<term>
create <replaceable class="parameter">tablename</replaceable>
(<replaceable class="parameter">name1</replaceable> =
<replaceable class="parameter">type1</replaceable> <optional>,
<replaceable class="parameter">name2</replaceable> = <replaceable
class="parameter">type2</replaceable>, ...</optional>)
</term>
<listitem>
<para>
Create a table named <replaceable
class="parameter">tablename</replaceable> with the columns given
in parentheses.
</para>
<para>
The <replaceable>type</replaceable> is not necessarily the data
type that the column will have in the SQL environment; that is
determined by the <structname>pg_attribute</structname> system
catalog. The type here is essentially only used to allocate
storage. The following types are allowed: <type>bool</type>,
<type>bytea</type>, <type>char</type> (1 byte),
<type>name</type>, <type>int2</type>, <type>int2vector</type>,
<type>int4</type>, <type>regproc</type>, <type>regclass</type>,
<type>regtype</type>, <type>text</type>,
<type>oid</type>, <type>tid</type>, <type>xid</type>,
<type>cid</type>, <type>oidvector</type>, <type>smgr</type>,
<type>_int4</type> (array), <type>_aclitem</type> (array).
Array types can also be indicated by writing
<literal>[]</literal> after the name of the element type.
</para>
<note>
<para>
The table will only be created on disk, it will not
automatically be registered in the system catalogs and will
therefore not be accessible unless appropriate rows are
inserted in <structname>pg_class</structname>,
<structname>pg_attribute</structname>, etc.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry>
<term>
insert <optional>OID = <replaceable class="parameter">oid_value</replaceable></optional> (<replaceable class="parameter">value1</replaceable> <replaceable class="parameter">value2</replaceable> ...)
...
...
@@ -190,6 +200,8 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
classes to use are <replaceable
class="parameter">opclass1</replaceable>, <replaceable
class="parameter">opclass2</replaceable> etc., respectively.
The index file is created and appropriate catalog entries are
made for it, but the index contents are not initialized by this command.
</para>
</listitem>
</varlistentry>
...
...
@@ -199,7 +211,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
<listitem>
<para>
Build the indices that have previously been declared.
Fill in the indices that have previously been declared.
</para>
</listitem>
</varlistentry>
...
...
@@ -212,7 +224,7 @@ $PostgreSQL: pgsql/doc/src/sgml/bki.sgml,v 1.12 2003/11/29 19:51:36 pgsql Exp $
<para>
The following sequence of commands will create the
<literal>test_table</literal> table with the two columns
table <literal>test_table</literal> with two columns
<literal>cola</literal> and <literal>colb</literal> of type
<type>int4</type> and <type>text</type>, respectively, and insert
two rows into the table.
...
...
doc/src/sgml/catalogs.sgml
This diff is collapsed.
doc/src/sgml/geqo.sgml
<!--
$PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.26 2003/11/29 19:51:37 pgsql Exp $
$PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.27 2005/01/05 23:42:03 tgl Exp $
Genetic Optimizer
-->
...
...
@@ -65,8 +65,7 @@ Genetic Optimizer
enormous amount of time and memory space when the number of joins
in the query grows large. This makes the ordinary
<productname>PostgreSQL</productname> query optimizer
inappropriate for database application domains that involve the
need for extensive queries, such as artificial intelligence.
inappropriate for queries that join a large number of tables.
</para>
<para>
...
...
@@ -97,7 +96,7 @@ Genetic Optimizer
<para>
The genetic algorithm (<acronym>GA</acronym>) is a heuristic optimization method which
operates through
determined, randomized search. The set of possible solutions for the
nondeterministic, randomized search. The set of possible solutions for the
optimization problem is considered as a
<firstterm>population</firstterm> of <firstterm>individuals</firstterm>.
The degree of adaptation of an individual to its environment is specified
...
...
@@ -176,11 +175,12 @@ Genetic Optimizer
<title>Genetic Query Optimization (<acronym>GEQO</acronym>) in PostgreSQL</title>
<para>
The <acronym>GEQO</acronym> module is intended for the solution of the query
optimization problem similar to a traveling salesman problem (<acronym>TSP</acronym>).
The <acronym>GEQO</acronym> module approaches the query
optimization problem as though it were the well-known traveling salesman
problem (<acronym>TSP</acronym>).
Possible query plans are encoded as integer strings. Each string
represents the join order from one relation of the query to the next.
E. g., the query tree
For example, the join tree
<literallayout class="monospaced">
/\
/\ 2
...
...
@@ -245,29 +245,39 @@ Genetic Optimizer
<para>
Work is still needed to improve the genetic algorithm parameter
settings.
In file <filename>backend/optimizer/geqo/geqo_params.c</filename>, routines
In file <filename>src/backend/optimizer/geqo/geqo_main.c</filename>,
routines
<function>gimme_pool_size</function> and <function>gimme_number_generations</function>,
we have to find a compromise for the parameter settings
to satisfy two competing demands:
<itemizedlist spacing="compact">
<listitem>
<para>
Optimality of the query plan
</para>
<para>
Optimality of the query plan
</para>
</listitem>
<listitem>
<para>
Computing time
</para>
<para>
Computing time
</para>
</listitem>
</itemizedlist>
</para>
<para>
At a more basic level, it is not clear that solving query optimization
with a GA algorithm designed for TSP is appropriate. In the TSP case,
the cost associated with any substring (partial tour) is independent
of the rest of the tour, but this is certainly not true for query
optimization. Thus it is questionable whether edge recombination
crossover is the most effective mutation procedure.
</para>
</sect2>
</sect1>
<sect1 id="geqo-biblio">
<title>Further Readings</title>
<title>Further Reading</title>
<para>
The following resources contain additional information about
...
...
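Note on the geqo.sgml hunk above, which says that possible query plans are encoded as integer strings representing the join order from one relation to the next: the following Python sketch shows one way such a string could be decoded. It is an illustration under stated assumptions, not the GEQO implementation: the function name `left_deep_tree` is hypothetical, relations are identified by plain integers, and the string is assumed to describe a left-deep tree in which each relation is joined onto the result of all previous joins.

```python
from functools import reduce


def left_deep_tree(order):
    # Interpret a join order such as [4, 1, 3, 2] as a left-deep join tree:
    # the first two relations are joined, and each later relation is joined
    # onto the result so far. Nested tuples stand in for join nodes.
    if not order:
        raise ValueError("join order must name at least one relation")
    return reduce(lambda tree, rel: (tree, rel), order)
```

With this decoding, `left_deep_tree([4, 1, 3, 2])` yields `(((4, 1), 3), 2)`. The nesting also makes the concern raised in the new paragraph visible: the inputs to every later join depend on all earlier choices, so the cost of a partial order is not independent of the rest, unlike a partial tour in the TSP.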
doc/src/sgml/plhandler.sgml
<!--
$PostgreSQL: pgsql/doc/src/sgml/plhandler.sgml,v 1.3 2004/12/30 21:45:36 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/plhandler.sgml,v 1.4 2005/01/05 23:42:03 tgl Exp $
-->
<chapter id="plhandler">
...
...
@@ -56,11 +56,11 @@ $PostgreSQL: pgsql/doc/src/sgml/plhandler.sgml,v 1.3 2004/12/30 21:45:36 tgl Exp
system table
<classname>pg_proc</classname> and to analyze the argument
and return types of the called function. The <literal>AS</> clause from the
<command>CREATE FUNCTION</command> of the function will be found
<command>CREATE FUNCTION</command> command for the function will be found
in the <literal>prosrc</literal> column of the
<classname>pg_proc</classname> row. This may be the source
text in the procedural language itself (like for PL/Tcl), a
path name to a file, or anything else that tells the call handler
<classname>pg_proc</classname> row. This is commonly source
text in the procedural language, but in theory it could be something else,
such as a path name to a file, or anything else that tells the call handler
what to do in detail.
</para>
...
...