Commit bc11dd61 authored by Bruce Momjian

Update FAQ.

parent 1ba85751
...@@ -691,7 +691,7 @@ Maximum number of indexes on a table? unlimited ...@@ -691,7 +691,7 @@ Maximum number of indexes on a table? unlimited
4.7)How much database disk space is required to store data from a typical 4.7)How much database disk space is required to store data from a typical
text file? text file?
A PostgreSQL database may need six and a half times the disk space A PostgreSQL database may need six-and-a-half times the disk space
required to store the data in a flat file. required to store the data in a flat file.
Consider a file of 300,000 lines with two integers on each line. The Consider a file of 300,000 lines with two integers on each line. The
...@@ -738,7 +738,7 @@ Maximum number of indexes on a table? unlimited ...@@ -738,7 +738,7 @@ Maximum number of indexes on a table? unlimited
faster. faster.
For column-specific optimization statistics, use VACUUM ANALYZE. For column-specific optimization statistics, use VACUUM ANALYZE.
VACUUM ANALYZE is important for complex multi-join queries, so the VACUUM ANALYZE is important for complex multijoin queries, so the
optimizer can estimate the number of rows returned from each table, optimizer can estimate the number of rows returned from each table,
and choose the proper join order. The backend does not keep track of and choose the proper join order. The backend does not keep track of
column statistics on its own, so VACUUM ANALYZE must be run to collect column statistics on its own, so VACUUM ANALYZE must be run to collect
...@@ -763,34 +763,34 @@ Maximum number of indexes on a table? unlimited ...@@ -763,34 +763,34 @@ Maximum number of indexes on a table? unlimited
handle range searches. A B-tree index only handles range searches in a handle range searches. A B-tree index only handles range searches in a
single dimension. R-tree's can handle multi-dimensional data. For single dimension. R-tree's can handle multi-dimensional data. For
example, if an R-tree index can be built on an attribute of type example, if an R-tree index can be built on an attribute of type
point, the system can more efficient answer queries like select all point, the system can more efficiently answer queries such as "select
points within a bounding rectangle. all points within a bounding rectangle."
The canonical paper that describes the original R-Tree design is: The canonical paper that describes the original R-tree design is:
Guttman, A. "R-Trees: A Dynamic Index Structure for Spatial Guttman, A. "R-trees: A Dynamic Index Structure for Spatial
Searching." Proc of the 1984 ACM SIGMOD Int'l Conf on Mgmt of Data, Searching." Proc of the 1984 ACM SIGMOD Int'l Conf on Mgmt of Data,
45-57. 45-57.
You can also find this paper in Stonebraker's "Readings in Database You can also find this paper in Stonebraker's "Readings in Database
Systems" Systems"
Builtin R-Trees can handle polygons and boxes. In theory, R-trees can Built-in R-trees can handle polygons and boxes. In theory, R-trees can
be extended to handle higher number of dimensions. In practice, be extended to handle higher number of dimensions. In practice,
extending R-trees require a bit of work and we don't currently have extending R-trees requires a bit of work and we don't currently have
any documentation on how to do it. any documentation on how to do it.
4.12) What is Genetic Query Optimization? 4.12) What is Genetic Query Optimization?
The GEQO module speeds query optimization when joining many tables by The GEQO module speeds query optimization when joining many tables by
means of a Genetic Algorithm (GA). It allows the handling of large means of a Genetic Algorithm (GA). It allows the handling of large
join queries through non-exhaustive search. join queries through nonexhaustive search.
4.13) How do I do regular expression searches and case-insensitive regular 4.13) How do I do regular expression searches and case-insensitive regular
expression searches? expression searches?
The ~ operator does regular-expression matching, and ~* does The ~ operator does regular expression matching, and ~* does
case-insensitive regular-expression matching. There is no case-insensitive regular expression matching. There is no
case-insensitive variant of the LIKE operator, but you can get the case-insensitive variant of the LIKE operator, but you can get the
effect of case-insensitive LIKE with this: effect of case-insensitive LIKE with this:
WHERE lower(textfield) LIKE lower(pattern) WHERE lower(textfield) LIKE lower(pattern)
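
The lower()/LIKE idiom above can be tried without a PostgreSQL server. The following is only a sketch under stated assumptions: it uses Python's bundled sqlite3 module as a stand-in engine, and the table name t and the helper case_insensitive_like are hypothetical names chosen for illustration. SQLite's own LIKE already ignores ASCII case by default, so the sketch asks for case-sensitive LIKE first to mimic the behavior the FAQ describes.

```python
import sqlite3


def case_insensitive_like(rows, pattern):
    """Return the values in rows that match pattern, ignoring case.

    Hypothetical helper: SQLite stands in for PostgreSQL here.
    """
    conn = sqlite3.connect(":memory:")
    # Assumption: make LIKE case-sensitive so lower() is what does the
    # case folding, as it would with PostgreSQL's LIKE.
    conn.execute("PRAGMA case_sensitive_like = ON")
    conn.execute("CREATE TABLE t (textfield TEXT)")
    conn.executemany("INSERT INTO t VALUES (?)", [(r,) for r in rows])
    # Fold both the column and the pattern to lower case before matching.
    cur = conn.execute(
        "SELECT textfield FROM t "
        "WHERE lower(textfield) LIKE lower(?) "
        "ORDER BY textfield",
        (pattern,),
    )
    result = [row[0] for row in cur]
    conn.close()
    return result
```

Calling case_insensitive_like(["Pascal", "PASCAL", "Fermat"], "pas%") matches both spellings of Pascal and skips Fermat. Because lower() is applied to the column, an ordinary index on textfield would probably not help such a query.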
...@@ -812,7 +812,7 @@ BYTEA bytea variable-length array of bytes ...@@ -812,7 +812,7 @@ BYTEA bytea variable-length array of bytes
You will see the internal name when examining system catalogs and in You will see the internal name when examining system catalogs and in
some error messages. some error messages.
The last four types above are "varlena" types (i.e. the first four The last four types above are "varlena" types (i.e., the first four
bytes are the length, followed by the data). char(#) allocates the bytes are the length, followed by the data). char(#) allocates the
maximum number of bytes no matter how much data is stored in the maximum number of bytes no matter how much data is stored in the
field. text, varchar(#), and bytea all have variable length on the field. text, varchar(#), and bytea all have variable length on the
...@@ -855,17 +855,17 @@ BYTEA bytea variable-length array of bytes ...@@ -855,17 +855,17 @@ BYTEA bytea variable-length array of bytes
You would then also have the new value stored in $newSerialID for use You would then also have the new value stored in $newSerialID for use
in other queries (e.g., as a foreign key to the person table). Note in other queries (e.g., as a foreign key to the person table). Note
that the name of the automatically-created SEQUENCE object will be that the name of the automatically created SEQUENCE object will be
named <table>_<serialcolumn>_seq, where table and serialcolumn are the named <table>_<serialcolumn>_seq, where table and serialcolumn are the
names of your table and your SERIAL column, respectively. names of your table and your SERIAL column, respectively.
Alternatively, you could retrieve the just-assigned SERIAL value with Alternatively, you could retrieve the assigned SERIAL value with the
the currval() function after it was inserted by default, e.g., currval() function after it was inserted by default, e.g.,
INSERT INTO person (name) VALUES ('Blaise Pascal'); INSERT INTO person (name) VALUES ('Blaise Pascal');
$newID = currval('person_id_seq'); $newID = currval('person_id_seq');
Finally, you could use the OID returned from the INSERT statement to Finally, you could use the OID returned from the INSERT statement to
lookup the default value, though this is probably the least portable look up the default value, though this is probably the least portable
approach. In Perl, using DBI with Edmund Mergl's DBD::Pg module, the approach. In Perl, using DBI with Edmund Mergl's DBD::Pg module, the
oid value is made available via $sth->{pg_oid_status} after oid value is made available via $sth->{pg_oid_status} after
$sth->execute(). $sth->execute().
...@@ -880,8 +880,8 @@ BYTEA bytea variable-length array of bytes ...@@ -880,8 +880,8 @@ BYTEA bytea variable-length array of bytes
OIDs are PostgreSQL's answer to unique row ids. Every row that is OIDs are PostgreSQL's answer to unique row ids. Every row that is
created in PostgreSQL gets a unique OID. All OIDs generated during created in PostgreSQL gets a unique OID. All OIDs generated during
initdb are less than 16384 (from backend/access/transam.h). All initdb are less than 16384 (from backend/access/transam.h). All
user-created OIDs are equal or greater that this. By default, all user-created OIDs are equal to or greater than this. By default, all
these OIDs are unique not only within a table, or database, but unique these OIDs are unique not only within a table or database, but unique
within the entire PostgreSQL installation. within the entire PostgreSQL installation.
PostgreSQL uses OIDs in its internal system tables to link rows PostgreSQL uses OIDs in its internal system tables to link rows
...@@ -965,8 +965,8 @@ BYTEA bytea variable-length array of bytes ...@@ -965,8 +965,8 @@ BYTEA bytea variable-length array of bytes
4.23) Why are my subqueries using IN so slow? 4.23) Why are my subqueries using IN so slow?
Currently, we join subqueries to outer queries by sequential scanning Currently, we join subqueries to outer queries by sequentially
the result of the subquery for each row of the outer query. A scanning the result of the subquery for each row of the outer query. A
workaround is to replace IN with EXISTS: workaround is to replace IN with EXISTS:
SELECT * SELECT *
FROM tab FROM tab
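
The hunk above is cut off before the rewritten query, so the following is only a sketch of the general IN-to-EXISTS rewrite, not the FAQ's own text. It assumes Python's bundled sqlite3 module as a stand-in engine, and the names outer_tab, inner_tab, col1, col2, and in_and_exists are hypothetical. It shows that both forms select the same rows on NULL-free data; it does not measure the speed difference the FAQ is concerned with.

```python
import sqlite3


def in_and_exists(outer_rows, inner_rows):
    """Run the same filter with IN and with EXISTS; return both results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE outer_tab (col1 INTEGER)")
    conn.execute("CREATE TABLE inner_tab (col2 INTEGER)")
    conn.executemany("INSERT INTO outer_tab VALUES (?)",
                     [(v,) for v in outer_rows])
    conn.executemany("INSERT INTO inner_tab VALUES (?)",
                     [(v,) for v in inner_rows])
    # Original form: membership test against the subquery result.
    with_in = conn.execute(
        "SELECT col1 FROM outer_tab "
        "WHERE col1 IN (SELECT col2 FROM inner_tab) "
        "ORDER BY col1"
    ).fetchall()
    # Rewritten form: correlated subquery that only has to find one match.
    with_exists = conn.execute(
        "SELECT col1 FROM outer_tab o "
        "WHERE EXISTS (SELECT 1 FROM inner_tab i WHERE i.col2 = o.col1) "
        "ORDER BY col1"
    ).fetchall()
    conn.close()
    return [r[0] for r in with_in], [r[0] for r in with_exists]
```

For example, in_and_exists([1, 2, 3, 4], [2, 4, 6]) returns the same list of matching values from both queries.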
...@@ -1001,7 +1001,7 @@ BYTEA bytea variable-length array of bytes ...@@ -1001,7 +1001,7 @@ BYTEA bytea variable-length array of bytes
dump core? dump core?
The problem could be a number of things. Try testing your user-defined The problem could be a number of things. Try testing your user-defined
function in a stand alone test program first. function in a stand-alone test program first.
5.2) What does the message "NOTICE:PortalHeapMemoryFree: 0x402251d0 not in 5.2) What does the message "NOTICE:PortalHeapMemoryFree: 0x402251d0 not in
alloc set!" mean? alloc set!" mean?
......
...@@ -861,7 +861,7 @@ The row length limit will be removed in 7.1.<P> ...@@ -861,7 +861,7 @@ The row length limit will be removed in 7.1.<P>
<H4><A NAME="4.7">4.7</A>)How much database disk space is required to <H4><A NAME="4.7">4.7</A>)How much database disk space is required to
store data from a typical text file?<BR></H4><P> store data from a typical text file?<BR></H4><P>
A PostgreSQL database may need six and a half times the disk space A PostgreSQL database may need six-and-a-half times the disk space
required to store the data in a flat file.<P> required to store the data in a flat file.<P>
Consider a file of 300,000 lines with two integers on each line. The Consider a file of 300,000 lines with two integers on each line. The
...@@ -914,7 +914,7 @@ sequential scan would be faster.<P> ...@@ -914,7 +914,7 @@ sequential scan would be faster.<P>
For column-specific optimization statistics, use <SMALL>VACUUM For column-specific optimization statistics, use <SMALL>VACUUM
ANALYZE.</SMALL> <SMALL>VACUUM ANALYZE</SMALL> is important for complex ANALYZE.</SMALL> <SMALL>VACUUM ANALYZE</SMALL> is important for complex
multi-join queries, so the optimizer can estimate the number of rows multijoin queries, so the optimizer can estimate the number of rows
returned from each table, and choose the proper join order. The backend returned from each table, and choose the proper join order. The backend
does not keep track of column statistics on its own, so <SMALL>VACUUM does not keep track of column statistics on its own, so <SMALL>VACUUM
ANALYZE</SMALL> must be run to collect them periodically.<P> ANALYZE</SMALL> must be run to collect them periodically.<P>
...@@ -941,20 +941,20 @@ An R-tree index is used for indexing spatial data. A hash index can't ...@@ -941,20 +941,20 @@ An R-tree index is used for indexing spatial data. A hash index can't
handle range searches. A B-tree index only handles range searches in a handle range searches. A B-tree index only handles range searches in a
single dimension. R-tree's can handle multi-dimensional data. For single dimension. R-tree's can handle multi-dimensional data. For
example, if an R-tree index can be built on an attribute of type <I>point,</I> example, if an R-tree index can be built on an attribute of type <I>point,</I>
the system can more efficient answer queries like select all points the system can more efficiently answer queries such as "select all points
within a bounding rectangle.<P> within a bounding rectangle."<P>
The canonical paper that describes the original R-Tree design is:<P> The canonical paper that describes the original R-tree design is:<P>
Guttman, A. "R-Trees: A Dynamic Index Structure for Spatial Searching." Guttman, A. "R-trees: A Dynamic Index Structure for Spatial Searching."
Proc of the 1984 ACM SIGMOD Int'l Conf on Mgmt of Data, 45-57.<P> Proc of the 1984 ACM SIGMOD Int'l Conf on Mgmt of Data, 45-57.<P>
You can also find this paper in Stonebraker's "Readings in Database You can also find this paper in Stonebraker's "Readings in Database
Systems"<P> Systems"<P>
Builtin R-Trees can handle polygons and boxes. In theory, R-trees can Built-in R-trees can handle polygons and boxes. In theory, R-trees can
be extended to handle higher number of dimensions. In practice, be extended to handle higher number of dimensions. In practice,
extending R-trees require a bit of work and we don't currently have any extending R-trees requires a bit of work and we don't currently have any
documentation on how to do it.<P> documentation on how to do it.<P>
...@@ -964,13 +964,13 @@ Optimization?</H4><P> ...@@ -964,13 +964,13 @@ Optimization?</H4><P>
The GEQO module speeds query The GEQO module speeds query
optimization when joining many tables by means of a Genetic optimization when joining many tables by means of a Genetic
Algorithm (GA). It allows the handling of large join queries through Algorithm (GA). It allows the handling of large join queries through
non-exhaustive search.<P> nonexhaustive search.<P>
<H4><A NAME="4.13">4.13</A>) How do I do regular expression searches and <H4><A NAME="4.13">4.13</A>) How do I do regular expression searches and
case-insensitive regular expression searches?</H4><P> case-insensitive regular expression searches?</H4><P>
The <I>~</I> operator does regular-expression matching, and <I>~*</I> The <I>~</I> operator does regular expression matching, and <I>~*</I>
does case-insensitive regular-expression matching. There is no does case-insensitive regular expression matching. There is no
case-insensitive variant of the LIKE operator, but you can get the case-insensitive variant of the LIKE operator, but you can get the
effect of case-insensitive <SMALL>LIKE</SMALL> with this: effect of case-insensitive <SMALL>LIKE</SMALL> with this:
<PRE> <PRE>
...@@ -999,7 +999,7 @@ BYTEA bytea variable-length array of bytes ...@@ -999,7 +999,7 @@ BYTEA bytea variable-length array of bytes
You will see the internal name when examining system catalogs You will see the internal name when examining system catalogs
and in some error messages.<P> and in some error messages.<P>
The last four types above are "varlena" types (i.e. the first four bytes The last four types above are "varlena" types (i.e., the first four bytes
are the length, followed by the data). <I>char(#)</I> allocates the are the length, followed by the data). <I>char(#)</I> allocates the
maximum number of bytes no matter how much data is stored in the field. maximum number of bytes no matter how much data is stored in the field.
<I>text, varchar(#),</I> and <I>bytea</I> all have variable length on the disk, <I>text, varchar(#),</I> and <I>bytea</I> all have variable length on the disk,
...@@ -1043,17 +1043,26 @@ One approach is to to retrieve the next SERIAL value from the sequence object wi ...@@ -1043,17 +1043,26 @@ One approach is to to retrieve the next SERIAL value from the sequence object wi
$newSerialID = nextval('person_id_seq'); $newSerialID = nextval('person_id_seq');
INSERT INTO person (id, name) VALUES ($newSerialID, 'Blaise Pascal'); INSERT INTO person (id, name) VALUES ($newSerialID, 'Blaise Pascal');
</PRE> </PRE>
You would then also have the new value stored in <CODE>$newSerialID</CODE> for use in other queries (e.g., as a foreign key to the <CODE>person</CODE> table). Note that the name of the automatically-created SEQUENCE object will be named &lt<I>table</I>&gt_&lt<I>serialcolumn</I>&gt_<I>seq</I>, where <I>table</I> and <I>serialcolumn</I> are the names of your table and your SERIAL column, respectively.
You would then also have the new value stored in
<CODE>$newSerialID</CODE> for use in other queries (e.g., as a foreign
key to the <CODE>person</CODE> table). Note that the name of the
automatically created SEQUENCE object will be named
&lt<I>table</I>&gt_&lt<I>serialcolumn</I>&gt_<I>seq</I>, where
<I>table</I> and <I>serialcolumn</I> are the names of your table and
your SERIAL column, respectively.
<P> <P>
Alternatively, you could retrieve the just-assigned SERIAL value with the <I>currval</I>() function <I>after</I> it was inserted by default, e.g., Alternatively, you could retrieve the assigned SERIAL value with the <I>currval</I>() function <I>after</I> it was inserted by default, e.g.,
<PRE> <PRE>
INSERT INTO person (name) VALUES ('Blaise Pascal'); INSERT INTO person (name) VALUES ('Blaise Pascal');
$newID = currval('person_id_seq'); $newID = currval('person_id_seq');
</PRE> </PRE>
Finally, you could use the <A HREF="#4.17"><small>OID</small></A> returned from the
INSERT statement to lookup the default value, though this is probably Finally, you could use the <A HREF="#4.17"><small>OID</small></A>
the least portable approach. In Perl, using DBI with Edmund Mergl's returned from the INSERT statement to look up the default value, though
DBD::Pg module, the oid value is made available via this is probably the least portable approach. In Perl, using DBI with
Edmund Mergl's DBD::Pg module, the oid value is made available via
<I>$sth-&gt;{pg_oid_status} after $sth-&gt;execute().</I> <I>$sth-&gt;{pg_oid_status} after $sth-&gt;execute().</I>
<H4><A NAME="4.16.3">4.16.3</A>) Don't <I>currval()</I> and <I>nextval()</I> lead to <H4><A NAME="4.16.3">4.16.3</A>) Don't <I>currval()</I> and <I>nextval()</I> lead to
...@@ -1065,12 +1074,13 @@ No. This is handled by the backends. ...@@ -1065,12 +1074,13 @@ No. This is handled by the backends.
<H4><A NAME="4.17">4.17</A>) What is an <small>OID</small>? What is a <H4><A NAME="4.17">4.17</A>) What is an <small>OID</small>? What is a
<small>TID</small>?</H4><P> <small>TID</small>?</H4><P>
<small>OID</small>s are PostgreSQL's answer to unique row ids. Every row that is <small>OID</small>s are PostgreSQL's answer to unique row ids. Every
created in PostgreSQL gets a unique <small>OID</small>. All <small>OID</small>s generated during row that is created in PostgreSQL gets a unique <small>OID</small>. All
<I>initdb</I> are less than 16384 (from <I>backend/access/transam.h</I>). All <small>OID</small>s generated during <I>initdb</I> are less than 16384
user-created <small>OID</small>s are equal or greater that this. By default, all these (from <I>backend/access/transam.h</I>). All user-created
<small>OID</small>s are unique not only within a table, or database, but unique within <small>OID</small>s are equal to or greater than this. By default, all
the entire PostgreSQL installation.<P> these <small>OID</small>s are unique not only within a table or
database, but unique within the entire PostgreSQL installation.<P>
PostgreSQL uses <small>OID</small>s in its internal system tables to link rows between PostgreSQL uses <small>OID</small>s in its internal system tables to link rows between
tables. These <small>OID</small>s can be used to identify specific user rows and used tables. These <small>OID</small>s can be used to identify specific user rows and used
...@@ -1175,7 +1185,7 @@ Use <i>now()</i>: ...@@ -1175,7 +1185,7 @@ Use <i>now()</i>:
<P> <P>
<H4><A NAME="4.23">4.23</A>) Why are my subqueries using <H4><A NAME="4.23">4.23</A>) Why are my subqueries using
<CODE><small>IN</small></CODE> so slow?<BR></H4><P> <CODE><small>IN</small></CODE> so slow?<BR></H4><P>
Currently, we join subqueries to outer queries by sequential scanning Currently, we join subqueries to outer queries by sequentially scanning
the result of the subquery for each row of the outer query. A workaround the result of the subquery for each row of the outer query. A workaround
is to replace <CODE>IN</CODE> with <CODE>EXISTS</CODE>: is to replace <CODE>IN</CODE> with <CODE>EXISTS</CODE>:
<CODE><PRE> <CODE><PRE>
...@@ -1216,12 +1226,12 @@ does an <i>outer</i> join of the two tables: ...@@ -1216,12 +1226,12 @@ does an <i>outer</i> join of the two tables:
I run it in <I>psql,</I> why does it dump core?</H4><P> I run it in <I>psql,</I> why does it dump core?</H4><P>
The problem could be a number of things. Try testing your user-defined The problem could be a number of things. Try testing your user-defined
function in a stand alone test program first. function in a stand-alone test program first.
<H4><A NAME="5.2">5.2</A>) What does the message <H4><A NAME="5.2">5.2</A>) What does the message
<I>"NOTICE:PortalHeapMemoryFree: 0x402251d0 not in alloc set!"</I> mean?</H4><P> <I>"NOTICE:PortalHeapMemoryFree: 0x402251d0 not in alloc set!"</I> mean?</H4><P>
You are <I>pfree'ing</I> something that was not <I>palloc'ed.</I> You are <I>pfree'</I>ing something that was not <I>palloc'</I>ed.
Beware of mixing <I>malloc/free</I> and <I>palloc/pfree.</I> Beware of mixing <I>malloc/free</I> and <I>palloc/pfree.</I>
......