Commit 8ea39851 authored by Bruce Momjian's avatar Bruce Momjian

Restructure TODO sections.

parent b9fe8ee2
......@@ -2,7 +2,7 @@
PostgreSQL TODO List
====================
Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
Last updated: Mon Jul 4 08:32:37 EDT 2005
Last updated: Mon Jul 4 13:00:23 EDT 2005
The most recent version of this document can be viewed at
http://www.postgresql.org/docs/faqs.TODO.html.
......@@ -22,12 +22,6 @@ Administration
* Remove behavior of postmaster -o after making postmaster/postgres
flags unique
* Allow limits on per-db/role connections
* Allow server log information to be output as INSERT statements
This would allow server log information to be easily loaded into
a database for analysis.
* Prevent dropping user that still owns objects, or auto-drop the objects
* Allow pooled connections to list all prepared queries
This would allow an application inheriting a pooled connection to know
......@@ -35,109 +29,119 @@ Administration
* Allow major upgrades without dump/reload, perhaps using pg_upgrade
[pg_upgrade]
* Allow GRANT/REVOKE permissions to be applied to all schema objects with one
command
The proposed syntax is:
GRANT SELECT ON ALL TABLES IN public TO phpuser;
GRANT SELECT ON NEW TABLES IN public TO phpuser;
* Allow GRANT/REVOKE permissions to be inherited by objects based on
schema permissions
* Check for unreferenced table files created by transactions that were
in-progress when the server terminated abruptly
* Allow reporting of which objects are in which tablespaces
* Allow administrators to safely terminate individual sessions either
via an SQL function or SIGTERM
This item is difficult because a tablespace can contain objects from
multiple databases. There is a server-side function that returns the
databases which use a specific tablespace, so this requires a tool
that will call that function and connect to each database to find the
objects in each database for that tablespace.
Currently SIGTERM of a backend can lead to lock table corruption.
* Allow a database in tablespace t1 with tables created in tablespace t2
to be used as a template for a new database created with default
tablespace t2
* Prevent dropping user that still owns objects, or auto-drop the objects
* Set proper permissions on non-system schemas during db creation
All objects in the default database tablespace must have default tablespace
specifications. This is because new databases are created by copying
directories. If you mix default tablespace tables and tablespace-specified
tables in the same directory, creating a new database from such a mixed
directory would create a new database with tables that had incorrect
explicit tablespaces. To fix this would require modifying pg_class in the
newly copied database, which we don't currently do.
Currently all schemas are owned by the super-user because they are
copied from the template1 database.
* Add a GUC variable to control the tablespace for temporary objects and
sort files
* -Add the client IP address and port to pg_stat_activity
* Support table partitioning that allows a single table to be stored
in subtables that are partitioned based on the primary key or a WHERE
clause
It could start with a random tablespace from a supplied list and cycle
through the list.
* Add ability to monitor the use of temporary sort files
* Allow WAL replay of CREATE TABLESPACE to work when the directory
structure on the recovery computer is different from the original
* Add "include file" functionality in postgresql.conf
* -Add session start time and last statement time to pg_stat_activity
* Allow server logs to be remotely read using SQL commands
* Allow pg_hba.conf settings to be controlled via SQL
* Improve replication solutions
o Load balancing
You can use any of the master/slave replication servers to use a
standby server for data warehousing. To allow read/write queries to
multiple servers, you need multi-master replication like pgcluster.
o Allow replication over unreliable or non-persistent links
* Configuration files
o Add "include file" functionality in postgresql.conf
o Allow postgresql.conf values to be set so they can not be changed
by the user
o Allow commenting of variables in postgresql.conf to restore them
to defaults
o Allow pg_hba.conf settings to be controlled via SQL
This would require a new global table that is dumped to flat file for
use by the postmaster. We do a similar thing for pg_shadow currently.
* Allow administrators to safely terminate individual sessions either
via an SQL function or SIGTERM
Currently SIGTERM of a backend can lead to lock table corruption.
* Tablespaces
* Allow a database in tablespace t1 with tables created in
tablespace t2 to be used as a template for a new database created
with default tablespace t2
All objects in the default database tablespace must have default
tablespace specifications. This is because new databases are
created by copying directories. If you mix default tablespace
tables and tablespace-specified tables in the same directory,
creating a new database from such a mixed directory would create a
new database with tables that had incorrect explicit tablespaces.
To fix this would require modifying pg_class in the newly copied
database, which we don't currently do.
* Allow reporting of which objects are in which tablespaces
This item is difficult because a tablespace can contain objects
from multiple databases. There is a server-side function that
returns the databases which use a specific tablespace, so this
requires a tool that will call that function and connect to each
database to find the objects in each database for that tablespace.
* Un-comment all variables in postgresql.conf
o Add a GUC variable to control the tablespace for temporary objects
and sort files
By not showing commented-out variables, we discourage people from
thinking that re-commenting a variable returns it to its default.
This has to address environment variables that are then overridden
by config file values. Another option is to allow commented values
to return to their default values.
It could start with a random tablespace from a supplied list and
cycle through the list.
o Allow WAL replay of CREATE TABLESPACE to work when the directory
structure on the recovery computer is different from the original
o Allow per-tablespace quotas
* Allow point-in-time recovery to archive partially filled write-ahead
logs [pitr]
Currently only full WAL files are archived. This means that the most
recent transactions aren't available for recovery in case of a disk
failure. This could be triggered by a user command or a timer.
* Point-in-time Recovery (PITR)
o Allow point-in-time recovery to archive partially filled
write-ahead logs [pitr]
* Automatically force archiving of partially-filled WAL files when
Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.
o Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
Doing this will allow administrators to know more easily when the
archive contins all the files needed for point-in-time recovery.
Doing this will allow administrators to know more easily when
the archive contins all the files needed for point-in-time
recovery.
* Create dump tool for write-ahead logs for use in determining
o Create dump tool for write-ahead logs for use in determining
transaction id for point-in-time recovery
* Set proper permissions on non-system schemas during db creation
o Allow a warm standby system to also allow read-only queries
[pitr]
Currently all schemas are owned by the super-user because they are
copied from the template1 database.
This is useful for checking PITR recovery.
* Add a function that returns the 'uptime' of the postmaster
* Allow a warm standby system to also allow read-only queries [pitr]
o Allow the PITR process to be debugged and data examined
This is useful for checking PITR recovery.
* Allow the PITR process to be debugged and data examined
* -Add the client IP address and port to pg_stat_activity
* Improve replication solutions
o Load balancing
Monitoring
==========
You can use any of the master/slave replication servers to use a
standby server for data warehousing. To allow read/write queries to
multiple servers, you need multi-master replication like pgcluster.
* Allow server log information to be output as INSERT statements
o Allow replication over unreliable or non-persistent links
This would allow server log information to be easily loaded into
a database for analysis.
* Support table partitioning that allows a single table to be stored
in subtables that are partitioned based on the primary key or a WHERE
clause
* Allow postgresql.conf values to be set so they can not be changed by
the user
* Allow per-tablespace quotas
* Add ability to monitor the use of temporary sort files
* -Add session start time and last statement time to pg_stat_activity
* Allow server logs to be remotely read using SQL commands
* -Add a function that returns the start time of the postmaster
Data Types
......@@ -156,25 +160,10 @@ Data Types
The positive modulus result returned by NUMERICs might be considered
inaccurate, in one sense.
* Add function to return compressed length of TOAST data values
* Allow INET subnet tests using non-constants to be indexed
* Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
functionality
Current CURRENT_TIMESTAMP returns the start time of the current
transaction, and gettimeofday() returns the wallclock time. This will
make time reporting more consistent and will allow reporting of
the statement start time.
* Have sequence dependency track use of DEFAULT sequences,
seqname.nextval (?)
* Disallow changing default expression of a SERIAL column (?)
seqname.nextval?
* Disallow changing default expression of a SERIAL column?
* Allow infinite dates just like infinite timestamps
* Have initdb set DateStyle based on locale?
* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
* Allow to_char() to print localized month names
* Allow functions to have a schema search path specified at creation time
* Allow substring/replace() to get/set bit values
* Add a GUC variable to allow output of interval values in ISO8601 format
* Fix data types where equality comparison isn't intuitive, e.g. box
* Merge hardwired timezone names with the TZ database; allow either kind
......@@ -195,24 +184,7 @@ Data Types
* Allow INET + INT4 to increment the host part of the address, or
throw an error on overflow
* Add 'tid != tid ' operator for use in corruption recovery
* Prevent to_char() on interval from returning meaningless values
For example, to_char('1 month', 'mon') is meaningless. Basically,
most date-related parameters to to_char() are meaningless for
intervals because interval is not anchored to a date.
* Allow to_char() on interval values to accumulate the highest unit
requested
Some special format flag would be required to request such
accumulation. Such functionality could also be added to EXTRACT.
Prevent accumulation that crosses the month/day boundary because of
the uneven number of days in a month.
o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
o to_char(INTERVAL '3 years 5 months','MM') => 41
* Add ISO INTERVAL handling
o Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO SECOND
......@@ -228,6 +200,7 @@ Data Types
INTERVAL '11 months' AS YEAR should return one or zero
o Support precision, CREATE TABLE foo (a INTERVAL MONTH(3))
* ARRAYS
o Allow NULLs in arrays
o Allow MIN()/MAX() on arrays
......@@ -238,7 +211,7 @@ Data Types
* BINARY DATA
o Improve vacuum of large objects, like /contrib/vacuumlo (?)
o Improve vacuum of large objects, like /contrib/vacuumlo?
o Add security checking for large objects
Currently large objects entries do not have owners. Permissions can
......@@ -251,6 +224,43 @@ Data Types
This requires the TOAST column to be stored EXTERNAL.
Functions
=========
* Add function to return compressed length of TOAST data values
* Allow INET subnet tests using non-constants to be indexed
* Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
functionality
Current CURRENT_TIMESTAMP returns the start time of the current
transaction, and gettimeofday() returns the wallclock time. This will
make time reporting more consistent and will allow reporting of
the statement start time.
* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
* Allow to_char() to print localized month names
* Allow functions to have a schema search path specified at creation time
* Allow substring/replace() to get/set bit values
* Allow to_char() on interval values to accumulate the highest unit
requested
Some special format flag would be required to request such
accumulation. Such functionality could also be added to EXTRACT.
Prevent accumulation that crosses the month/day boundary because of
the uneven number of days in a month.
o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
o to_char(INTERVAL '3 years 5 months','MM') => 41
* Prevent to_char() on interval from returning meaningless values
For example, to_char('1 month', 'mon') is meaningless. Basically,
most date-related parameters to to_char() are meaningless for
intervals because interval is not anchored to a date.
Multi-Language Support
======================
......@@ -267,7 +277,7 @@ Multi-Language Support
Right now only one encoding is allowed per database.
* Support multiple simultaneous character sets, per SQL92
* Improve UTF8 combined character handling (?)
* Improve UTF8 combined character handling?
* Add octet_length_server() and octet_length_client()
* Make octet_length_client() the same as octet_length()?
......@@ -287,87 +297,6 @@ Views / Rules
* Allow RULE recompilation
Indexes
=======
* Allow inherited tables to inherit index, UNIQUE constraint, and primary
key, foreign key
* UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
(dup) should fail
The main difficulty with this item is the problem of creating an index
that can span more than one table.
* Add UNIQUE capability to non-btree indexes
* Add more gist index support for geometric data types
* -Use indexes for MIN() and MAX()
MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
BY col {DESC} LIMIT 1. Completing this item involves doing this
transformation automatically.
* -Use index to restrict rows returned by multi-key index when used with
non-consecutive keys to reduce heap accesses
For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
col3 = 9, spin though the index checking for col1 and col3 matches,
rather than just col1; also called skip-scanning.
* Prevent index uniqueness checks when UPDATE does not modify the column
Uniqueness (index) checks are done when updating a column even if the
column is not modified by the UPDATE.
* Fetch heap pages matching index entries in sequential order
Rather than randomly accessing heap pages based on index entries, mark
heap pages needing access in a bitmap and do the lookups in sequential
order. Another method would be to sort heap ctids matching the index
before accessing the heap rows.
* -Allow non-bitmap indexes to be combined by creating bitmaps in memory
This feature allows separate indexes to be ANDed or ORed together. This
is particularly useful for data warehousing applications that need to
query the database in an many permutations. This feature scans an index
and creates an in-memory bitmap, and allows that bitmap to be combined
with other bitmap created in a similar way. The bitmap can either index
all TIDs, or be lossy, meaning it records just page numbers and each
page tuple has to be checked for validity in a separate pass.
* Allow the creation of on-disk bitmap indexes which can be quickly
combined with other bitmap indexes
Such indexes could be more compact if there are only a few distinct values.
Such indexes can also be compressed. Keeping such indexes updated can be
costly.
* Allow use of indexes to search for NULLs
One solution is to create a partial index on an IS NULL expression.
* -Add concurrency to GIST
* Pack hash index buckets onto disk pages more efficiently
Currently no only one hash bucket can be stored on a page. Ideally
several hash buckets could be stored on a single page and greater
granularity used for the hash algorithm.
* Consider sorting hash buckets so entries can be found using a binary
search, rather than a linear scan
* In hash indexes, consider storing the hash value with or instead
of the key itself
* Allow accurate statistics to be collected on indexes with more than
one column or expression indexes, perhaps using per-index statistics
* Add fillfactor to control reserved free space during index creation
* Allow the creation of indexes with mixed ascending/descending specifiers
* -Fix incorrect rtree results due to wrong assumptions about "over"
operator semantics
* Allow GIST indexes to create certain complex index types, like digital
trees (see Aoki)
Commands
========
......@@ -375,12 +304,9 @@ Commands
* Change LIMIT/OFFSET and FETCH/MOVE to use int8
* Allow CREATE TABLE AS to determine column lengths for complex
expressions like SELECT col1 || col2
* Allow UPDATE to handle complex aggregates [update] (?)
* Allow backslash handling in quoted strings to be disabled for portability
The use of C-style backslashes (.e.g. \n, \r) in quoted strings is not
SQL-spec compliant, so allow such handling to be disabled. However,
disabling backslashes could break many third-party applications and tools.
* Allow UPDATE to handle complex aggregates [update]?
* -Add E'' escape string marker so eventually ordinary strings can treat
backslashes literally, for portability
* Allow an alias to be provided for the target table in UPDATE/DELETE
......@@ -392,6 +318,7 @@ Commands
functionality in DELETE. It's been agreed that the keyword should
be USING, to avoid anything as confusing as DELETE FROM a FROM b.
* Have initdb set DateStyle based on locale?
* Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
* -Allow REINDEX to rebuild all database indexes
* Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
......@@ -427,7 +354,7 @@ Commands
Currently the system uses the operating system COPY command to create
a new database.
* Add C code to copy directories for use in creating new databases
* Add C code on Unix to copy directories for use in creating new databases
* Have pg_ctl look at PGHOST in case it is a socket directory?
* Allow pg_ctl to work properly with configuration files located outside
the PGDATA directory
......@@ -437,7 +364,6 @@ Commands
allow pg_ctl to read and understand postgresql.conf to find the
data_directory value.
* Allow column-level GRANT/REVOKE privileges
* Add a GUC variable to warn about non-standard SQL usage in queries
* Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
triggers?)
......@@ -446,7 +372,7 @@ Commands
creation
* Add COMMENT ON for all cluster global objects (roles, databases
and tablespaces)
* Add an option to automatically use savepoints for each statement in a
* -Add an option to automatically use savepoints for each statement in a
multi-statement transaction.
When enabled, this would allow errors in multi-statement transactions
......@@ -467,6 +393,7 @@ Commands
* Allow FOR UPDATE queries to do NOWAIT locks
* Add GUC to issue notice about queries that use unjoined tables
* ALTER
o Have ALTER TABLE RENAME rename SERIAL sequence names
o Add ALTER DOMAIN TYPE
......@@ -509,12 +436,25 @@ Commands
processed, with ROLLBACK on COPY failure.
o -Allow COPY to understand \x as a hex byte
o Have COPY return the number of rows loaded/unloaded (?)
o Have COPY return the number of rows loaded/unloaded?
o -Allow COPY to optionally include column headings in the first line
o -Allow COPY FROM ... CSV to interpret newlines and carriage
returns in data
* GRANT/REVOKE
o Allow column-level privileges
* Allow GRANT/REVOKE permissions to be applied to all schema objects with one
command
The proposed syntax is:
GRANT SELECT ON ALL TABLES IN public TO phpuser;
GRANT SELECT ON NEW TABLES IN public TO phpuser;
* Allow GRANT/REVOKE permissions to be inherited by objects based on
schema permissions
* CURSOR
o Allow UPDATE/DELETE WHERE CURRENT OF cursor
......@@ -524,7 +464,7 @@ Commands
and no FOR UPDATE lock.
o Prevent DROP TABLE from dropping a row referenced by its own open
cursor (?)
cursor?
o Allow pooled connections to list all open WITH HOLD cursors
......@@ -546,7 +486,7 @@ Commands
o -Have SHOW ALL show descriptions for server-side variables
o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
ANALYZE, and CLUSTER
o Add SET PATH for schemas (?)
o Add SET PATH for schemas?
This is basically the same as SET search_path.
......@@ -567,7 +507,7 @@ Commands
o Allow function parameters to be passed by name,
get_employee_salary(emp_id => 12345, tax_year => 2001)
o Add Oracle-style packages
o Add table function support to pltcl, plperl, plpython (?)
o Add table function support to pltcl, plperl, plpython?
o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
o -Allow PL/pgSQL EXECUTE query_var INTO record_var;
o Add capability to create and call PROCEDURES
......@@ -581,7 +521,7 @@ Clients
=======
* Add a libpq function to support Parse/DescribeStatement capability
* Prevent libpq's PQfnumber() from lowercasing the column name (?)
* Prevent libpq's PQfnumber() from lowercasing the column name?
* Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
This would be used for checking if the server is up.
......@@ -597,6 +537,8 @@ Clients
* Allow psql \pset boolean variables to set to fixed values, rather than toggle
* Consistently display privilege information for all objects in psql
* Improve psql's handling of multi-line queries
* pg_dump
o Have pg_dump use multi-statement transactions for INSERT dumps
o Allow pg_dump to use multiple -t and -n switches [pg_dump]
......@@ -616,13 +558,14 @@ Clients
o Add CSV output format
o Update pg_dump and psql to use the new COPY libpq API (Christopher)
* ECPG
o Docs
Document differences between ecpg and the SQL standard and
information about the Informix-compatibility module.
o Solve cardinality > 1 for input descriptors / variables (?)
o Solve cardinality > 1 for input descriptors / variables?
o Add a semantic check level, e.g. check if a table really exists
o fix handling of DB attributes that are arrays
o Use backend PREPARE/EXECUTE facility for ecpg where possible
......@@ -647,8 +590,8 @@ Referential Integrity
* -Implement shared row locks and use them in RI triggers
* Enforce referential integrity for system tables
* Change foreign key constraint for array -> element to mean element
in array (?)
* Allow DEFERRABLE UNIQUE constraints (?)
in array?
* Allow DEFERRABLE UNIQUE constraints?
* Allow triggers to be disabled [trigger]
Currently the only way to disable triggers is to modify the system
......@@ -710,8 +653,89 @@ Exotic Features
o Allow nested schemas
PERFORMANCE
===========
Indexes
=======
* Allow inherited tables to inherit index, UNIQUE constraint, and primary
key, foreign key
* UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
(dup) should fail
The main difficulty with this item is the problem of creating an index
that can span more than one table.
* Add UNIQUE capability to non-btree indexes
* Add more GIST index support for geometric data types
* -Use indexes for MIN() and MAX()
MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
BY col {DESC} LIMIT 1. Completing this item involves doing this
transformation automatically.
* -Use index to restrict rows returned by multi-key index when used with
non-consecutive keys to reduce heap accesses
For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
col3 = 9, spin though the index checking for col1 and col3 matches,
rather than just col1; also called skip-scanning.
* Prevent index uniqueness checks when UPDATE does not modify the column
Uniqueness (index) checks are done when updating a column even if the
column is not modified by the UPDATE.
* Fetch heap pages matching index entries in sequential order
Rather than randomly accessing heap pages based on index entries, mark
heap pages needing access in a bitmap and do the lookups in sequential
order. Another method would be to sort heap ctids matching the index
before accessing the heap rows.
* -Allow non-bitmap indexes to be combined by creating bitmaps in memory
This feature allows separate indexes to be ANDed or ORed together. This
is particularly useful for data warehousing applications that need to
query the database in an many permutations. This feature scans an index
and creates an in-memory bitmap, and allows that bitmap to be combined
with other bitmap created in a similar way. The bitmap can either index
all TIDs, or be lossy, meaning it records just page numbers and each
page tuple has to be checked for validity in a separate pass.
* Allow the creation of on-disk bitmap indexes which can be quickly
combined with other bitmap indexes
Such indexes could be more compact if there are only a few distinct values.
Such indexes can also be compressed. Keeping such indexes updated can be
costly.
* Allow use of indexes to search for NULLs
One solution is to create a partial index on an IS NULL expression.
* -Add concurrency to GIST
* Allow accurate statistics to be collected on indexes with more than
one column or expression indexes, perhaps using per-index statistics
* Add fillfactor to control reserved free space during index creation
* Allow the creation of indexes with mixed ascending/descending specifiers
* -Fix incorrect rtree results due to wrong assumptions about "over"
operator semantics
* Allow GIST indexes to create certain complex index types, like digital
trees (see Aoki)
* Hash
o Pack hash index buckets onto disk pages more efficiently
Currently no only one hash bucket can be stored on a page. Ideally
several hash buckets could be stored on a single page and greater
granularity used for the hash algorithm.
o Consider sorting hash buckets so entries can be found using a
binary search, rather than a linear scan
o In hash indexes, consider storing the hash value with or instead
of the key itself
Fsync
......@@ -724,8 +748,8 @@ Fsync
* Add program to test if fsync has a delay compared to non-fsync
Cache
=====
Cache Usage
===========
* Allow free-behind capability for large sequential scans, perhaps using
posix_fadvise()
......@@ -848,8 +872,8 @@ Locking
* Fix priority ordering of read and write light-weight locks (Neil)
Startup Time
============
Startup Time Improvements
=========================
* Experiment with multi-threaded backend [thread]
......@@ -877,7 +901,7 @@ Write-Ahead Log
eliminated from point-in-time archive files.
* Reduce WAL traffic so only modified values are written rather than
entire rows (?)
entire rows?
* Turn off full page writes if fsync is disabled
If fsync is off, there is no purpose in writing full pages to WAL
......@@ -934,18 +958,18 @@ Optimizer / Executor
* ANALYZE should record a pg_statistic entry for an all-NULL column
* Log queries where the optimizer row estimates were dramatically
different from the number of rows actually found (?)
different from the number of rows actually found?
Miscellaneous
=============
Miscellaneous Performance
=========================
* Do async I/O for faster random read-ahead of data
Async I/O allows multiple I/O requests to be sent to the disk with
results coming back asynchronously.
* Use mmap() rather than SYSV shared memory or to write WAL files (?)
* Use mmap() rather than SYSV shared memory or to write WAL files?
This would remove the requirement for SYSV SHM but would introduce
portability issues. Anonymous mmap (or mmap to /dev/zero) is required
......
......@@ -8,7 +8,7 @@
<body bgcolor="#FFFFFF" text="#000000" link="#FF0000" vlink="#A00000" alink="#0000FF">
<h1><a name="section_1">PostgreSQL TODO List</a></h1>
<p>Current maintainer: Bruce Momjian (<a href="mailto:pgman@candle.pha.pa.us">pgman@candle.pha.pa.us</a>)<br/>
Last updated: Mon Jul 4 08:32:37 EDT 2005
Last updated: Mon Jul 4 13:00:23 EDT 2005
</p>
<p>The most recent version of this document can be viewed at<br/>
<a href="http://www.postgresql.org/docs/faqs.TODO.html">http://www.postgresql.org/docs/faqs.TODO.html</a>.
......@@ -27,94 +27,27 @@ first.
<li>Remove behavior of postmaster -o after making postmaster/postgres
flags unique
</li><li>Allow limits on per-db/role connections
</li><li>Allow server log information to be output as INSERT statements
<p> This would allow server log information to be easily loaded into
a database for analysis.
</p>
</li><li>Prevent dropping user that still owns objects, or auto-drop the objects
</li><li>Allow pooled connections to list all prepared queries
<p> This would allow an application inheriting a pooled connection to know
the queries prepared in the current session.
</p>
</li><li>Allow major upgrades without dump/reload, perhaps using pg_upgrade
[<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pg_upgrade">pg_upgrade</a>]
</li><li>Allow GRANT/REVOKE permissions to be applied to all schema objects with one
command
<p> The proposed syntax is:
</p><p> GRANT SELECT ON ALL TABLES IN public TO phpuser;
GRANT SELECT ON NEW TABLES IN public TO phpuser;
</p>
</li><li>Allow GRANT/REVOKE permissions to be inherited by objects based on
schema permissions
</li><li>Check for unreferenced table files created by transactions that were
in-progress when the server terminated abruptly
</li><li>Allow reporting of which objects are in which tablespaces
<p> This item is difficult because a tablespace can contain objects from
multiple databases. There is a server-side function that returns the
databases which use a specific tablespace, so this requires a tool
that will call that function and connect to each database to find the
objects in each database for that tablespace.
</p>
</li><li>Allow a database in tablespace t1 with tables created in tablespace t2
to be used as a template for a new database created with default
tablespace t2
<p> All objects in the default database tablespace must have default tablespace
specifications. This is because new databases are created by copying
directories. If you mix default tablespace tables and tablespace-specified
tables in the same directory, creating a new database from such a mixed
directory would create a new database with tables that had incorrect
explicit tablespaces. To fix this would require modifying pg_class in the
newly copied database, which we don't currently do.
</p>
</li><li>Add a GUC variable to control the tablespace for temporary objects and
sort files
<p> It could start with a random tablespace from a supplied list and cycle
through the list.
</p>
</li><li>Add ability to monitor the use of temporary sort files
</li><li>Allow WAL replay of CREATE TABLESPACE to work when the directory
structure on the recovery computer is different from the original
</li><li>Add "include file" functionality in postgresql.conf
</li><li>-<em>Add session start time and last statement time to pg_stat_activity</em>
</li><li>Allow server logs to be remotely read using SQL commands
</li><li>Allow pg_hba.conf settings to be controlled via SQL
<p> This would require a new global table that is dumped to flat file for
use by the postmaster. We do a similar thing for pg_shadow currently.
</p>
</li><li>Allow administrators to safely terminate individual sessions either
via an SQL function or SIGTERM
<p> Currently SIGTERM of a backend can lead to lock table corruption.
</p>
</li><li>Un-comment all variables in postgresql.conf
<p> By not showing commented-out variables, we discourage people from
thinking that re-commenting a variable returns it to its default.
This has to address environment variables that are then overridden
by config file values. Another option is to allow commented values
to return to their default values.
</p>
</li><li>Allow point-in-time recovery to archive partially filled write-ahead
logs [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
<p> Currently only full WAL files are archived. This means that the most
recent transactions aren't available for recovery in case of a disk
failure. This could be triggered by a user command or a timer.
</p>
</li><li>Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
<p> Doing this will allow administrators to know more easily when the
archive contins all the files needed for point-in-time recovery.
</p>
</li><li>Create dump tool for write-ahead logs for use in determining
transaction id for point-in-time recovery
</li><li>Prevent dropping user that still owns objects, or auto-drop the objects
</li><li>Set proper permissions on non-system schemas during db creation
<p> Currently all schemas are owned by the super-user because they are
copied from the template1 database.
</p>
</li><li>Add a function that returns the 'uptime' of the postmaster
</li><li>Allow a warm standby system to also allow read-only queries [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
<p> This is useful for checking PITR recovery.
</p>
</li><li>Allow the PITR process to be debugged and data examined
</li><li>-<em>Add the client IP address and port to pg_stat_activity</em>
</li><li>Support table partitioning that allows a single table to be stored
in subtables that are partitioned based on the primary key or a WHERE
clause
</li><li>Improve replication solutions
<ul>
<li>Load balancing
......@@ -124,14 +57,87 @@ first.
</p>
</li><li>Allow replication over unreliable or non-persistent links
</li></ul>
</li><li>Support table partitioning that allows a single table to be stored
in subtables that are partitioned based on the primary key or a WHERE
clause
</li><li>Allow postgresql.conf values to be set so they can not be changed by
the user
</li><li>Configuration files
<ul>
<li>Add "include file" functionality in postgresql.conf
</li><li>Allow postgresql.conf values to be set so they can not be changed
by the user
</li><li>Allow commenting of variables in postgresql.conf to restore them
to defaults
</li><li>Allow pg_hba.conf settings to be controlled via SQL
<p> This would require a new global table that is dumped to flat file for
use by the postmaster. We do a similar thing for pg_shadow currently.
</p>
</li></ul>
</li><li>Tablespaces
<ul>
<li>Allow a database in tablespace t1 with tables created in
tablespace t2 to be used as a template for a new database created
with default tablespace t2
<p> All objects in the default database tablespace must have default
tablespace specifications. This is because new databases are
created by copying directories. If you mix default tablespace
tables and tablespace-specified tables in the same directory,
creating a new database from such a mixed directory would create a
new database with tables that had incorrect explicit tablespaces.
To fix this would require modifying pg_class in the newly copied
database, which we don't currently do.
</p>
</li><li>Allow reporting of which objects are in which tablespaces
<p> This item is difficult because a tablespace can contain objects
from multiple databases. There is a server-side function that
returns the databases which use a specific tablespace, so this
requires a tool that will call that function and connect to each
database to find the objects in each database for that tablespace.
</p>
<ul>
<li>Add a GUC variable to control the tablespace for temporary objects
and sort files
<p> It could start with a random tablespace from a supplied list and
cycle through the list.
</p>
</li><li>Allow WAL replay of CREATE TABLESPACE to work when the directory
structure on the recovery computer is different from the original
</li><li>Allow per-tablespace quotas
</li></ul>
</li></ul>
</li><li>Point-in-time Recovery (PITR)
<ul>
<li>Allow point-in-time recovery to archive partially filled
write-ahead logs [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
<p> Currently only full WAL files are archived. This means that the
most recent transactions aren't available for recovery in case
of a disk failure. This could be triggered by a user command or
a timer.
</p>
</li><li>Automatically force archiving of partially-filled WAL files when
pg_stop_backup() is called or the server is stopped
<p> Doing this will allow administrators to know more easily when
the archive contins all the files needed for point-in-time
recovery.
</p>
</li><li>Create dump tool for write-ahead logs for use in determining
transaction id for point-in-time recovery
</li><li>Allow a warm standby system to also allow read-only queries
[<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?pitr">pitr</a>]
<p> This is useful for checking PITR recovery.
</p>
</li><li>Allow the PITR process to be debugged and data examined
</li></ul>
</li></ul>
<h1><a name="section_3">Monitoring</a></h1>
<ul>
<li>Allow server log information to be output as INSERT statements
<p> This would allow server log information to be easily loaded into
a database for analysis.
</p>
</li><li>Add ability to monitor the use of temporary sort files
</li><li>-<em>Add session start time and last statement time to pg_stat_activity</em>
</li><li>Allow server logs to be remotely read using SQL commands
</li><li>-<em>Add a function that returns the start time of the postmaster</em>
</li></ul>
<h1><a name="section_3">Data Types</a></h1>
<h1><a name="section_4">Data Types</a></h1>
<ul>
<li>Remove Money type, add money formatting for decimal type
......@@ -148,24 +154,10 @@ first.
inaccurate, in one sense.
</p>
<ul>
<li>Add function to return compressed length of TOAST data values
</li><li>Allow INET subnet tests using non-constants to be indexed
</li><li>Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
functionality
<p> Current CURRENT_TIMESTAMP returns the start time of the current
transaction, and gettimeofday() returns the wallclock time. This will
make time reporting more consistent and will allow reporting of
the statement start time.
</p>
</li><li>Have sequence dependency track use of DEFAULT sequences,
seqname.nextval (?)
</li><li>Disallow changing default expression of a SERIAL column (?)
<li>Have sequence dependency track use of DEFAULT sequences,
seqname.nextval?
</li><li>Disallow changing default expression of a SERIAL column?
</li><li>Allow infinite dates just like infinite timestamps
</li><li>Have initdb set DateStyle based on locale?
</li><li>Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
</li><li>Allow to_char() to print localized month names
</li><li>Allow functions to have a schema search path specified at creation time
</li><li>Allow substring/replace() to get/set bit values
</li><li>Add a GUC variable to allow output of interval values in ISO8601 format
</li><li>Fix data types where equality comparison isn't intuitive, e.g. box
</li><li>Merge hardwired timezone names with the TZ database; allow either kind
......@@ -185,24 +177,6 @@ first.
</li><li>Allow INET + INT4 to increment the host part of the address, or
throw an error on overflow
</li><li>Add 'tid != tid ' operator for use in corruption recovery
</li><li>Prevent to_char() on interval from returning meaningless values
<p> For example, to_char('1 month', 'mon') is meaningless. Basically,
most date-related parameters to to_char() are meaningless for
intervals because interval is not anchored to a date.
</p>
</li><li>Allow to_char() on interval values to accumulate the highest unit
requested
<p> Some special format flag would be required to request such
accumulation. Such functionality could also be added to EXTRACT.
Prevent accumulation that crosses the month/day boundary because of
the uneven number of days in a month.
</p>
<ul>
<li>to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65
</li><li>to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600
</li><li>to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20
</li><li>to_char(INTERVAL '3 years 5 months','MM') =&gt; 41
</li></ul>
</li><li>Add ISO INTERVAL handling
<ul>
<li>Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO SECOND
......@@ -229,7 +203,7 @@ first.
</li></ul>
</li><li>BINARY DATA
<ul>
<li>Improve vacuum of large objects, like /contrib/vacuumlo (?)
<li>Improve vacuum of large objects, like /contrib/vacuumlo?
</li><li>Add security checking for large objects
<p> Currently large objects entries do not have owners. Permissions can
only be set at the pg_largeobject table level.
......@@ -240,7 +214,42 @@ first.
</p>
</li></ul>
</li></ul>
<h1><a name="section_4">Multi-Language Support</a></h1>
<h1><a name="section_5">Functions</a></h1>
<ul>
<li>Add function to return compressed length of TOAST data values
</li><li>Allow INET subnet tests using non-constants to be indexed
</li><li>Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
functionality
<p> Current CURRENT_TIMESTAMP returns the start time of the current
transaction, and gettimeofday() returns the wallclock time. This will
make time reporting more consistent and will allow reporting of
the statement start time.
</p>
</li><li>Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
</li><li>Allow to_char() to print localized month names
</li><li>Allow functions to have a schema search path specified at creation time
</li><li>Allow substring/replace() to get/set bit values
</li><li>Allow to_char() on interval values to accumulate the highest unit
requested
<p> Some special format flag would be required to request such
accumulation. Such functionality could also be added to EXTRACT.
Prevent accumulation that crosses the month/day boundary because of
the uneven number of days in a month.
</p>
<ul>
<li>to_char(INTERVAL '1 hour 5 minutes', 'MI') =&gt; 65
</li><li>to_char(INTERVAL '43 hours 20 minutes', 'MI' ) =&gt; 2600
</li><li>to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') =&gt; 0:1:19:20
</li><li>to_char(INTERVAL '3 years 5 months','MM') =&gt; 41
</li></ul>
</li><li>Prevent to_char() on interval from returning meaningless values
<p> For example, to_char('1 month', 'mon') is meaningless. Basically,
most date-related parameters to to_char() are meaningless for
intervals because interval is not anchored to a date.
</p>
</li></ul>
<h1><a name="section_6">Multi-Language Support</a></h1>
<ul>
<li>Add NCHAR (as distinguished from ordinary varchar),
......@@ -254,11 +263,11 @@ first.
<p> Right now only one encoding is allowed per database.
</p>
</li><li>Support multiple simultaneous character sets, per SQL92
</li><li>Improve UTF8 combined character handling (?)
</li><li>Improve UTF8 combined character handling?
</li><li>Add octet_length_server() and octet_length_client()
</li><li>Make octet_length_client() the same as octet_length()?
</li></ul>
<h1><a name="section_5">Views / Rules</a></h1>
<h1><a name="section_7">Views / Rules</a></h1>
<ul>
<li>Automatically create rules on views so they are updateable, per SQL99
......@@ -271,90 +280,16 @@ first.
</li><li>Allow temporary views on non-temporary tables
</li><li>Allow RULE recompilation
</li></ul>
<h1><a name="section_6">Indexes</a></h1>
<ul>
<li>Allow inherited tables to inherit index, UNIQUE constraint, and primary
key, foreign key
</li><li>UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
(dup) should fail
<p> The main difficulty with this item is the problem of creating an index
that can span more than one table.
</p>
</li><li>Add UNIQUE capability to non-btree indexes
</li><li>Add more gist index support for geometric data types
</li><li>-<em>Use indexes for MIN() and MAX()</em>
<p> MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
BY col {DESC} LIMIT 1. Completing this item involves doing this
transformation automatically.
</p>
</li><li>-<em>Use index to restrict rows returned by multi-key index when used with</em>
non-consecutive keys to reduce heap accesses
<p> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
col3 = 9, spin though the index checking for col1 and col3 matches,
rather than just col1; also called skip-scanning.
</p>
</li><li>Prevent index uniqueness checks when UPDATE does not modify the column
<p> Uniqueness (index) checks are done when updating a column even if the
column is not modified by the UPDATE.
</p>
</li><li>Fetch heap pages matching index entries in sequential order
<p> Rather than randomly accessing heap pages based on index entries, mark
heap pages needing access in a bitmap and do the lookups in sequential
order. Another method would be to sort heap ctids matching the index
before accessing the heap rows.
</p>
</li><li>-<em>Allow non-bitmap indexes to be combined by creating bitmaps in memory</em>
<p> This feature allows separate indexes to be ANDed or ORed together. This
is particularly useful for data warehousing applications that need to
query the database in an many permutations. This feature scans an index
and creates an in-memory bitmap, and allows that bitmap to be combined
with other bitmap created in a similar way. The bitmap can either index
all TIDs, or be lossy, meaning it records just page numbers and each
page tuple has to be checked for validity in a separate pass.
</p>
</li><li>Allow the creation of on-disk bitmap indexes which can be quickly
combined with other bitmap indexes
<p> Such indexes could be more compact if there are only a few distinct values.
Such indexes can also be compressed. Keeping such indexes updated can be
costly.
</p>
</li><li>Allow use of indexes to search for NULLs
<p> One solution is to create a partial index on an IS NULL expression.
</p>
</li><li>-<em>Add concurrency to GIST</em>
</li><li>Pack hash index buckets onto disk pages more efficiently
<p> Currently no only one hash bucket can be stored on a page. Ideally
several hash buckets could be stored on a single page and greater
granularity used for the hash algorithm.
</p>
</li><li>Consider sorting hash buckets so entries can be found using a binary
search, rather than a linear scan
</li><li>In hash indexes, consider storing the hash value with or instead
of the key itself
</li><li>Allow accurate statistics to be collected on indexes with more than
one column or expression indexes, perhaps using per-index statistics
</li><li>Add fillfactor to control reserved free space during index creation
</li><li>Allow the creation of indexes with mixed ascending/descending specifiers
</li><li>-<em>Fix incorrect rtree results due to wrong assumptions about "over"</em>
operator semantics
</li><li>Allow GIST indexes to create certain complex index types, like digital
trees (see Aoki)
</li></ul>
<h1><a name="section_7">Commands</a></h1>
<h1><a name="section_8">Commands</a></h1>
<ul>
<li>-<em>Add BETWEEN SYMMETRIC/ASYMMETRIC</em>
</li><li>Change LIMIT/OFFSET and FETCH/MOVE to use int8
</li><li>Allow CREATE TABLE AS to determine column lengths for complex
expressions like SELECT col1 || col2
</li><li>Allow UPDATE to handle complex aggregates [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?update">update</a>] (?)
</li><li>Allow backslash handling in quoted strings to be disabled for portability
<p> The use of C-style backslashes (.e.g. \n, \r) in quoted strings is not
SQL-spec compliant, so allow such handling to be disabled. However,
disabling backslashes could break many third-party applications and tools.
</p>
</li><li>Allow UPDATE to handle complex aggregates [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?update">update</a>]?
</li><li>-<em>Add E'' escape string marker so eventually ordinary strings can treat</em>
backslashes literally, for portability
</li><li>Allow an alias to be provided for the target table in UPDATE/DELETE
<p> This is not SQL-spec but many DBMSs allow it.
</p>
......@@ -363,6 +298,7 @@ first.
functionality in DELETE. It's been agreed that the keyword should
be USING, to avoid anything as confusing as DELETE FROM a FROM b.
</p>
</li><li>Have initdb set DateStyle based on locale?
</li><li>Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
</li><li>-<em>Allow REINDEX to rebuild all database indexes</em>
</li><li>Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
......@@ -394,7 +330,7 @@ first.
<p> Currently the system uses the operating system COPY command to create
a new database.
</p>
</li><li>Add C code to copy directories for use in creating new databases
</li><li>Add C code on Unix to copy directories for use in creating new databases
</li><li>Have pg_ctl look at PGHOST in case it is a socket directory?
</li><li>Allow pg_ctl to work properly with configuration files located outside
the PGDATA directory
......@@ -403,7 +339,6 @@ first.
allow pg_ctl to read and understand postgresql.conf to find the
data_directory value.
</p>
</li><li>Allow column-level GRANT/REVOKE privileges
</li><li>Add a GUC variable to warn about non-standard SQL usage in queries
</li><li>Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
triggers?)
......@@ -412,7 +347,7 @@ first.
creation
</li><li>Add COMMENT ON for all cluster global objects (roles, databases
and tablespaces)
</li><li>Add an option to automatically use savepoints for each statement in a
</li><li>-<em>Add an option to automatically use savepoints for each statement in a</em>
multi-statement transaction.
<p> When enabled, this would allow errors in multi-statement transactions
to be automatically ignored.
......@@ -470,11 +405,25 @@ first.
processed, with ROLLBACK on COPY failure.
</p>
</li><li>-<em>Allow COPY to understand \x as a hex byte</em>
</li><li>Have COPY return the number of rows loaded/unloaded (?)
</li><li>Have COPY return the number of rows loaded/unloaded?
</li><li>-<em>Allow COPY to optionally include column headings in the first line</em>
</li><li>-<em>Allow COPY FROM ... CSV to interpret newlines and carriage</em>
returns in data
</li></ul>
</li><li>GRANT/REVOKE
<ul>
<li>Allow column-level privileges
<ul>
<li>Allow GRANT/REVOKE permissions to be applied to all schema objects with one
command
<p> The proposed syntax is:
</p><p> GRANT SELECT ON ALL TABLES IN public TO phpuser;
GRANT SELECT ON NEW TABLES IN public TO phpuser;
</p>
</li><li>Allow GRANT/REVOKE permissions to be inherited by objects based on
schema permissions
</li></ul>
</li></ul>
</li><li>CURSOR
<ul>
<li>Allow UPDATE/DELETE WHERE CURRENT OF cursor
......@@ -484,7 +433,7 @@ first.
and no FOR UPDATE lock.
</p>
</li><li>Prevent DROP TABLE from dropping a row referenced by its own open
cursor (?)
cursor?
</li><li>Allow pooled connections to list all open WITH HOLD cursors
<p> Because WITH HOLD cursors exist outside transactions, this allows
them to be listed so they can be closed.
......@@ -505,7 +454,7 @@ first.
<li>-<em>Have SHOW ALL show descriptions for server-side variables</em>
</li><li>Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
ANALYZE, and CLUSTER
</li><li>Add SET PATH for schemas (?)
</li><li>Add SET PATH for schemas?
<p> This is basically the same as SET search_path.
</p>
</li></ul>
......@@ -524,7 +473,7 @@ first.
</li><li>Allow function parameters to be passed by name,
get_employee_salary(emp_id =&gt; 12345, tax_year =&gt; 2001)
</li><li>Add Oracle-style packages
</li><li>Add table function support to pltcl, plperl, plpython (?)
</li><li>Add table function support to pltcl, plperl, plpython?
</li><li>Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
</li><li>-<em>Allow PL/pgSQL EXECUTE query_var INTO record_var;</em>
</li><li>Add capability to create and call PROCEDURES
......@@ -534,11 +483,11 @@ first.
</li><li>Add support for polymorphic arguments and return types to plperl
</li></ul>
</li></ul>
<h1><a name="section_8">Clients</a></h1>
<h1><a name="section_9">Clients</a></h1>
<ul>
<li>Add a libpq function to support Parse/DescribeStatement capability
</li><li>Prevent libpq's PQfnumber() from lowercasing the column name (?)
</li><li>Prevent libpq's PQfnumber() from lowercasing the column name?
</li><li>Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
<p> This would be used for checking if the server is up.
</p>
......@@ -577,7 +526,7 @@ first.
<p> Document differences between ecpg and the SQL standard and
information about the Informix-compatibility module.
</p>
</li><li>Solve cardinality &gt; 1 for input descriptors / variables (?)
</li><li>Solve cardinality &gt; 1 for input descriptors / variables?
</li><li>Add a semantic check level, e.g. check if a table really exists
</li><li>fix handling of DB attributes that are arrays
</li><li>Use backend PREPARE/EXECUTE facility for ecpg where possible
......@@ -589,7 +538,7 @@ first.
</li><li>Add internationalized message strings
</li></ul>
</li></ul>
<h1><a name="section_9">Referential Integrity</a></h1>
<h1><a name="section_10">Referential Integrity</a></h1>
<ul>
<li>Add MATCH PARTIAL referential integrity
......@@ -601,8 +550,8 @@ first.
</li><li>-<em>Implement shared row locks and use them in RI triggers</em>
</li><li>Enforce referential integrity for system tables
</li><li>Change foreign key constraint for array -&gt; element to mean element
in array (?)
</li><li>Allow DEFERRABLE UNIQUE constraints (?)
in array?
</li><li>Allow DEFERRABLE UNIQUE constraints?
</li><li>Allow triggers to be disabled [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?trigger">trigger</a>]
<p> Currently the only way to disable triggers is to modify the system
tables.
......@@ -624,13 +573,13 @@ first.
to fire triggers.
</p>
</li></ul>
<h1><a name="section_10">Dependency Checking</a></h1>
<h1><a name="section_11">Dependency Checking</a></h1>
<ul>
<li>Flush cached query plans when the dependent objects change
</li><li>Track dependencies in function bodies and recompile/invalidate
</li></ul>
<h1><a name="section_11">Exotic Features</a></h1>
<h1><a name="section_12">Exotic Features</a></h1>
<ul>
<li>Add SQL99 WITH clause to SELECT
......@@ -658,9 +607,81 @@ first.
</li><li>Allow nested schemas
</li></ul>
</li></ul>
<h2><a name="section_11_1">PERFORMANCE</a></h2>
<h1><a name="section_13">Indexes</a></h1>
<h1><a name="section_12">Fsync</a></h1>
<ul>
<li>Allow inherited tables to inherit index, UNIQUE constraint, and primary
key, foreign key
</li><li>UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
(dup) should fail
<p> The main difficulty with this item is the problem of creating an index
that can span more than one table.
</p>
</li><li>Add UNIQUE capability to non-btree indexes
</li><li>Add more GIST index support for geometric data types
</li><li>-<em>Use indexes for MIN() and MAX()</em>
<p> MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
BY col {DESC} LIMIT 1. Completing this item involves doing this
transformation automatically.
</p>
</li><li>-<em>Use index to restrict rows returned by multi-key index when used with</em>
non-consecutive keys to reduce heap accesses
<p> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
col3 = 9, spin though the index checking for col1 and col3 matches,
rather than just col1; also called skip-scanning.
</p>
</li><li>Prevent index uniqueness checks when UPDATE does not modify the column
<p> Uniqueness (index) checks are done when updating a column even if the
column is not modified by the UPDATE.
</p>
</li><li>Fetch heap pages matching index entries in sequential order
<p> Rather than randomly accessing heap pages based on index entries, mark
heap pages needing access in a bitmap and do the lookups in sequential
order. Another method would be to sort heap ctids matching the index
before accessing the heap rows.
</p>
</li><li>-<em>Allow non-bitmap indexes to be combined by creating bitmaps in memory</em>
<p> This feature allows separate indexes to be ANDed or ORed together. This
is particularly useful for data warehousing applications that need to
query the database in an many permutations. This feature scans an index
and creates an in-memory bitmap, and allows that bitmap to be combined
with other bitmap created in a similar way. The bitmap can either index
all TIDs, or be lossy, meaning it records just page numbers and each
page tuple has to be checked for validity in a separate pass.
</p>
</li><li>Allow the creation of on-disk bitmap indexes which can be quickly
combined with other bitmap indexes
<p> Such indexes could be more compact if there are only a few distinct values.
Such indexes can also be compressed. Keeping such indexes updated can be
costly.
</p>
</li><li>Allow use of indexes to search for NULLs
<p> One solution is to create a partial index on an IS NULL expression.
</p>
</li><li>-<em>Add concurrency to GIST</em>
</li><li>Allow accurate statistics to be collected on indexes with more than
one column or expression indexes, perhaps using per-index statistics
</li><li>Add fillfactor to control reserved free space during index creation
</li><li>Allow the creation of indexes with mixed ascending/descending specifiers
</li><li>-<em>Fix incorrect rtree results due to wrong assumptions about "over"</em>
operator semantics
</li><li>Allow GIST indexes to create certain complex index types, like digital
trees (see Aoki)
</li><li>Hash
<ul>
<li>Pack hash index buckets onto disk pages more efficiently
<p> Currently no only one hash bucket can be stored on a page. Ideally
several hash buckets could be stored on a single page and greater
granularity used for the hash algorithm.
</p>
</li><li>Consider sorting hash buckets so entries can be found using a
binary search, rather than a linear scan
</li><li>In hash indexes, consider storing the hash value with or instead
of the key itself
</li></ul>
</li></ul>
<h1><a name="section_14">Fsync</a></h1>
<ul>
<li>Improve commit_delay handling to reduce fsync()
......@@ -669,7 +690,7 @@ first.
</li><li>Add an option to sync() before fsync()'ing checkpoint files
</li><li>Add program to test if fsync has a delay compared to non-fsync
</li></ul>
<h1><a name="section_13">Cache</a></h1>
<h1><a name="section_15">Cache Usage</a></h1>
<ul>
<li>Allow free-behind capability for large sequential scans, perhaps using
......@@ -725,7 +746,7 @@ first.
at the start of the table.
</p>
</li></ul>
<h1><a name="section_14">Vacuum</a></h1>
<h1><a name="section_16">Vacuum</a></h1>
<ul>
<li>Improve speed with indexes
......@@ -758,7 +779,7 @@ first.
</li><li>Do VACUUM FULL if table is nearly empty?
</li></ul>
</li></ul>
<h1><a name="section_15">Locking</a></h1>
<h1><a name="section_17">Locking</a></h1>
<ul>
<li>Make locking of shared data structures more fine-grained
......@@ -780,7 +801,7 @@ first.
</li><li>Research use of sched_yield() for spinlock acquisition failure
</li><li>Fix priority ordering of read and write light-weight locks (Neil)
</li></ul>
<h1><a name="section_16">Startup Time</a></h1>
<h1><a name="section_18">Startup Time Improvements</a></h1>
<ul>
<li>Experiment with multi-threaded backend [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?thread">thread</a>]
......@@ -796,7 +817,7 @@ first.
existing backends is one of the difficulties with a backend approach.
</p>
</li></ul>
<h1><a name="section_17">Write-Ahead Log</a></h1>
<h1><a name="section_19">Write-Ahead Log</a></h1>
<ul>
<li>Eliminate need to write full pages to WAL before page modification [<a href="http://momjian.postgresql.org/cgi-bin/pgtodo?wal">wal</a>]
......@@ -806,7 +827,7 @@ first.
eliminated from point-in-time archive files.
</p>
</li><li>Reduce WAL traffic so only modified values are written rather than
entire rows (?)
entire rows?
</li><li>Turn off full page writes if fsync is disabled
<p> If fsync is off, there is no purpose in writing full pages to WAL
</p>
......@@ -831,7 +852,7 @@ first.
</li><li>Eliminate WAL logging for CREATE TABLE AS when not doing WAL archiving
</li><li>-<em>Change WAL to use 32-bit CRC, for performance reasons</em>
</li></ul>
<h1><a name="section_18">Optimizer / Executor</a></h1>
<h1><a name="section_20">Optimizer / Executor</a></h1>
<ul>
<li>Add missing optimizer selectivities for date, r-tree, etc
......@@ -856,16 +877,16 @@ first.
</p>
</li><li>ANALYZE should record a pg_statistic entry for an all-NULL column
</li><li>Log queries where the optimizer row estimates were dramatically
different from the number of rows actually found (?)
different from the number of rows actually found?
</li></ul>
<h1><a name="section_19">Miscellaneous</a></h1>
<h1><a name="section_21">Miscellaneous Performance</a></h1>
<ul>
<li>Do async I/O for faster random read-ahead of data
<p> Async I/O allows multiple I/O requests to be sent to the disk with
results coming back asynchronously.
</p>
</li><li>Use mmap() rather than SYSV shared memory or to write WAL files (?)
</li><li>Use mmap() rather than SYSV shared memory or to write WAL files?
<p> This would remove the requirement for SYSV SHM but would introduce
portability issues. Anonymous mmap (or mmap to /dev/zero) is required
to prevent I/O overhead.
......@@ -882,7 +903,7 @@ first.
</li><li>Use a phantom command counter for nested subtransactions to reduce
per-tuple overhead
</li></ul>
<h1><a name="section_20">Source Code</a></h1>
<h1><a name="section_22">Source Code</a></h1>
<ul>
<li>Add use of 'const' for variables in source tree
......@@ -948,7 +969,7 @@ first.
</li></ul>
<hr/>
<h3><a name="section_20_1_1">Developers who have claimed items are:</a></h3>
<h2><a name="section_22_1">Developers who have claimed items are:</a></h2>
<ul>
<li>Alvaro is Alvaro Herrera &lt;<a href="mailto:alvherre@dcc.uchile.cl">alvherre@dcc.uchile.cl</a>&gt;
</li><li>Andrew is Andrew Dunstan &lt;<a href="mailto:andrew@dunslane.net">andrew@dunslane.net</a>&gt;
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment