FAQ 36.6 KB
Newer Older
Bruce Momjian's avatar
Bruce Momjian committed
1

Bruce Momjian's avatar
Bruce Momjian committed
2
                Frequently Asked Questions (FAQ) for PostgreSQL
Bruce Momjian's avatar
Bruce Momjian committed
3
                                       
Bruce Momjian's avatar
Bruce Momjian committed
4
   Last updated: Wed Jan 4 22:55:13 EST 2006
Bruce Momjian's avatar
Bruce Momjian committed
5
   
Bruce Momjian's avatar
Bruce Momjian committed
6
   Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
Bruce Momjian's avatar
Bruce Momjian committed
7
   
Bruce Momjian's avatar
Bruce Momjian committed
8
   The most recent version of this document can be viewed at
9
   http://www.postgresql.org/files/documentation/faqs/FAQ.html.
Bruce Momjian's avatar
Bruce Momjian committed
10
   
Bruce Momjian's avatar
Bruce Momjian committed
11
   Platform-specific questions are answered at
12
   http://www.postgresql.org/docs/faq/.
Bruce Momjian's avatar
Bruce Momjian committed
13 14
     _________________________________________________________________
   
15
                             General Questions
Bruce Momjian's avatar
Bruce Momjian committed
16
                                      
Bruce Momjian's avatar
Bruce Momjian committed
17
   1.1) What is PostgreSQL? How is it pronounced?
18 19 20 21 22 23 24 25 26 27 28 29
   1.2) Who controls PostgreSQL?
   1.3) What is the copyright of PostgreSQL?
   1.4) What platforms does PostgreSQL support?
   1.5) Where can I get PostgreSQL?
   1.6) What is the latest release?
   1.7) Where can I get support?
   1.8) How do I submit a bug report?
   1.9) How do I find out about known bugs or missing features?
   1.10) What documentation is available?
   1.11) How can I learn SQL?
   1.12) How do I join the development team?
   1.13) How does PostgreSQL compare to other DBMSs?
30 31
   
                           User Client Questions
Bruce Momjian's avatar
Bruce Momjian committed
32
                                      
33
   2.1) What interfaces are available for PostgreSQL?
Bruce Momjian's avatar
Bruce Momjian committed
34
   2.2) What tools are available for using PostgreSQL with Web pages?
35
   2.3) Does PostgreSQL have a graphical user interface?
36 37 38
   
                          Administrative Questions
                                      
Bruce Momjian's avatar
Bruce Momjian committed
39
   3.1) How do I install PostgreSQL somewhere other than
Bruce Momjian's avatar
Bruce Momjian committed
40
   /usr/local/pgsql?
41 42 43 44
   3.2) How do I control connections from other hosts?
   3.3) How do I tune the database engine for better performance?
   3.4) What debugging features are available?
   3.5) Why do I get "Sorry, too many clients" when trying to connect?
45
   3.6) Why do I need to do a dump and restore to upgrade PostgreSQL
46
   releases?
47
   3.7) What computer hardware should I use?
48 49
   
                           Operational Questions
Bruce Momjian's avatar
Bruce Momjian committed
50
                                      
51 52
   4.1) How do I SELECT only the first few rows of a query? A random row?
   4.2) How do I find out what tables, indexes, databases, and users are
53
   defined? How do I see the queries used by psql to display them?
54
   4.3) How do you change a column's data type?
55 56
   4.4) What is the maximum size for a row, a table, and a database?
   4.5) How much database disk space is required to store data from a
Bruce Momjian's avatar
Bruce Momjian committed
57
   typical text file?
58
   4.6) Why are my queries slow? Why don't they use my indexes?
59
   4.7) How do I see how the query optimizer is evaluating my query?
60 61 62
   4.8) How do I perform regular expression searches and case-insensitive
   regular expression searches? How do I use an index for
   case-insensitive searches?
63 64
   4.9) In a query, how do I detect if a field is NULL? How can I sort on
   whether a field is NULL or not?
65
   4.10) What is the difference between the various character types?
Bruce Momjian's avatar
Bruce Momjian committed
66 67 68 69
   4.11.1) How do I create a serial/auto-incrementing field?
   4.11.2) How do I get the value of a SERIAL insert?
   4.11.3) Doesn't currval() lead to a race condition with other users?
   4.11.4) Why aren't my sequence numbers reused on transaction abort?
Bruce Momjian's avatar
Bruce Momjian committed
70
   Why are there gaps in the numbering of my sequence/SERIAL column?
71
   4.12) What is an OID? What is a CTID?
72
   4.13) Why do I get the error "ERROR: Memory exhausted in
Bruce Momjian's avatar
Bruce Momjian committed
73
   AllocSetAlloc()"?
74
   4.14) How do I tell what PostgreSQL version I am running?
75 76 77 78 79
   4.15) How do I create a column that will default to the current time?
   4.16) How do I perform an outer join?
   4.17) How do I perform queries using multiple databases?
   4.18) How do I return multiple rows or columns from a function?
   4.19) Why do I get "relation with OID ##### does not exist" errors
80
   when accessing temporary tables in PL/PgSQL functions?
81
   4.20) What replication solutions are available?
Bruce Momjian's avatar
Bruce Momjian committed
82
   4.21) Why are my table and column names not recognized in my query?
Bruce Momjian's avatar
Bruce Momjian committed
83 84
     _________________________________________________________________
   
85 86
                             General Questions
                                      
87 88
  1.1) What is PostgreSQL? How is it pronounced?
  
Bruce Momjian's avatar
Bruce Momjian committed
89 90 91
   PostgreSQL is pronounced Post-Gres-Q-L, and is also sometimes referred
   to as just Postgres. An audio file is available in MP3 format for
   those would like to hear the pronunciation.
92 93 94 95 96 97 98 99 100
   
   PostgreSQL is an object-relational database system that has the
   features of traditional commercial database systems with enhancements
   to be found in next-generation DBMS systems. PostgreSQL is free and
   the complete source code is available.
   
   PostgreSQL development is performed by a team of mostly volunteer
   developers spread throughout the world and communicating via the
   Internet. It is a community project and is not controlled by any
Bruce Momjian's avatar
Bruce Momjian committed
101
   company. To get involved, see the developer's FAQ at
102
   http://www.postgresql.org/files/documentation/faqs/FAQ_DEV.html
Bruce Momjian's avatar
Bruce Momjian committed
103
   
104 105 106 107 108 109 110 111 112 113 114
  1.2) Who controls PostgreSQL?
  
   If you are looking for a PostgreSQL gatekeeper, central committee, or
   controlling company, give up --- there isn't one. We do have a core
   committee and CVS committers, but these groups are more for
   administrative purposes than control. The project is directed by the
   community of developers and users, which anyone can join. All you need
   to do is subscribe to the mailing lists and participate in the
   discussions. (See the Developer's FAQ for information on how to get
   involved in PostgreSQL development.)
   
115
  1.3) What is the copyright of PostgreSQL?
116
  
Bruce Momjian's avatar
Bruce Momjian committed
117 118 119 120 121 122
   PostgreSQL is distributed under the classic BSD license. Basically, it
   allows users to do anything they want with the code, including
   reselling binaries without the source code. The only restriction is
   that you not hold us legally liable for problems with the software.
   There is also the requirement that this copyright appear in all copies
   of the software. Here is the actual BSD license we use:
Bruce Momjian's avatar
Bruce Momjian committed
123 124 125
   
   PostgreSQL Data Base Management System
   
Bruce Momjian's avatar
Bruce Momjian committed
126
   Portions Copyright (c) 1996-2005, PostgreSQL Global Development Group
127 128
   Portions Copyright (c) 1994-1996 Regents of the University of
   California
Bruce Momjian's avatar
Bruce Momjian committed
129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
   
   Permission to use, copy, modify, and distribute this software and its
   documentation for any purpose, without fee, and without a written
   agreement is hereby granted, provided that the above copyright notice
   and this paragraph and the following two paragraphs appear in all
   copies.
   
   IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY
   FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
   INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND
   ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN
   ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   
   THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,
   INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
   MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE
   PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF
   CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT,
   UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
   
149
  1.4) What platforms does PostgreSQL support?
150
  
Bruce Momjian's avatar
Bruce Momjian committed
151
   In general, any modern Unix-compatible platform should be able to run
Bruce Momjian's avatar
Bruce Momjian committed
152 153 154
   PostgreSQL. The platforms that had received explicit testing at the
   time of release are listed in the installation instructions.
   
155 156 157 158 159
   PostgreSQL also runs natively on Microsoft Windows NT-based operating
   systems like Win2000, WinXP, and Win2003. A prepackaged installer is
   available at http://pgfoundry.org/projects/pginstaller. MSDOS-based
   versions of Windows (Win95, Win98, WinMe) can run PostgreSQL using
   Cygwin.
160
   
Bruce Momjian's avatar
Bruce Momjian committed
161 162
   There is also a Novell Netware 6 port at http://forge.novell.com, and
   an OS/2 (eComStation) version at
Bruce Momjian's avatar
Bruce Momjian committed
163 164
   http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=postgre
   SQL&stype=all&sort=type&dir=%2F.
Bruce Momjian's avatar
Bruce Momjian committed
165
   
166
  1.5) Where can I get PostgreSQL?
167
  
168 169
   Via web browser, use http://www.postgresql.org/ftp/, and via ftp, use
   ftp://ftp.PostgreSQL.org/pub/.
170
   
171 172
  1.6) What is the latest release?
  
Bruce Momjian's avatar
Bruce Momjian committed
173
   The latest release of PostgreSQL is version 8.1.2.
174 175 176 177
   
   We plan to have a major release every year, with minor releases every
   few months.
   
178
  1.7) Where can I get support?
179
  
180 181 182 183
   The PostgreSQL community provides assistance to many of its users via
   email. The main web site to subscribe to the email lists is
   http://www.postgresql.org/community/lists/. The general or bugs lists
   are a good place to start.
Bruce Momjian's avatar
Bruce Momjian committed
184
   
Bruce Momjian's avatar
Bruce Momjian committed
185
   The major IRC channel is #postgresql on Freenode (irc.freenode.net).
186
   To connect you can use the Unix program irc -c '#postgresql' "$USER"
187 188 189
   irc.freenode.net or use any other IRC clients. A Spanish one also
   exists on the same network, (#postgresql-es), and a French one,
   (#postgresqlfr). There is also a PostgreSQL channel on EFNet.
Bruce Momjian's avatar
Bruce Momjian committed
190
   
Bruce Momjian's avatar
Bruce Momjian committed
191
   A list of commercial support companies is available at
192
   http://techdocs.postgresql.org/companies.php.
193
   
194
  1.8) How do I submit a bug report?
195
  
Bruce Momjian's avatar
Bruce Momjian committed
196
   Visit the PostgreSQL bug form at
Bruce Momjian's avatar
Bruce Momjian committed
197 198 199
   http://www.postgresql.org/support/submitbug. Also check out our ftp
   site ftp://ftp.PostgreSQL.org/pub/ to see if there is a more recent
   PostgreSQL version.
Bruce Momjian's avatar
Bruce Momjian committed
200
   
201 202 203
   Bugs submitted using the bug form or posted to any PostgreSQL mailing
   list typically generates one of the following replies:
     * It is not a bug, and why
Bruce Momjian's avatar
Bruce Momjian committed
204
     * It is a known bug and is already on the TODO list
205 206 207 208 209 210 211 212 213 214
     * The bug has been fixed in the current release
     * The bug has been fixed but is not packaged yet in an official
       release
     * A request is made for more detailed information:
          + Operating system
          + PostgreSQL version
          + Reproducible test case
          + Debugging information
          + Debugger backtrace output
     * The bug is new. The following might happen:
215 216
          + A patch is created and will be included in the next major or
            minor release
217 218 219 220
          + The bug cannot be fixed immediately and is added to the TODO
            list
       
  1.9) How do I find out about known bugs or missing features?
221
  
222 223
   PostgreSQL supports an extended subset of SQL:2003. See our TODO list
   for known bugs, missing features, and future plans.
Bruce Momjian's avatar
Bruce Momjian committed
224
   
225 226 227 228 229 230 231 232 233 234 235 236 237 238
   A feature request usually results in one of the following replies:
     * The feature is already on the TODO list
     * The feature is not desired because:
          + It duplicates existing functionality that already follows the
            SQL standard
          + The feature would increase code complexity but add little
            benefit
          + The feature would be insecure or unreliable
     * The new feature is added to the TODO list
       
   PostgreSQL does not use a bug tracking system because we find it more
   efficient to respond directly to email and keep the TODO list
   up-to-date. In practice, bugs don't last very long in the software,
   and bugs that affect a large number of users are fixed rapidly. The
239 240 241
   only place to find all changes, improvements, and fixes in a
   PostgreSQL release is to read the CVS log messages. Even the release
   notes do not list every change made to the software.
Bruce Momjian's avatar
Bruce Momjian committed
242
   
243
  1.10) What documentation is available?
244
  
245 246 247
   PostgreSQL includes extensive documentation, including a large manual,
   manual pages, and some test examples. See the /doc directory. You can
   also browse the manuals online at http://www.PostgreSQL.org/docs.
Bruce Momjian's avatar
Bruce Momjian committed
248
   
Bruce Momjian's avatar
Bruce Momjian committed
249
   There are two PostgreSQL books available online at
Bruce Momjian's avatar
Bruce Momjian committed
250
   http://www.postgresql.org/docs/books/awbook.html and
Bruce Momjian's avatar
Bruce Momjian committed
251 252 253
   http://www.commandprompt.com/ppbook/. There are a number of PostgreSQL
   books available for purchase. One of the most popular ones is by Korry
   Douglas. A list of book reviews can be found at
Bruce Momjian's avatar
Bruce Momjian committed
254 255
   http://techdocs.PostgreSQL.org/techdocs/bookreviews.php. There is also
   a collection of PostgreSQL technical articles at
256
   http://techdocs.PostgreSQL.org/.
Bruce Momjian's avatar
Bruce Momjian committed
257
   
Bruce Momjian's avatar
Bruce Momjian committed
258 259 260
   The command line client program psql has some \d commands to show
   information about types, operators, functions, aggregates, etc. - use
   \? to display the available commands.
Bruce Momjian's avatar
Bruce Momjian committed
261
   
Bruce Momjian's avatar
Bruce Momjian committed
262
   Our web site contains even more documentation.
263
   
264
  1.11) How can I learn SQL?
265
  
266 267
   First, consider the PostgreSQL-specific books mentioned above. Another
   one is "Teach Yourself SQL in 21 Days, Second Edition" at
Bruce Momjian's avatar
Bruce Momjian committed
268 269 270 271 272 273
   http://members.tripod.com/er4ebus/sql/index.htm. Many of our users
   like The Practical SQL Handbook, Bowman, Judith S., et al.,
   Addison-Wesley. Others like The Complete Reference SQL, Groff et al.,
   McGraw-Hill.
   
   There is also a nice tutorial at
274 275 276
   http://www.intermedia.net/support/sql/sqltut.shtm, at
   http://ourworld.compuserve.com/homepages/graeme_birchall/HTM_COOK.HTM,
   and at http://sqlcourse.com.
277
   
278
  1.12) How do I join the development team?
279
  
280
   See the Developer's FAQ.
Bruce Momjian's avatar
Bruce Momjian committed
281
   
282
  1.13) How does PostgreSQL compare to other DBMSs?
283
  
284 285 286 287
   There are several ways of measuring software: features, performance,
   reliability, support, and price.
   
   Features
Bruce Momjian's avatar
Bruce Momjian committed
288 289 290 291
          PostgreSQL has most features present in large commercial DBMSs,
          like transactions, subselects, triggers, views, foreign key
          referential integrity, and sophisticated locking. We have some
          features they do not have, like user-defined types,
Bruce Momjian's avatar
Bruce Momjian committed
292
          inheritance, rules, and multi-version concurrency control to
Bruce Momjian's avatar
Bruce Momjian committed
293
          reduce lock contention.
294 295
          
   Performance
Bruce Momjian's avatar
Bruce Momjian committed
296 297
          PostgreSQL's performance is comparable to other commercial and
          open source databases. It is faster for some things, slower for
298 299
          others. Our performance is usually +/-10% compared to other
          databases.
300 301 302 303 304 305 306 307 308 309
          
   Reliability
          We realize that a DBMS must be reliable, or it is worthless. We
          strive to release well-tested, stable code that has a minimum
          of bugs. Each release has at least one month of beta testing,
          and our release history shows that we can provide stable, solid
          releases that are ready for production use. We believe we
          compare favorably to other database software in this area.
          
   Support
310 311 312 313 314 315 316
          Our mailing lists provide contact with a large group of
          developers and users to help resolve any problems encountered.
          While we cannot guarantee a fix, commercial DBMSs do not always
          supply a fix either. Direct access to developers, the user
          community, manuals, and the source code often make PostgreSQL
          support superior to other DBMSs. There is commercial
          per-incident support available for those who need it. (See FAQ
317
          section 1.7.)
318 319 320 321 322
          
   Price
          We are free for all use, both commercial and non-commercial.
          You can add our code to your product with no limitations,
          except those outlined in our BSD-style license stated above.
Bruce Momjian's avatar
Bruce Momjian committed
323
     _________________________________________________________________
324
   
325 326
                           User Client Questions
                                      
327 328
  2.1) What interfaces are available for PostgreSQL?
  
329 330 331 332
   The PostgreSQL install includes only the C and embedded C interfaces.
   All other interfaces are independent projects that are downloaded
   separately; being separate allows them to have their own release
   schedule and development teams.
Bruce Momjian's avatar
Bruce Momjian committed
333
   
334 335 336 337
   Some programming languages like PHP include an interface to
   PostgreSQL. Interfaces for languages like Perl, TCL, Python, and many
   others are available at http://gborg.postgresql.org in the
   Drivers/Interfaces section and via Internet search.
Bruce Momjian's avatar
Bruce Momjian committed
338
   
339 340
  2.2) What tools are available for using PostgreSQL with Web pages?
  
Bruce Momjian's avatar
Bruce Momjian committed
341
   A nice introduction to Database-backed Web pages can be seen at:
Bruce Momjian's avatar
Bruce Momjian committed
342
   http://www.webreview.com
Bruce Momjian's avatar
Bruce Momjian committed
343
   
Bruce Momjian's avatar
Bruce Momjian committed
344 345
   For Web integration, PHP (http://www.php.net) is an excellent
   interface.
Bruce Momjian's avatar
Bruce Momjian committed
346
   
Bruce Momjian's avatar
Bruce Momjian committed
347 348
   For complex cases, many use the Perl and DBD::Pg with CGI.pm or
   mod_perl.
Bruce Momjian's avatar
Bruce Momjian committed
349
   
350 351
  2.3) Does PostgreSQL have a graphical user interface?
  
352
   Yes, see http://techdocs.postgresql.org/guides/GUITools for a detailed
353
   list.
Bruce Momjian's avatar
Bruce Momjian committed
354 355
     _________________________________________________________________
   
356 357
                          Administrative Questions
                                      
358 359
  3.1) How do I install PostgreSQL somewhere other than /usr/local/pgsql?
  
Bruce Momjian's avatar
Bruce Momjian committed
360
   Specify the --prefix option when running configure.
Bruce Momjian's avatar
Bruce Momjian committed
361
   
362 363
  3.2) How do I control connections from other hosts?
  
Bruce Momjian's avatar
Bruce Momjian committed
364
   By default, PostgreSQL only allows connections from the local machine
Bruce Momjian's avatar
Bruce Momjian committed
365 366
   using Unix domain sockets or TCP/IP connections. Other machines will
   not be able to connect unless you modify listen_addresses in the
367 368
   postgresql.conf file, enable host-based authentication by modifying
   the $PGDATA/pg_hba.conf file, and restart the server.
Bruce Momjian's avatar
Bruce Momjian committed
369
   
370 371
  3.3) How do I tune the database engine for better performance?
  
372
   There are three major areas for potential performance improvement:
Bruce Momjian's avatar
Bruce Momjian committed
373
   
374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399
   Query Changes
          This involves modifying queries to obtain better performance:
          
          + Creation of indexes, including expression and partial indexes
          + Use of COPY instead of multiple INSERTs
          + Grouping of multiple statements into a single transaction to
            reduce commit overhead
          + Use of CLUSTER when retrieving many rows from an index
          + Use of LIMIT for returning a subset of a query's output
          + Use of Prepared queries
          + Use of ANALYZE to maintain accurate optimizer statistics
          + Regular use of VACUUM or pg_autovacuum
          + Dropping of indexes during large data changes
            
   Server Configuration
          A number of postgresql.conf settings affect performance. For
          more details, see Administration Guide/Server Run-time
          Environment/Run-time Configuration for a full listing, and for
          commentary see
          http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_co
          nf_e.html and
          http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html.
          
   Hardware Selection
          The effect of hardware on performance is detailed in
          http://candle.pha.pa.us/main/writings/pgsql/hw_performance/inde
Bruce Momjian's avatar
Bruce Momjian committed
400
          x.html and http://www.powerpostgresql.com/PerfList/.
401
          
402 403
  3.4) What debugging features are available?
  
Bruce Momjian's avatar
Bruce Momjian committed
404 405 406 407
   There are many log_* server configuration variables that enable
   printing of query and process statistics which can be very useful for
   debugging and performance measurements.
   
408 409
  3.5) Why do I get "Sorry, too many clients" when trying to connect?
  
Bruce Momjian's avatar
Bruce Momjian committed
410 411 412 413
   You have reached the default limit is 100 database sessions. You need
   to increase the postmaster's limit on how many concurrent backend
   processes it can start by changing the max_connections value in
   postgresql.conf and restarting the postmaster.
Bruce Momjian's avatar
Bruce Momjian committed
414
   
415 416 417
  3.6) Why do I need to do a dump and restore to upgrade between major
  PostgreSQL releases?
  
418
   The PostgreSQL team makes only small changes between minor releases,
419
   so upgrading from 7.4.0 to 7.4.1 does not require a dump and restore.
Bruce Momjian's avatar
Bruce Momjian committed
420
   However, major releases (e.g. from 7.3 to 7.4) often change the
421
   internal format of system tables and data files. These changes are
Bruce Momjian's avatar
Bruce Momjian committed
422
   often complex, so we don't maintain backward compatibility for data
423 424
   files. A dump outputs data in a generic format that can then be loaded
   in using the new internal format.
425
   
426 427
  3.7) What computer hardware should I use?
  
428 429 430 431
   Because PC hardware is mostly compatible, people tend to believe that
   all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and
   quality motherboards are more reliable and have better performance
   than less expensive hardware. PostgreSQL will run on almost any
Bruce Momjian's avatar
Bruce Momjian committed
432 433 434
   hardware, but if reliability and performance are important it is wise
   to research your hardware options thoroughly. Our email lists can be
   used to discuss hardware options and tradeoffs.
435
     _________________________________________________________________
Bruce Momjian's avatar
Bruce Momjian committed
436
   
437 438
                           Operational Questions
                                      
439 440
  4.1) How do I SELECT only the first few rows of a query? A random row?
  
Bruce Momjian's avatar
Bruce Momjian committed
441 442 443 444
   To retrieve only a few rows, if you know at the number of rows needed
   at the time of the SELECT use LIMIT . If an index matches the ORDER BY
   it is possible the entire query does not have to be executed. If you
   don't know the number of rows at SELECT time, use a cursor and FETCH.
Bruce Momjian's avatar
Bruce Momjian committed
445
   
446 447 448 449 450 451
   To SELECT a random row, use:
    SELECT col
    FROM tab
    ORDER BY random()
    LIMIT 1;

452 453 454
  4.2) How do I find out what tables, indexes, databases, and users are
  defined? How do I see the queries used by psql to display them?
  
Bruce Momjian's avatar
Bruce Momjian committed
455 456 457 458 459 460
   Use the \dt command to see tables in psql. For a complete list of
   commands inside psql you can use \?. Alternatively you can read the
   source code for psql in file pgsql/src/bin/psql/describe.c, it
   contains SQL commands that generate the output for psql's backslash
   commands. You can also start psql with the -E option so it will print
   out the queries it uses to execute the commands you give. PostgreSQL
461
   also provides an SQL compliant INFORMATION SCHEMA interface you can
Bruce Momjian's avatar
Bruce Momjian committed
462
   query to get information about the database.
Bruce Momjian's avatar
Bruce Momjian committed
463
   
464
   There are also system tables beginning with pg_ that describe these
465 466 467
   too.
   
   Use psql -l will list all databases.
468 469 470 471 472
   
   Also try the file pgsql/src/tutorial/syscat.source. It illustrates
   many of the SELECTs needed to get information from the database system
   tables.
   
473 474
  4.3) How do you change a column's data type?
  
475 476 477 478
   Changing the data type of a column can be done easily in 8.0 and later
   with ALTER TABLE ALTER COLUMN TYPE.
   
   In earlier releases, do this:
479 480 481
    BEGIN;
    ALTER TABLE tab ADD COLUMN new_col new_data_type;
    UPDATE tab SET new_col = CAST(old_col AS new_data_type);
Bruce Momjian's avatar
Bruce Momjian committed
482
    ALTER TABLE tab DROP COLUMN old_col;
483 484 485 486 487
    COMMIT;

   You might then want to do VACUUM FULL tab to reclaim the disk space
   used by the expired rows.
   
488 489
  4.4) What is the maximum size for a row, a table, and a database?
  
Bruce Momjian's avatar
Bruce Momjian committed
490
   These are the limits:
491 492 493 494 495 496 497 498 499 500
   
   Maximum size for a database? unlimited (32 TB databases exist)
   Maximum size for a table? 32 TB
   Maximum size for a row? 1.6TB
   Maximum size for a field? 1 GB
   Maximum number of rows in a table? unlimited
   Maximum number of columns in a table? 250-1600 depending on column
   types
   Maximum number of indexes on a table? unlimited
   
Bruce Momjian's avatar
Bruce Momjian committed
501
   Of course, these are not actually unlimited, but limited to available
Bruce Momjian's avatar
Bruce Momjian committed
502 503
   disk space and memory/swap space. Performance may suffer when these
   values get unusually large.
Bruce Momjian's avatar
Bruce Momjian committed
504
   
505
   The maximum table size of 32 TB does not require large file support
Bruce Momjian's avatar
Bruce Momjian committed
506
   from the operating system. Large tables are stored as multiple 1 GB
Bruce Momjian's avatar
Bruce Momjian committed
507
   files so file system size limits are not important.
Bruce Momjian's avatar
Bruce Momjian committed
508
   
509 510
   The maximum table size and maximum number of columns can be quadrupled
   by increasing the default block size to 32k.
Bruce Momjian's avatar
Bruce Momjian committed
511
   
Bruce Momjian's avatar
Bruce Momjian committed
512 513
   One limitation is that indexes can not be created on columns longer
   than about 2,000 characters. Fortunately, such indexes are rarely
514 515 516
   needed. Uniqueness is best guaranteed by a funtion index of an MD5
   hash of the long column, and full text indexing allows for searching
   of words within the column.
Bruce Momjian's avatar
Bruce Momjian committed
517
   
518 519 520
  4.5) How much database disk space is required to store data from a typical
  text file?
  
Bruce Momjian's avatar
Bruce Momjian committed
521 522
   A PostgreSQL database may require up to five times the disk space to
   store data from a text file.
523
   
Bruce Momjian's avatar
Bruce Momjian committed
524 525
   As an example, consider a file of 100,000 lines with an integer and
   text description on each line. Suppose the text string avergages
Bruce Momjian's avatar
Bruce Momjian committed
526 527 528
   twenty bytes in length. The flat file would be 2.8 MB. The size of the
   PostgreSQL database file containing this data can be estimated as 6.4
   MB:
529
    28 bytes: each row header (approximate)
530
    24 bytes: one int field and one text field
531 532
   + 4 bytes: pointer on page to tuple
   ----------------------------------------
533
    56 bytes per row
534 535 536 537

   The data page size in PostgreSQL is 8192 bytes (8 KB), so:

   8192 bytes per page
538 539
   -------------------   =  146 rows per database page (rounded down)
     56 bytes per row
540

Bruce Momjian's avatar
Bruce Momjian committed
541
   100000 data rows
542 543
   --------------------  =  685 database pages (rounded up)
      146 rows per page
544

545
685 database pages * 8192 bytes per page  =  5,611,520 bytes (5.6 MB)
546

Bruce Momjian's avatar
Bruce Momjian committed
547
   Indexes do not require as much overhead, but do contain the data that
548 549
   is being indexed, so they can be large also.
   
550
   NULLs are stored as bitmaps, so they use very little space.
Bruce Momjian's avatar
Bruce Momjian committed
551
   
552 553
  4.6) Why are my queries slow? Why don't they use my indexes?
  
554 555 556 557 558
   Indexes are not used by every query. Indexes are used only if the
   table is larger than a minimum size, and the query selects only a
   small percentage of the rows in the table. This is because the random
   disk access caused by an index scan can be slower than a straight read
   through the table, or sequential scan.
Bruce Momjian's avatar
Bruce Momjian committed
559 560 561 562 563 564 565 566 567 568 569 570 571
   
   To determine if an index should be used, PostgreSQL must have
   statistics about the table. These statistics are collected using
   VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer
   knows how many rows are in the table, and can better determine if
   indexes should be used. Statistics are also valuable in determining
   optimal join order and join methods. Statistics collection should be
   performed periodically as the contents of the table change.
   
   Indexes are normally not used for ORDER BY or to perform joins. A
   sequential scan followed by an explicit sort is usually faster than an
   index scan of a large table.
   However, LIMIT combined with ORDER BY often will use an index because
572 573 574 575 576
   only a small portion of the table is returned. In fact, though MAX()
   and MIN() don't use indexes, it is possible to retrieve such values
   using an index with ORDER BY and LIMIT:
    SELECT col
    FROM tab
Bruce Momjian's avatar
Bruce Momjian committed
577
    ORDER BY col [ DESC ]
578
    LIMIT 1;
579

580
   If you believe the optimizer is incorrect in choosing a sequential
581 582
   scan, use SET enable_seqscan TO 'off' and run query again to see if an
   index scan is indeed faster.
583
   
Bruce Momjian's avatar
Bruce Momjian committed
584
   When using wild-card operators such as LIKE or ~, indexes can only be
585 586
   used in certain circumstances:
     * The beginning of the search string must be anchored to the start
Bruce Momjian's avatar
Bruce Momjian committed
587 588 589 590 591
       of the string, i.e.
          + LIKE patterns must not start with %.
          + ~ (regular expression) patterns must start with ^.
     * The search string can not start with a character class, e.g.
       [a-e].
Bruce Momjian's avatar
Bruce Momjian committed
592
     * Case-insensitive searches such as ILIKE and ~* do not utilize
593 594
       indexes. Instead, use expression indexes, which are described in
       section 4.8.
595
     * The default C locale must be used during initdb because it is not
596 597 598
       possible to know the next-greatest character in a non-C locale.
       You can create a special text_pattern_ops index for such cases
       that work only for LIKE indexing.
599
       
Bruce Momjian's avatar
Bruce Momjian committed
600
   In pre-8.0 releases, indexes often can not be used unless the data
601
   types exactly match the index's column types. This was particularly
Bruce Momjian's avatar
Bruce Momjian committed
602 603
   true of int2, int8, and numeric column indexes.
   
604 605
  4.7) How do I see how the query optimizer is evaluating my query?
  
Bruce Momjian's avatar
Bruce Momjian committed
606
   See the EXPLAIN manual page.
607
   
608 609 610 611
  4.8) How do I perform regular expression searches and case-insensitive
  regular expression searches? How do I use an index for case-insensitive
  searches?
  
Bruce Momjian's avatar
Bruce Momjian committed
612
   The ~ operator does regular expression matching, and ~* does
Bruce Momjian's avatar
Bruce Momjian committed
613
   case-insensitive regular expression matching. The case-insensitive
614
   variant of LIKE is called ILIKE.
Bruce Momjian's avatar
Bruce Momjian committed
615
   
Bruce Momjian's avatar
Bruce Momjian committed
616 617 618
   Case-insensitive equality comparisons are normally expressed as:
    SELECT *
    FROM tab
619
    WHERE lower(col) = 'abc';
Bruce Momjian's avatar
Bruce Momjian committed
620 621

   This will not use an standard index. However, if you create a
622
   expresssion index, it will be used:
623
    CREATE INDEX tabindex ON tab (lower(col));
Bruce Momjian's avatar
Bruce Momjian committed
624

Bruce Momjian's avatar
Bruce Momjian committed
625 626 627 628 629
   If the above index is created as UNIQUE, though the column can store
   upper and lowercase characters, it can not have identical values that
   differ only in case. To force a particular case to be stored in the
   column, use a CHECK constraint or a trigger.
   
630 631 632
  4.9) In a query, how do I detect if a field is NULL? How can I sort on
  whether a field is NULL or not?
  
633
   You test the column with IS NULL and IS NOT NULL, like this:
634 635 636 637
   SELECT *
   FROM tab
   WHERE col IS NULL;

Bruce Momjian's avatar
Bruce Momjian committed
638
   To sort by the NULL status, use the IS NULL and IS NOT NULL modifiers
639
   in your ORDER BY clause. Things that are true will sort higher than
Bruce Momjian's avatar
Bruce Momjian committed
640 641
   things that are false, so the following will put NULL entries at the
   top of the resulting list:
642 643 644 645
   SELECT *
   FROM tab
   ORDER BY (col IS NOT NULL)

646 647
  4.10) What is the difference between the various character types?
  
648 649 650 651 652 653
        Type    Internal Name                    Notes
     VARCHAR(n) varchar       size specifies maximum length, no padding
     CHAR(n)    bpchar        blank padded to the specified fixed length
     TEXT       text          no specific upper limit on length
     BYTEA      bytea         variable-length byte array (null-byte safe)
     "char"     char          one character
654
   
Bruce Momjian's avatar
Bruce Momjian committed
655 656
   You will see the internal name when examining system catalogs and in
   some error messages.
Bruce Momjian's avatar
Bruce Momjian committed
657
   
658
   The first four types above are "varlena" types (i.e., the first four
Bruce Momjian's avatar
Bruce Momjian committed
659
   bytes on disk are the length, followed by the data). Thus the actual
660 661 662
   space used is slightly greater than the declared size. However, long
   values are also subject to compression, so the space on disk might
   also be less than expected.
Bruce Momjian's avatar
Bruce Momjian committed
663
   VARCHAR(n) is best when storing variable-length strings and it limits
Bruce Momjian's avatar
Bruce Momjian committed
664
   how long a string can be. TEXT is for strings of unlimited length,
Bruce Momjian's avatar
Bruce Momjian committed
665
   with a maximum of one gigabyte.
Bruce Momjian's avatar
Bruce Momjian committed
666 667
   
   CHAR(n) is for storing strings that are all the same length. CHAR(n)
Bruce Momjian's avatar
Bruce Momjian committed
668 669
   pads with blanks to the specified length, while VARCHAR(n) only stores
   the characters supplied. BYTEA is for storing binary data,
670 671
   particularly values that include NULL bytes. All the types described
   here have similar performance characteristics.
Bruce Momjian's avatar
Bruce Momjian committed
672
   
673 674
  4.11.1) How do I create a serial/auto-incrementing field?
  
675 676
   PostgreSQL supports a SERIAL data type. It auto-creates a sequence.
   For example, this:
Bruce Momjian's avatar
Bruce Momjian committed
677 678 679 680
    CREATE TABLE person (
        id   SERIAL,
        name TEXT
    );
Bruce Momjian's avatar
Bruce Momjian committed
681

Bruce Momjian's avatar
Bruce Momjian committed
682
   is automatically translated into this:
Bruce Momjian's avatar
Bruce Momjian committed
683 684 685 686 687
    CREATE SEQUENCE person_id_seq;
    CREATE TABLE person (
        id   INT4 NOT NULL DEFAULT nextval('person_id_seq'),
        name TEXT
    );
Bruce Momjian's avatar
Bruce Momjian committed
688 689

   See the create_sequence manual page for more information about
690
   sequences.
Bruce Momjian's avatar
Bruce Momjian committed
691
   
692 693
  4.11.2) How do I get the value of a SERIAL insert?
  
Bruce Momjian's avatar
Bruce Momjian committed
694
   One approach is to retrieve the next SERIAL value from the sequence
Bruce Momjian's avatar
Bruce Momjian committed
695
   object with the nextval() function before inserting and then insert it
696
   explicitly. Using the example table in 4.11.1, an example in a
697 698 699
   pseudo-language would look like this:
    new_id = execute("SELECT nextval('person_id_seq')");
    execute("INSERT INTO person (id, name) VALUES (new_id, 'Blaise Pascal')");
Bruce Momjian's avatar
Bruce Momjian committed
700

Bruce Momjian's avatar
Bruce Momjian committed
701 702 703
   You would then also have the new value stored in new_id for use in
   other queries (e.g., as a foreign key to the person table). Note that
   the name of the automatically created SEQUENCE object will be named
704 705
   <table>_< serialcolumn>_seq, where table and serialcolumn are the
   names of your table and your SERIAL column, respectively.
Bruce Momjian's avatar
Bruce Momjian committed
706
   
Bruce Momjian's avatar
Bruce Momjian committed
707 708
   Alternatively, you could retrieve the assigned SERIAL value with the
   currval() function after it was inserted by default, e.g.,
709 710
    execute("INSERT INTO person (name) VALUES ('Blaise Pascal')");
    new_id = execute("SELECT currval('person_id_seq')");
Bruce Momjian's avatar
Bruce Momjian committed
711

712 713
  4.11.3) Doesn't currval() lead to a race condition with other users?
  
714 715
   No. currval() returns the current value assigned by your session, not
   by all sessions.
Bruce Momjian's avatar
Bruce Momjian committed
716
   
717 718 719
  4.11.4) Why aren't my sequence numbers reused on transaction abort? Why are
  there gaps in the numbering of my sequence/SERIAL column?
  
Bruce Momjian's avatar
Bruce Momjian committed
720
   To improve concurrency, sequence values are given out to running
Bruce Momjian's avatar
Bruce Momjian committed
721
   transactions as needed and are not locked until the transaction
Bruce Momjian's avatar
Bruce Momjian committed
722 723
   completes. This causes gaps in numbering from aborted transactions.
   
724
  4.12) What is an OID? What is a CTID?
725
  
Bruce Momjian's avatar
Bruce Momjian committed
726 727 728 729 730
   Every row that is created in PostgreSQL gets a unique OID unless
   created WITHOUT OIDS. OIDs are autotomatically assigned unique 4-byte
   integers that are unique across the entire installation. However, they
   overflow at 4 billion, and then the OIDs start being duplicated.
   PostgreSQL uses OIDs to link its internal system tables together.
Bruce Momjian's avatar
Bruce Momjian committed
731
   
Bruce Momjian's avatar
Bruce Momjian committed
732 733 734 735
   To uniquely number columns in user tables, it is best to use SERIAL
   rather than OIDs because SERIAL sequences are unique only within a
   single table. and are therefore less likely to overflow. SERIAL8 is
   available for storing eight-byte sequence values.
Bruce Momjian's avatar
Bruce Momjian committed
736
   
737 738 739
   CTIDs are used to identify specific physical rows with block and
   offset values. CTIDs change after rows are modified or reloaded. They
   are used by index entries to point to physical rows.
Bruce Momjian's avatar
Bruce Momjian committed
740
   
741 742
  4.13) Why do I get the error "ERROR: Memory exhausted in AllocSetAlloc()"?
  
743 744 745
   You probably have run out of virtual memory on your system, or your
   kernel has a low limit for certain resources. Try this before starting
   postmaster:
Bruce Momjian's avatar
Bruce Momjian committed
746 747 748 749 750 751 752 753 754 755
    ulimit -d 262144
    limit datasize 256m

   Depending on your shell, only one of these may succeed, but it will
   set your process data segment limit much higher and perhaps allow the
   query to complete. This command applies to the current process, and
   all subprocesses created after the command is run. If you are having a
   problem with the SQL client because the backend is returning too much
   data, try it before starting the client.
   
756 757
  4.14) How do I tell what PostgreSQL version I am running?
  
758
   From psql, type SELECT version();
Bruce Momjian's avatar
Bruce Momjian committed
759
   
760
  4.15) How do I create a column that will default to the current time?
761
  
Bruce Momjian's avatar
Bruce Momjian committed
762
   Use CURRENT_TIMESTAMP:
763
    CREATE TABLE test (x int, modtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP );
Bruce Momjian's avatar
Bruce Momjian committed
764

765
  4.16) How do I perform an outer join?
766
  
767 768
   PostgreSQL supports outer joins using the SQL standard syntax. Here
   are two examples:
Bruce Momjian's avatar
Bruce Momjian committed
769 770 771 772
    SELECT *
    FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col);

   or
Bruce Momjian's avatar
Bruce Momjian committed
773 774 775
    SELECT *
    FROM t1 LEFT OUTER JOIN t2 USING (col);

Bruce Momjian's avatar
Bruce Momjian committed
776 777 778 779 780 781
   These identical queries join t1.col to t2.col, and also return any
   unjoined rows in t1 (those with no match in t2). A RIGHT join would
   add unjoined rows of t2. A FULL join would return the matched rows
   plus all unjoined rows from t1 and t2. The word OUTER is optional and
   is assumed in LEFT, RIGHT, and FULL joins. Ordinary joins are called
   INNER joins.
Bruce Momjian's avatar
Bruce Momjian committed
782
   
783
  4.17) How do I perform queries using multiple databases?
784
  
785 786 787
   There is no way to query a database other than the current one.
   Because PostgreSQL loads database-specific system catalogs, it is
   uncertain how a cross-database query should even behave.
Bruce Momjian's avatar
Bruce Momjian committed
788
   
Bruce Momjian's avatar
Bruce Momjian committed
789
   contrib/dblink allows cross-database queries using function calls. Of
790
   course, a client can also make simultaneous connections to different
791
   databases and merge the results on the client side.
Bruce Momjian's avatar
Bruce Momjian committed
792
   
793
  4.18) How do I return multiple rows or columns from a function?
794
  
795
   It is easy using set-returning functions,
796 797
   http://techdocs.postgresql.org/guides/SetReturningFunctions
   .
Bruce Momjian's avatar
Bruce Momjian committed
798
   
799
  4.19) Why do I get "relation with OID ##### does not exist" errors when
800 801
  accessing temporary tables in PL/PgSQL functions?
  
802
   PL/PgSQL caches function scripts, and an unfortunate side effect is
803 804 805 806 807 808
   that if a PL/PgSQL function accesses a temporary table, and that table
   is later dropped and recreated, and the function called again, the
   function will fail because the cached function contents still point to
   the old temporary table. The solution is to use EXECUTE for temporary
   table access in PL/PgSQL. This will cause the query to be reparsed
   every time.
809
   
810
  4.20) What replication solutions are available?
Bruce Momjian's avatar
Bruce Momjian committed
811
  
Bruce Momjian's avatar
Bruce Momjian committed
812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827
   Though "replication" is a single term, there are several technologies
   for doing replication, with advantages and disadvantages for each.
   
   Master/slave replication allows a single master to receive read/write
   queries, while slaves can only accept read/SELECT queries. The most
   popular freely available master-slave PostgreSQL replication solution
   is Slony-I.
   
   Multi-master replication allows read/write queries to be sent to
   multiple replicated computers. This capability also has a severe
   impact on performance due to the need to synchronize changes between
   servers. Pgcluster is the most popular such solution freely available
   for PostgreSQL.
   
   There are also commercial and hardware-based replication solutions
   available supporting a variety of replication models.
Bruce Momjian's avatar
Bruce Momjian committed
828
   
Bruce Momjian's avatar
Bruce Momjian committed
829
  4.21) Why are my table and column names not recognized in my query?
Bruce Momjian's avatar
Bruce Momjian committed
830 831 832 833 834 835 836 837 838 839 840
  
   The most common cause is the use of double-quotes around table or
   column names during table creation. When double-quotes are used, table
   and column names (called identifiers) are stored case-sensitive,
   meaning you must use double-quotes when referencing the names in a
   query. Some interfaces, like pgAdmin, automatically double-quote
   identifiers during table creation. So, for identifiers to be
   recognized, you must either:
     * Avoid double-quoting identifiers when creating tables
     * Use only lowercase characters in identifiers
     * Double-quote identifiers when referencing them in queries