  1. 27 Jun, 2006 1 commit
  2. 22 Jun, 2006 1 commit
  3. 18 Jun, 2006 1 commit
    • Tom Lane authored
      Don't try to call posix_fadvise() unless <fcntl.h> supplies a declaration
      for it.  Hopefully will fix core dump evidenced by some buildfarm members
      since fadvise patch went in.  The actual definition of the function is not
      ABI-compatible with compiler's default assumption in the absence of any
      declaration, so it's clearly unsafe to try to call it without seeing a
      declaration.
      1e8ae136
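A minimal sketch of the kind of guard this message describes. The macro `HAVE_DECL_POSIX_FADVISE` and the wrapper `advise_dontneed()` are hypothetical names chosen for illustration, not the ones used in the actual patch. The point is that the call is only compiled when a declaration is known to be visible; without one, a C89 compiler assumes an `int`-returning function with default-promoted arguments, which can mismatch the real signature (for example where `off_t` is wider than `int`).

```c
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical configure result: 1 if <fcntl.h> declares posix_fadvise(). */
#ifndef HAVE_DECL_POSIX_FADVISE
#define HAVE_DECL_POSIX_FADVISE 0
#endif

/*
 * Hint that the given file range will not be needed again.
 * Returns 0 on success or an error number; when no declaration is
 * available the hint is skipped and 0 is returned.
 */
int advise_dontneed(int fd, off_t offset, off_t len)
{
#if HAVE_DECL_POSIX_FADVISE && defined(POSIX_FADV_DONTNEED)
    return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
#else
    /* No visible declaration: calling the function would be unsafe. */
    (void) fd;
    (void) offset;
    (void) len;
    return 0;
#endif
}
```

Since the hint is purely advisory, skipping it costs at most some cache efficiency.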
  4. 16 Jun, 2006 1 commit
  5. 15 Jun, 2006 1 commit
  6. 20 Apr, 2006 1 commit
    • Tom Lane authored
      Ensure that we validate the page header of the first page of a WAL file
      whenever we start to read within that file.  The first page carries
      extra identification information that really ought to be checked, but
      as the code stood, this was only checked when we switched sequentially
      into a new WAL file, or if by chance the starting checkpoint record was
      within the first page.  This patch ensures that we will detect bogus
      'long header' information before we start replaying the WAL sequence.
      eac825aa
  7. 17 Apr, 2006 1 commit
    • Tom Lane authored
      Fix the torn-page hazard for PITR base backups by forcing full page writes
      to occur between pg_start_backup() and pg_stop_backup(), even if the GUC
      setting full_page_writes is OFF.  Per discussion, doing this in combination
      with the already-existing checkpoint during pg_start_backup() should ensure
      safety against partial page updates being included in the backup.  We do
      not have to force full page writes to occur during normal PITR operation,
      as I had first feared.
      0a873949
  8. 14 Apr, 2006 1 commit
    • Tom Lane authored
      Make the world safe for full_page_writes. Allow XLOG records that try to
      update no-longer-existing pages to fall through as no-ops, but make a note
      of each page number referenced by such records.  If we don't see a later
      XLOG entry dropping the table or truncating away the page, complain at
      the end of XLOG replay.  Since this fixes the known failure mode for
      full_page_writes = off, revert my previous band-aid patch that disabled
      that GUC variable.
      defe9346
  9. 05 Apr, 2006 1 commit
  10. 04 Apr, 2006 1 commit
  11. 03 Apr, 2006 1 commit
    • Tom Lane authored
      Define a separately configurable XLOG_BLCKSZ symbol for the page size
      used within WAL files.  Historically this was the same as the data file
      BLCKSZ, but there's no necessary connection, and it's possible that
      performance gains might ensue from reducing XLOG_BLCKSZ.  In any case
      distinguishing two symbols should improve code clarity.  This commit
      does not actually change the page size, only provide the infrastructure
      to make it possible to do so.  initdb forced because of addition of a
      field to pg_control.
      Mark Wong, with some help from Simon Riggs and Tom Lane.
      eaef1113
  12. 31 Mar, 2006 1 commit
    • Tom Lane authored
      Clean up WAL/buffer interactions as per my recent proposal. Get rid of the
      misleadingly-named WriteBuffer routine, and instead require routines that
      change buffer pages to call MarkBufferDirty (which does exactly what it says).
      We also require that they do so before calling XLogInsert; this takes care of
      the synchronization requirement documented in SyncOneBuffer.  Note that
      because bufmgr takes the buffer content lock (in shared mode) while writing
      out any buffer, it doesn't matter whether MarkBufferDirty is executed before
      the buffer content change is complete, so long as the content change is
      completed before releasing exclusive lock on the buffer.  So it's OK to set
      the dirtybit before we fill in the LSN.
      This eliminates the former kluge of needing to set the dirtybit in LockBuffer.
      Aside from making the code more transparent, we can also add some new
      debugging assertions, in particular that the caller of MarkBufferDirty must
      hold the buffer content lock, not merely a pin.
      a8b8f4db
  13. 29 Mar, 2006 1 commit
    • Tom Lane authored
      Clean up and document the API for XLogOpenRelation and XLogReadBuffer.
      This commit doesn't make much functional change, but it does eliminate some
      duplicated code --- for instance, PageIsNew tests are now done inside
      XLogReadBuffer rather than by each caller.
      The GIST xlog code still needs a lot of love, but I'll worry about that
      separately.
      6d61cdec
  14. 28 Mar, 2006 1 commit
    • Tom Lane authored
      Disable full_page_writes, because turning it off risks causing crash-recovery
      failures even when the hardware and OS did nothing wrong.  Per recent analysis
      of a problem report from Alex Bahdushka.
      
      For the moment I've just diked out the test of the parameter, rather than
      removing the GUC infrastructure and documentation, in case we conclude that
      there's something salvageable there.  There seems no chance of it being
      resurrected in the 8.1 branch though.
      0a971e2f
  15. 24 Mar, 2006 1 commit
  16. 05 Mar, 2006 1 commit
  17. 11 Jan, 2006 1 commit
  18. 29 Dec, 2005 1 commit
    • Tom Lane authored
      Get rid of the SpinLockAcquire/SpinLockAcquire_NoHoldoff distinction
      in favor of having just one set of macros that don't do HOLD/RESUME_INTERRUPTS
      (hence, these correspond to the old SpinLockAcquire_NoHoldoff case).
      Given our coding rules for spinlock use, there is no reason to allow
      CHECK_FOR_INTERRUPTS to be done while holding a spinlock, and also there
      is no situation where ImmediateInterruptOK will be true while holding a
      spinlock.  Therefore doing HOLD/RESUME_INTERRUPTS while taking/releasing a
      spinlock is just a waste of cycles.  Qingqing Zhou and Tom Lane.
      195f1642
  19. 28 Dec, 2005 1 commit
  20. 22 Nov, 2005 1 commit
  21. 29 Oct, 2005 1 commit
  22. 22 Oct, 2005 1 commit
  23. 15 Oct, 2005 1 commit
  24. 03 Oct, 2005 1 commit
  25. 22 Aug, 2005 2 commits
    • Tom Lane authored
      Rewrite gather-write patch into something less obviously bolted on
      after the fact.  Fix bug with incorrect test for whether we are at end
      of logfile segment.  Arrange for writes triggered by XLogInsert's
      is-cache-more-than-half-full test to synchronize with the cache boundaries,
      so that in long transactions we tend to write alternating halves of the
      cache rather than randomly chosen portions of it; this saves one more
      write syscall per cache load.
      90525373
    • Tom Lane authored
      Fix some inconsistent choices of datatypes in xlog.c. Make buffer
      indexes all be int, rather than variously int, uint16 and uint32;
      add some casts where necessary to support large buffer arrays.
      d0096a41
  26. 20 Aug, 2005 1 commit
    • Tom Lane authored
      Convert the arithmetic for shared memory size calculation from 'int'
      to 'Size' (that is, size_t), and install overflow detection checks in it.
      This allows us to remove the former arbitrary restrictions on NBuffers
      etc.  It won't make any difference in a 32-bit machine, but in a 64-bit
      machine you could theoretically have terabytes of shared buffers.
      (How efficiently we could manage 'em remains to be seen.)  Similarly,
      num_temp_buffers, work_mem, and maintenance_work_mem can be set above
      2GB on a 64-bit machine.  Original patch from Koichi Suzuki, additional
      work by moi.
      0007490e
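A minimal sketch of overflow-checked `size_t` arithmetic of the sort this message describes. The helper names `checked_add_size()` and `checked_mul_size()` are illustrative assumptions, not the functions added by the patch, and the real code may report the error differently (here a boolean is returned).

```c
#include <stdbool.h>
#include <stddef.h>

/* Store a + b in *result; return false if the sum would overflow. */
bool checked_add_size(size_t a, size_t b, size_t *result)
{
    size_t sum = a + b;     /* unsigned arithmetic wraps, so this is defined */

    if (sum < a)            /* wrapped around: overflow */
        return false;
    *result = sum;
    return true;
}

/* Store a * b in *result; return false if the product would overflow. */
bool checked_mul_size(size_t a, size_t b, size_t *result)
{
    size_t product;

    if (a == 0 || b == 0)
    {
        *result = 0;
        return true;
    }
    product = a * b;
    if (product / b != a)   /* dividing back must recover the other factor */
        return false;
    *result = product;
    return true;
}
```

A caller sizing a buffer pool could compute `checked_mul_size(nbuffers, block_size, &total)` and refuse to start if it returns false, rather than silently allocating a truncated amount.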
  27. 11 Aug, 2005 1 commit
    • Tom Lane authored
      Autovacuum loose end mop-up. Provide autovacuum-specific vacuum cost
      delay and limit, both as global GUCs and as table-specific entries in
      pg_autovacuum.  stats_reset_on_server_start is now OFF by default,
      but a reset is forced if we did WAL replay.  XID-wrap vacuums do not
      ANALYZE, but do FREEZE if it's a template database.  Alvaro Herrera
      d90c5311
  28. 30 Jul, 2005 1 commit
  29. 29 Jul, 2005 3 commits
    • Tom Lane authored
      Clean up a number of autovacuum loose ends. Make the stats collector
      track shared relations in a separate hashtable, so that operations done
      from different databases are counted correctly.  Add proper support for
      anti-XID-wraparound vacuuming, even in databases that are never connected
      to and so have no stats entries.  Miscellaneous other bug fixes.
      Alvaro Herrera, some additional fixes by Tom Lane.
      5d5f1a79
    • Bruce Momjian authored
      Update O_DIRECT comment.
      c6b1724c
    • Bruce Momjian authored
      Use O_DIRECT if available when using O_SYNC for wal_sync_method.
      
      Also, write multiple WAL buffers out in one write() operation.
      
      ITAGAKI Takahiro
      
      ---------------------------------------------------------------------------
      
      > If we disable writeback-cache and use open_sync, the per-page writing
      > behavior in WAL module will show up as bad result. O_DIRECT is similar
      > to O_DSYNC (at least on linux), so that the benefit of it will disappear
      > behind the slow disk revolution.
      >
      > In the current source, WAL is written as:
      >     for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
      > Is this intentional? Can we rewrite it as follows?
      >    write(&buffers[0], N * BLCKSZ);
      >
      > In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff).
      > Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
      > These two patches are independent, so they can be applied either or both.
      >
      >
      > I tested them on my machine and the results as follows. It shows that
      > direct-io and gather-write is the best choice when writeback-cache is off.
      > Are these two patches worth trying if they are used together?
      >
      >
      >             | writeback | fsync= | fdata | open_ | fsync_ | open_
      > patch       | cache     |  false |  sync |  sync | direct | direct
      > ------------+-----------+--------+-------+-------+--------+---------
      > direct io   | off       |  124.2 | 105.7 |  48.3 |   48.3 |  48.2
      > direct io   | on        |  129.1 | 112.3 | 114.1 |  142.9 | 144.5
      > gather-write| off       |  124.3 | 108.7 | 105.4 |  (N/A) | (N/A)
      > both        | off       |  131.5 | 115.5 | 114.4 |  145.4 | 145.2
      >
      > - 20runs * pgbench -s 100 -c 50 -t 200
      >    - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
      > - using 2 ATA disks:
      >    - hda(reiserfs) includes system and wal.
      >    - hdc(jfs) includes database files. writeback-cache is always on.
      >
      > ---
      > ITAGAKI Takahiro
      c34bb005
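The two write patterns quoted in the message above can be sketched as follows. This is an illustration only: `PAGE_SZ` stands in for BLCKSZ, the function names are hypothetical, and the pages are assumed to be contiguous in memory (the quoted one-line `write()` call relies on the same assumption). Each function returns the number of `write()` calls that transferred data, so the saving is visible to the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#define PAGE_SZ 8192            /* stand-in for BLCKSZ (assumed value) */

/*
 * Write exactly len bytes, retrying after EINTR and short writes.
 * Returns the number of write() calls that transferred data, or -1 on error.
 */
static int write_fully(int fd, const char *buf, size_t len)
{
    int calls = 0;

    while (len > 0)
    {
        ssize_t n = write(fd, buf, len);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before writing: retry */
        if (n <= 0)
            return -1;          /* real error, or no progress */
        calls++;
        buf += n;
        len -= (size_t) n;
    }
    return calls;
}

/* Per-page pattern: at least one syscall for every page. */
int write_pages_one_by_one(int fd, const char *pages, size_t npages)
{
    int total = 0;

    for (size_t i = 0; i < npages; i++)
    {
        int calls = write_fully(fd, pages + i * PAGE_SZ, PAGE_SZ);

        if (calls < 0)
            return -1;
        total += calls;
    }
    return total;
}

/* Gathered pattern: one request covering all contiguous pages. */
int write_pages_gathered(int fd, const char *pages, size_t npages)
{
    return write_fully(fd, pages, npages * PAGE_SZ);
}
```

With synchronous open modes each `write()` waits for the device, which is presumably why the benchmark table above shows the per-page pattern suffering most when the write-back cache is off.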
  30. 23 Jul, 2005 2 commits
  31. 08 Jul, 2005 1 commit
  32. 05 Jul, 2005 1 commit
  33. 04 Jul, 2005 1 commit
    • Tom Lane authored
      Arrange for the postmaster (and standalone backends, initdb, etc) to
      chdir into PGDATA and subsequently use relative paths instead of absolute
      paths to access all files under PGDATA.  This seems to give a small
      performance improvement, and it should make the system more robust
      against naive DBAs doing things like moving a database directory that
      has a live postmaster in it.  Per recent discussion.
      eb5949d1
  34. 30 Jun, 2005 1 commit
    • Tom Lane authored
      Improve the checkpoint signaling mechanism so that the bgwriter can tell
      the difference between checkpoints forced due to WAL segment consumption
      and checkpoints forced for other reasons (such as CREATE DATABASE).  Avoid
      generating 'checkpoints are occurring too frequently' messages when the
      checkpoint wasn't caused by WAL segment consumption.  Per gripe from
      Chris K-L.
      401de9c8
  35. 29 Jun, 2005 1 commit
    • Tom Lane authored
      Clean up the rather historically encumbered interface to now() and
      current time: provide a GetCurrentTimestamp() function that returns
      current time in the form of a TimestampTz, instead of separate time_t
      and microseconds fields.  This is what all the callers really want
      anyway, and it eliminates low-level dependencies on AbsoluteTime,
      which is a deprecated datatype that will have to disappear eventually.
      b5f7cff8
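A rough sketch of folding separate seconds and microseconds fields into a single timestamp value, as this message describes. The representation used here (a 64-bit count of microseconds since 2000-01-01 00:00:00 UTC) is an assumption about one possible TimestampTz layout, and the names `timestamp_us`, `combine_timestamp()` and `current_timestamp()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Seconds from the Unix epoch (1970-01-01) to 2000-01-01 00:00:00 UTC. */
#define UNIX_TO_Y2K_SECS INT64_C(946684800)

/* Assumed representation: microseconds since 2000-01-01 UTC. */
typedef int64_t timestamp_us;

/* Fold a (seconds, microseconds) pair into one timestamp value. */
timestamp_us combine_timestamp(int64_t unix_secs, int32_t usecs)
{
    return (unix_secs - UNIX_TO_Y2K_SECS) * INT64_C(1000000) + usecs;
}

/* Read the clock once and return the combined value. */
timestamp_us current_timestamp(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return combine_timestamp((int64_t) tv.tv_sec, (int32_t) tv.tv_usec);
}
```

Callers that previously carried two fields around can then compare or subtract single values directly.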
  36. 19 Jun, 2005 1 commit
    • Tom Lane authored
      Simplify uses of readdir() by creating a function ReadDir() that
      includes error checking and an appropriate ereport(ERROR) message.
      This gets rid of rather tedious and error-prone manipulation of errno,
      as well as a Windows-specific bug workaround, at more than a dozen
      call sites.  After an idea in a recent patch by Heikki Linnakangas.
      3f749924
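A minimal sketch of a readdir() wrapper in the spirit of this message. The name `read_dir_checked()` is hypothetical, and printing to stderr followed by `exit()` merely stands in for ereport(ERROR). Checking for a NULL directory handle inside the wrapper is an assumption of this sketch. What the wrapper hides is the errno handling: readdir() returns NULL both at end of directory and on failure, so errno has to be cleared before the call and inspected afterwards.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the next directory entry, or NULL at end of directory.
 * Any failure is reported and terminates the program, so callers
 * never have to inspect errno themselves.
 */
struct dirent *read_dir_checked(DIR *dir, const char *dirname)
{
    struct dirent *entry;

    /* Assumed convenience: report a failed opendir() here as well. */
    if (dir == NULL)
    {
        fprintf(stderr, "could not open directory \"%s\": %s\n",
                dirname, strerror(errno));
        exit(EXIT_FAILURE);
    }

    errno = 0;                  /* distinguish end-of-directory from error */
    entry = readdir(dir);
    if (entry == NULL && errno != 0)
    {
        fprintf(stderr, "could not read directory \"%s\": %s\n",
                dirname, strerror(errno));
        exit(EXIT_FAILURE);
    }
    return entry;
}
```

A call site then reduces to `while ((de = read_dir_checked(dir, path)) != NULL) { ... }` with no errno bookkeeping of its own.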