1. 24 Jun, 2012 4 commits
    • Replace XLogRecPtr struct with a 64-bit integer. · 0ab9d1c4
      Heikki Linnakangas authored
      This simplifies code that needs to do arithmetic on XLogRecPtrs.
      
      To avoid changing on-disk format of data pages, the LSN on data pages is
      still stored in the old format. That should keep pg_upgrade happy. However,
      we have XLogRecPtrs embedded in the control file, and in the structs that
      are sent over the replication protocol, so this change breaks compatibility
      of pg_basebackup and server. I didn't do anything about this in this patch;
      per discussion on -hackers, the right thing to do would be to change the
      replication protocol to be architecture-independent, so that you could use
      a newer version of pg_receivexlog, for example, against an older server
      version.
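A minimal sketch of why a single 64-bit integer makes the arithmetic simpler. The type and function names here (`OldXLogRecPtr`, `XLogRecPtr64`, `old_to_new`, `new_to_old`, `wal_distance`) are illustrative and not taken from the patch, and the sketch assumes the two 32-bit halves map to the high and low words of the 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Old-style pointer: two 32-bit halves (assumed layout, for illustration). */
typedef struct
{
    uint32_t xlogid;   /* high 32 bits */
    uint32_t xrecoff;  /* low 32 bits */
} OldXLogRecPtr;

/* New-style pointer: one 64-bit byte position in the WAL stream. */
typedef uint64_t XLogRecPtr64;

/* Convert the two-field form into a 64-bit position. */
static XLogRecPtr64 old_to_new(OldXLogRecPtr p)
{
    return ((uint64_t) p.xlogid << 32) | p.xrecoff;
}

/* Convert back, e.g. for a place that still stores the old layout. */
static OldXLogRecPtr new_to_old(XLogRecPtr64 p)
{
    OldXLogRecPtr r;

    r.xlogid = (uint32_t) (p >> 32);
    r.xrecoff = (uint32_t) p;
    return r;
}

/* With a plain integer, the byte distance is a single subtraction. */
static uint64_t wal_distance(XLogRecPtr64 from, XLogRecPtr64 to)
{
    assert(to >= from);
    return to - from;
}
```

With the struct form, the same distance calculation has to handle a carry between the two fields; with the integer form, comparisons and offsets use ordinary operators.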
    • Allow WAL record header to be split across pages. · 061e7efb
      Heikki Linnakangas authored
      This saves a few bytes of WAL space, but the real motivation is to make it
      predictable how much WAL space a record requires, as it no longer depends
      on whether we need to waste the last few bytes at end of WAL page because
      the header doesn't fit.
      
      The total length field of WAL record, xl_tot_len, is moved to the beginning
      of the WAL record header, so that it is still always found on the first page
      where a WAL record begins.
      
      Bump WAL version number again as this is an incompatible change.
    • Move WAL continuation record information to WAL page header. · 20ba5ca6
      Heikki Linnakangas authored
      The continuation record only contained one field, xl_rem_len, so it makes
      things simpler to just include it in the WAL page header. This wastes four
      bytes on pages that don't begin with a continuation from previous page, plus
      four bytes on every page, because of padding.
      
      The motivation of this is to make it easier to calculate how much space a
      WAL record needs. Before this patch, it depended on how many page boundaries
      the record crosses. The motivation of that, in turn, is to separate the
      allocation of space in the WAL from the copying of the record data to the
      allocated space. Keeping the calculation of space required simple helps to
      keep the critical section of allocating the space from WAL short. But that's
      not included in this patch yet.
      
      Bump WAL version number again, as this is an incompatible change.
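A rough sketch of the kind of space calculation this change is meant to simplify: once every page carries the same fixed header and there is no separate continuation record, the WAL space a record consumes depends only on its length and the free space left on the current page. The constants `PAGE_SIZE` and `PAGE_HDR_SIZE` and the function `wal_bytes_needed` are hypothetical, and the sketch ignores alignment padding and any larger header at the start of a segment.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     8192u  /* assumed WAL page size */
#define PAGE_HDR_SIZE 24u    /* hypothetical fixed per-page header size */

/*
 * Bytes of WAL consumed by a record of rec_len bytes when free_on_page
 * bytes remain on the current page. Every later page costs one fixed
 * header, whether the record fills it or not.
 */
static uint64_t wal_bytes_needed(uint32_t free_on_page, uint32_t rec_len)
{
    const uint32_t usable = PAGE_SIZE - PAGE_HDR_SIZE;
    uint32_t rest;
    uint32_t full_pages;
    uint32_t tail;
    uint64_t total;

    assert(free_on_page <= usable);

    if (rec_len <= free_on_page)
        return rec_len;              /* fits on the current page */

    rest = rec_len - free_on_page;   /* bytes that spill onto later pages */
    full_pages = rest / usable;      /* pages filled completely */
    tail = rest % usable;            /* bytes on the last, partial page */

    total = free_on_page + (uint64_t) full_pages * PAGE_SIZE;
    if (tail > 0)
        total += PAGE_HDR_SIZE + tail;
    return total;
}
```

Because the result is pure arithmetic on two numbers, it could be computed before the record data is copied, which is the separation the commit message describes as the longer-term goal.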
    • Don't waste the last segment of each 4GB logical log file. · dfda6eba
      Heikki Linnakangas authored
      The comments claimed that wasting the last segment made it easier to do
      calculations with XLogRecPtrs, because you don't have problems representing
      last-byte-position-plus-1 that way. In my experience, however, it only made
      things more complicated, because there were two ways to represent the
      boundary at the beginning of a logical log file: xlogid = n+1 and xrecoff = 0,
      or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were
      picky about which representation was used.
      
      Also, use a 64-bit segment number instead of the log/seg combination, to
      point to a certain WAL segment. We assume that all platforms have a working
      64-bit integer type nowadays.
      
      This is an incompatible change in WAL format, so bumping WAL version number.
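A small sketch of what a 64-bit segment number looks like once no segment is wasted. It assumes 16 MB segments, so that exactly 256 segments fit in each 4GB logical log file; `SEG_SIZE`, `SEGS_PER_LOGID`, `segno_from_log_seg`, and `segno_from_pos` are illustrative names, not identifiers from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_SIZE       (UINT64_C(16) * 1024 * 1024)        /* assumed segment size */
#define SEGS_PER_LOGID (UINT64_C(0x100000000) / SEG_SIZE)  /* 256 when none is wasted */

/* Map an old-style (log, seg) pair to a single 64-bit segment number. */
static uint64_t segno_from_log_seg(uint32_t log, uint32_t seg)
{
    assert(seg < SEGS_PER_LOGID);
    return (uint64_t) log * SEGS_PER_LOGID + seg;
}

/* Map a 64-bit WAL byte position to the segment that contains it. */
static uint64_t segno_from_pos(uint64_t pos)
{
    return pos / SEG_SIZE;
}
```

With every segment in use, the start of logical log file n+1 has exactly one representation, and the segment number is a plain division of the byte position.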
  2. 22 Jun, 2012 2 commits
  3. 21 Jun, 2012 5 commits
    • Make placeholders in SQL command help more consistent and precise · 6753ced3
      Peter Eisentraut authored
      To avoid divergent names on related pages, avoid ambiguities, and
      reduce translation work a little.
    • Fix memory leak in ARRAY(SELECT ...) subqueries. · d14241c2
      Tom Lane authored
      Repeated execution of an uncorrelated ARRAY_SUBLINK sub-select (which
      I think can only happen if the sub-select is embedded in a larger,
      correlated subquery) would leak memory for the duration of the query,
      due to not reclaiming the array generated in the previous execution.
      Per bug #6698 from Armando Miraglia.  Diagnosis and fix idea by Heikki,
      patch itself by me.
      
      This has been like this all along, so back-patch to all supported versions.
    • Alvaro Herrera's avatar
      68d0e3cb
    • Add a small cache of locks owned by a resource owner in ResourceOwner. · eeb6f37d
      Heikki Linnakangas authored
      This speeds up reassigning locks to the parent owner, when the transaction
      holds a lot of locks, but only a few of them belong to the current resource
      owner. This particularly helps pg_dump when dumping a large number of
      objects.
      
      The cache can hold up to 15 locks in each resource owner. After that, the
      cache is marked as overflowed, and we fall back to the old method of
      scanning the whole local lock table. The tradeoff here is that the cache has
      to be scanned whenever a lock is released, so if the cache is too large,
      lock release becomes more expensive. 15 seems enough to cover pg_dump, and
      doesn't have much impact on lock release.
      
      Jeff Janes, reviewed by Amit Kapila and Heikki Linnakangas.
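The cache-with-overflow idea described above can be sketched roughly as follows. This is an illustration only: `LockCache`, `cache_remember`, `cache_forget`, and `cache_overflowed` are hypothetical names, locks are represented by opaque pointers, and the fallback scan of the full local lock table is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CACHED_LOCKS 15  /* capacity quoted in the commit message */

typedef struct
{
    int         nlocks;                   /* > MAX_CACHED_LOCKS means overflowed */
    const void *locks[MAX_CACHED_LOCKS];  /* locks owned by this resource owner */
} LockCache;

static bool cache_overflowed(const LockCache *c)
{
    return c->nlocks > MAX_CACHED_LOCKS;
}

/* Record a newly acquired lock, or mark the cache overflowed when full. */
static void cache_remember(LockCache *c, const void *lock)
{
    assert(lock != NULL);
    if (cache_overflowed(c))
        return;                           /* already given up tracking */
    if (c->nlocks < MAX_CACHED_LOCKS)
        c->locks[c->nlocks] = lock;
    c->nlocks++;                          /* reaching MAX + 1 marks overflow */
}

/*
 * Drop a released lock from the cache. Returns false when the cache
 * cannot answer; once overflowed, the caller has to fall back to
 * scanning the whole lock table.
 */
static bool cache_forget(LockCache *c, const void *lock)
{
    if (cache_overflowed(c))
        return false;
    for (int i = c->nlocks - 1; i >= 0; i--)
    {
        if (c->locks[i] == lock)
        {
            c->locks[i] = c->locks[c->nlocks - 1];  /* fill the hole */
            c->nlocks--;
            return true;
        }
    }
    return false;
}
```

The linear scan in `cache_forget` is the cost the commit message refers to: it runs on every lock release, which is why the capacity is kept small.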
    • Remove incomplete/incorrect support for zero-column foreign keys. · dfd9c116
      Tom Lane authored
      The original coding in ri_triggers.c had partial support for the concept of
      zero-column foreign key constraints.  But this is not defined in the SQL
      standard, nor was it ever allowed by any other part of Postgres, nor was it
      very fully implemented even here (e.g., there was no support for preventing
      PK-table deletions that would violate the constraint).  Doesn't seem very
      useful to carry 100-plus lines of code for a corner case that no one is
      interested in making work.  Instead, just add a check that the column list
      read from pg_constraint is non-empty.
  4. 20 Jun, 2012 3 commits
    • Increase MAX_SYSCACHE_CALLBACKS from 20 to 32. · 0ce4459a
      Tom Lane authored
      By my count there are 18 callers of CacheRegisterSyscacheCallback in the
      core code in HEAD, so we are potentially leaving as few as 2 slots for any
      add-on code to use (though possibly not all these callers would actually
      activate in any particular session).  That doesn't seem like a lot of
      headroom, so let's pump it up a little.
    • Cache the results of ri_FetchConstraintInfo in a backend-local cache. · 45ba424f
      Tom Lane authored
      Extracting data from pg_constraint turned out to take as much as 10% of the
      runtime in a bulk-update case where the foreign key column wasn't changing,
      because we did it over again for each tuple.  Fix that by maintaining a
      backend-local cache of the results.  This is really a pretty small patch,
      but converting the trigger functions to work with pointers rather than
      local struct variables requires a lot of mechanical changes.
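A simplified sketch of a backend-local cache of this kind. It is not the patch's implementation: it swaps in a small direct-mapped array for whatever lookup structure the real code uses, and `ConstraintInfo`, `load_from_catalog`, `fetch_constraint_info`, `invalidate_constraint_info`, and the `catalog_reads` counter are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;

typedef struct
{
    Oid  oid;     /* constraint this entry describes */
    bool valid;   /* false until loaded, or after invalidation */
    int  nkeys;   /* stand-in for the data extracted from the catalog */
} ConstraintInfo;

#define CACHE_SLOTS 64

static ConstraintInfo cache[CACHE_SLOTS];
static int catalog_reads;  /* counts expensive lookups, for illustration */

/* Hypothetical stand-in for the expensive catalog extraction. */
static void load_from_catalog(Oid oid, ConstraintInfo *out)
{
    assert(out != NULL);
    catalog_reads++;
    out->oid = oid;
    out->nkeys = 1;
    out->valid = true;
}

/* Return cached info, reading the catalog only on a miss. */
static const ConstraintInfo *fetch_constraint_info(Oid oid)
{
    ConstraintInfo *slot = &cache[oid % CACHE_SLOTS];

    if (!slot->valid || slot->oid != oid)
        load_from_catalog(oid, slot);
    return slot;
}

/* Forget an entry, e.g. after the constraint definition changes. */
static void invalidate_constraint_info(Oid oid)
{
    ConstraintInfo *slot = &cache[oid % CACHE_SLOTS];

    if (slot->valid && slot->oid == oid)
        slot->valid = false;
}
```

Callers receive a pointer into the cache instead of filling a local struct on each call, which matches the commit's remark about converting the trigger functions to work with pointers.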
    • Improve tests for whether we can skip queueing RI enforcement triggers. · cfa0f425
      Tom Lane authored
      During an update of a PK row, we can skip firing the RI trigger if any old
      key value is NULL, because then the row could not have had any matching
      rows in the FK table.  Conversely, during an update of an FK row, the
      outcome is determined if any new key value is NULL.  In either case it
      becomes unnecessary to compare individual key values.
      
      This patch was inspired by discussion of Vik Reykja's patch to use IS NOT
      DISTINCT semantics for the key comparisons.  In the event there is no need
      for that and so this patch looks nothing like his, but he should still get
      credit for having re-opened consideration of the trigger skip logic.
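The two NULL tests described above can be sketched as follows. The function names are hypothetical, and key values are reduced to an array of is-null flags; the real trigger code works on tuples and constraint metadata.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if at least one of the nkeys key columns is NULL. */
static bool any_key_is_null(const bool *isnull, int nkeys)
{
    assert(isnull != NULL && nkeys > 0);
    for (int i = 0; i < nkeys; i++)
    {
        if (isnull[i])
            return true;
    }
    return false;
}

/*
 * PK-side update: if any old key value was NULL, no FK row could have
 * matched the old row, so the trigger need not be queued.
 */
static bool pk_update_can_skip_trigger(const bool *old_isnull, int nkeys)
{
    return any_key_is_null(old_isnull, nkeys);
}

/*
 * FK-side update: if any new key value is NULL, the outcome is already
 * determined without comparing individual key values.
 */
static bool fk_update_outcome_is_determined(const bool *new_isnull, int nkeys)
{
    return any_key_is_null(new_isnull, nkeys);
}
```

Both tests read only the null flags, so they can run before any per-column value comparison is attempted.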
  5. 19 Jun, 2012 5 commits
  6. 18 Jun, 2012 6 commits
    • Allow ON UPDATE/DELETE SET DEFAULT plans to be cached. · e8c9fd5f
      Tom Lane authored
      Once upon a time, somebody was worried that cached RI plans wouldn't get
      remade with new default values after ALTER TABLE ... SET DEFAULT, so they
      didn't allow caching of plans for ON UPDATE/DELETE SET DEFAULT actions.
      That time is long gone, though (and even at the time I doubt this was the
      greatest hazard posed by ALTER TABLE...).  So allow these triggers to cache
      their plans just like the others.
      
      The cache_plan argument to ri_PlanCheck is now vestigial, since there
      are no callers that don't pass "true"; but I left it alone in case there
      is any future need for it.
    • Remove derived fields from RI_QueryKey, and do a bit of other cleanup. · 03a5ba24
      Tom Lane authored
      We really only need the foreign key constraint's OID and the query type
      code to uniquely identify each plan we are caching for FK checks.  The
      other stuff that was in the struct had no business being used as part of
      a hash key, and was all just being copied from struct RI_ConstraintInfo
      anyway.  Get rid of the unnecessary fields, and readjust various function
      APIs to make them use RI_ConstraintInfo not RI_QueryKey as info source.
      
      I'd be surprised if this makes any measurable performance difference,
      but it certainly feels cleaner.
    • Peter Eisentraut's avatar
      e1e97e93
    • Update SQL spec references in ri_triggers code to match SQL:2008. · f9429746
      Tom Lane authored
      Now that what we're implementing isn't SQL92, we probably shouldn't cite
      chapter and verse in that spec anymore.  Also fix some comments that
      talked about MATCH FULL but in fact were in code that's also used for
      MATCH SIMPLE.
      
      No code changes in this commit, just comments.
    • Change ON UPDATE SET NULL/SET DEFAULT referential actions to meet SQL spec. · c75be2ad
      Tom Lane authored
      Previously, when executing an ON UPDATE SET NULL or SET DEFAULT action for
      a multicolumn MATCH SIMPLE foreign key constraint, we would set only those
      referencing columns corresponding to referenced columns that were changed.
      This is what the SQL92 standard said to do --- but more recent versions
      of the standard say that all referencing columns should be set to null or
      their default values, no matter exactly which referenced columns changed.
      At least for SET DEFAULT, that is clearly saner behavior.  It's somewhat
      debatable whether it's an improvement for SET NULL, but it appears that
      other RDBMS systems read the spec this way.  So let's do it like that.
      
      This is a release-notable behavioral change, although considering that
      our documentation already implied it was done this way, the lack of
      complaints suggests few people use such cases.
    • Refer to the default foreign key match style as MATCH SIMPLE internally. · f5297bdf
      Tom Lane authored
      Previously we followed the SQL92 wording, "MATCH <unspecified>", but since
      SQL99 there's been a less awkward way to refer to the default style.
      
      In addition to the code changes, pg_constraint.confmatchtype now stores
      this match style as 's' (SIMPLE) rather than 'u' (UNSPECIFIED).  This
      doesn't affect pg_dump or psql because they use pg_get_constraintdef()
      to reconstruct foreign key definitions.  But other client-side code might
      examine that column directly, so this change will have to be marked as
      an incompatibility in the 9.3 release notes.
  7. 17 Jun, 2012 3 commits
    • Make documentation of --help and --version options more consistent · bb7520cc
      Peter Eisentraut authored
      Before, some places didn't document the short options (-? and -V),
      some documented both, some documented nothing, and they were listed in
      various orders.  Now this is hopefully more consistent and complete.
    • Fix stats collector to recover nicely when system clock goes backwards. · 9e18eacb
      Tom Lane authored
      Formerly, if the system clock went backwards, the stats collector would
      fail to update the stats file any more until the clock reading again
      exceeded whatever timestamp was last written into the stats file.  Such
      glitches in the clock's behavior are not terribly unlikely on machines
      not using NTP.  Such a scenario has been observed to cause regression test
      failures in the buildfarm, and it could have bad effects on the behavior
      of autovacuum, so it seems prudent to install some defenses.
      
      We could directly detect the clock going backwards by adding
      GetCurrentTimestamp calls in the stats collector's main loop, but that
      would hurt performance on platforms where GetCurrentTimestamp is expensive.
      To minimize the performance hit in normal cases, adopt a more complicated
      scheme wherein backends check for clock skew when reading the stats file,
      and if they see it, signal the stats collector by sending an extra stats
      inquiry message.  The stats collector does an extra GetCurrentTimestamp
      only when it receives an inquiry with an apparently out-of-order
      timestamp.
      
      To avoid unnecessary GetCurrentTimestamp calls, expand the inquiry messages
      to carry the backend's current clock reading as well as its stats cutoff
      time.  The latter, being intentionally slightly in-the-past, would trigger
      more clock rechecks than we need if it were used for this purpose.
      
      We might want to backpatch this change at some point, but let's let it
      shake out in the buildfarm for a while first.
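The division of labor described above can be sketched as three small checks. This is a simplification under stated assumptions: timestamps are plain 64-bit integers, and `InquiryMsg`, `backend_sees_clock_skew`, `inquiry_needs_clock_check`, and `clock_went_backwards` are hypothetical names, not the functions in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t Timestamp;  /* assumed: monotonically comparable integer time */

/* Inquiry message carrying both values, as the commit message describes. */
typedef struct
{
    Timestamp clock_time;   /* backend's current clock reading */
    Timestamp cutoff_time;  /* oldest stats file the backend will accept */
} InquiryMsg;

/*
 * Backend side: a stats file stamped later than the backend's own clock
 * suggests skew, so the backend sends an extra inquiry.
 */
static bool backend_sees_clock_skew(Timestamp file_time, Timestamp backend_now)
{
    return file_time > backend_now;
}

/*
 * Collector side, cheap test: only an inquiry whose clock reading is
 * older than the last write looks out of order. The cutoff time is
 * deliberately in the past, so using it here would fire too often.
 */
static bool inquiry_needs_clock_check(Timestamp last_statwrite,
                                      const InquiryMsg *msg)
{
    assert(msg != NULL);
    return msg->clock_time < last_statwrite;
}

/*
 * Collector side, expensive test: run only after the cheap test fires,
 * with "now" obtained from a fresh clock read.
 */
static bool clock_went_backwards(Timestamp now, Timestamp last_statwrite)
{
    return now < last_statwrite;
}
```

In this arrangement the collector reads the clock only when `inquiry_needs_clock_check` returns true, which keeps the normal path free of extra clock calls.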
    • Reorder basebackup options, to list pg_basebackup first · 920febda
      Magnus Hagander authored
      Since this is the easy way of doing it, it should be listed first. All
      the old information is retained for those who want the more advanced way.
      
      Also adds a subheading for compressing logs, which seems to have been missing.
  8. 16 Jun, 2012 3 commits
  9. 15 Jun, 2012 2 commits
  10. 14 Jun, 2012 7 commits