1. 04 Aug, 2010 4 commits
  2. 03 Aug, 2010 12 commits
  3. 02 Aug, 2010 6 commits
  4. 01 Aug, 2010 5 commits
    • Fix ANALYZE's ancient deficiency of not trying to collect stats for expression · 67becf8d
      Tom Lane authored
      indexes when the index column type (the opclass opckeytype) is different from
      the expression's datatype.  When coded, this limitation wasn't worth worrying
      about because we had no intelligence to speak of in stats collection for the
      datatypes used by such opclasses.  However, now that there's non-toy
      estimation capability for tsvector queries, it amounts to a bug that ANALYZE
      fails to do this.
      
      The fix changes struct VacAttrStats, and therefore constitutes an API break
      for custom typanalyze functions.  Therefore we can't back-patch it into
      released branches, but it was agreed that 9.0 isn't yet frozen hard enough
      to make such a change unacceptable.  Ergo, back-patch to 9.0 but no further.
      The API break had better be mentioned in 9.0 release notes.
    • Add some knowledge about prefix matches to tsmatchsel(). It's not terribly · 97532f7c
      Tom Lane authored
      bright, but it beats assuming that a prefix match behaves identically to an
      exact match, which is what the code was doing before :-(.  Noted while
      experimenting with Artur Dabrowski's example.
    • Fix an additional set of problems in GIN's handling of lossy page pointers. · d4fe61b0
      Tom Lane authored
      Although the key-combining code claimed to work correctly if its input
      contained both lossy and exact pointers for a single page in a single TID
      stream, in fact this did not work, and could not work without pretty
      fundamental redesign.  Modify keyGetItem so that it will not return such a
      stream, by handling lossy-pointer cases a bit more explicitly than we did
      before.
      
      Per followup investigation of a gripe from Artur Dabrowski.
      An example of a query that failed given his data set is
      select count(*) from search_tab where
      (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* | dd:*')) and
      (to_tsvector('german', keywords ) @@ to_tsquery('german', 'aa:*'));
      
      Back-patch to 8.4 where the lossy pointer code was introduced.
    • Rewrite the rbtree routines so that an RBNode is the first field of the · 0454f131
      Tom Lane authored
      struct representing a tree entry, rather than being a separately allocated
      piece of storage.  This API is at least as clean as the old one (if not
      more so --- there were some bizarre choices in there) and it permits a
      very substantial memory savings, on the order of 2X in ginbulk.c's usage.
      
      Also, fix minor memory leaks in code called by ginEntryInsert, in
      particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert
      itself.  These leaks resulted in the GIN index build context continuing
      to bloat even after we'd filled it to maintenance_work_mem and started
      to dump data out to the index.
      
      In combination these fixes restore the GIN index build code to honoring
      the maintenance_work_mem limit about as well as it did in 8.4.  Speed
      seems on par with 8.4 too, maybe even a bit faster, for a non-pathological
      case in which HEAD was formerly slower.
      
      Back-patch to 9.0 so we don't have a performance regression from 8.4.
    • Make psql distinguish between unique indices and unique constraints. · afc2900f
      Robert Haas authored
      Josh Kupershmidt.  Reviewing and kibitzing by Kevin Grittner and me.
  5. 31 Jul, 2010 2 commits
    • Tweak tsmatchsel() so that it examines the structure of the tsquery whenever · b8c798eb
      Tom Lane authored
      possible (ie, whenever the tsquery is a constant), even when no statistics
      are available for the tsvector.  For example, foo @@ 'a & b'::tsquery
      can be expected to be more selective than foo @@ 'a'::tsquery, whether
      or not we know anything about foo.  We use DEFAULT_TS_MATCH_SEL as the assumed
      selectivity of individual query terms when no stats are available, then
      combine the terms according to the query's AND/OR structure as usual.
      
      Per experimentation with Artur Dabrowski's example.  (The fact that there
      are no stats available in that example is a problem in itself, but
      nonetheless tsmatchsel should be smarter about the case.)
      
      Back-patch to 8.4 to keep all versions of tsmatchsel() in sync.
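The term-combining rule described in this commit can be sketched roughly as follows. This is a minimal illustration in Python, not the C code in tsmatchsel(): the `Term`, `And`, and `Or` classes, the `selectivity` helper, the 0.005 value assigned to DEFAULT_TS_MATCH_SEL, and the independence-based AND/OR formulas are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

# Fallback selectivity for a single query term when no stats exist.
# The commit names DEFAULT_TS_MATCH_SEL; the value used here is an assumption.
DEFAULT_TS_MATCH_SEL = 0.005


@dataclass
class Term:
    lexeme: str


@dataclass
class And:
    left: "Node"
    right: "Node"


@dataclass
class Or:
    left: "Node"
    right: "Node"


Node = Union[Term, And, Or]


def selectivity(node: Node, term_freq: Optional[Dict[str, float]] = None) -> float:
    """Estimate the fraction of rows matching a constant query tree.

    term_freq maps lexemes to observed frequencies; None means no statistics
    are available, so every term gets the default selectivity.
    """
    if isinstance(node, Term):
        if term_freq is None:
            return DEFAULT_TS_MATCH_SEL
        # Hypothetical handling of lexemes missing from the statistics.
        return term_freq.get(node.lexeme, DEFAULT_TS_MATCH_SEL)
    left = selectivity(node.left, term_freq)
    right = selectivity(node.right, term_freq)
    if isinstance(node, And):
        # Assumes the two sides are independent.
        return left * right
    # OR: inclusion-exclusion under the same independence assumption.
    return left + right - left * right
```

Under these assumptions, `selectivity(And(Term("a"), Term("b")))` comes out smaller than `selectivity(Term("a"))` even with no statistics, which is the behavior the commit message asks for.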
    • Rewrite the key-combination logic in GIN's keyGetItem() and scanGetItem() · 2ab57e08
      Tom Lane authored
      routines to make them behave better in the presence of "lossy" index pointers.
      The previous coding was outright incorrect for some cases, as recently
      reported by Artur Dabrowski: scanGetItem would fail to return index entries in
      cases where one index key had multiple exact pointers on the same page as
      another key had a lossy pointer.  Also, keyGetItem was extremely inefficient
      for cases where a single index key generates multiple "entry" streams, such as
      an @@ operator with a multiple-clause tsquery.  The presence of a lossy page
      pointer in any one stream defeated its ability to use the opclass
      consistentFn, resulting in probing many heap pages that didn't really need to
      be visited.  In Artur's example case, a query like
      	WHERE tsvector @@ to_tsquery('a & b')
      was about 50X slower than the theoretically equivalent
      	WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b')
      The way that I chose to fix this was to have GIN call the consistentFn
      twice, once with TRUE and once with FALSE for the in-doubt entry stream,
      returning a hit if either call produces TRUE, but not if they both return
      FALSE.  The code handles this for the case of a single in-doubt entry stream,
      but punts (falling back to the stupid behavior) if there's more than one lossy
      reference to the same page.  The idea could be scaled up to deal with multiple
      lossy references, but I think that would probably be wasted complexity.  At
      least to judge by Artur's example, such cases don't occur often enough to be
      worth trying to optimize.
      
      Back-patch to 8.4.  8.3 did not have lossy GIN index pointers, so it is
      not subject to these problems.
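The two-call trick described in this commit can be illustrated with a small Python sketch. It is not the GIN code itself: `page_may_match`, the use of `None` to mark an in-doubt (lossy) entry stream, and the choice to report a possible match when punting are assumptions made for the example.

```python
from typing import Callable, List, Optional, Sequence

# A consistent function takes one boolean per entry stream and says whether
# the indexed item can satisfy the query.
ConsistentFn = Callable[[Sequence[bool]], bool]


def page_may_match(check: Sequence[Optional[bool]], consistent: ConsistentFn) -> bool:
    """Decide whether a heap page needs to be visited.

    check[i] is True or False when stream i is known exactly, and None when
    only a lossy page pointer is available for it (in doubt).
    """
    in_doubt: List[int] = [i for i, known in enumerate(check) if known is None]
    if not in_doubt:
        return bool(consistent(list(check)))
    if len(in_doubt) > 1:
        # Punt: more than one lossy reference, so conservatively visit the page.
        return True
    position = in_doubt[0]
    for guess in (True, False):
        trial = list(check)
        trial[position] = guess
        if consistent(trial):
            # At least one possible value yields a hit.
            return True
    # Both guesses failed, so the page cannot contain a match.
    return False
```

For an AND-style query (`all`), a page with one in-doubt stream and one stream known to be FALSE is skipped, since neither guess can make the query true.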
  6. 30 Jul, 2010 1 commit
  7. 29 Jul, 2010 10 commits
    • Improved version of patch to protect pg_get_expr() against misuse: · f223bb7a
      Tom Lane authored
      look through join alias Vars to avoid breaking join queries, and
      move the test to someplace where it will catch more possible ways
      of calling a function.  We still ought to throw away the whole thing
      in favor of a data-type-based solution, but that's not feasible in
      the back branches.
      
      This needs to be back-patched further than 9.0, but I don't have time
      to do so today.  Committing now so that the fix gets into 9.0beta4.
    • Rename asyncCommitLSN to asyncXactLSN to reflect changed role in 9.0. · 5b8bd052
      Simon Riggs authored
      Transaction aborts now record their LSN to avoid corner case
      behaviour in SR/HS, hence change of name of variables and functions.
      As pointed out by Fujii Masao. Cosmetic changes only.
    • Tom Lane's avatar
    • Avoid using text_to_cstring() in levenshtein functions. · 980341b3
      Robert Haas authored
      Operating directly on the underlying varlena saves palloc and memcpy
      overhead, which testing shows to be significant.
      
      Extracted from a larger patch by Alexander Korotkov.
    • Clean up some inconsistencies in the volatility marking of various I/O · aab353a6
      Tom Lane authored
      related functions.  Per today's discussion, we will henceforth assume
      that datatype I/O functions are either stable or immutable, never volatile.
      (This implies in particular that domain CHECK constraint expressions shouldn't
      be volatile, since domain_in executes them.)  In turn, functions that execute
      the I/O functions of arbitrary datatypes should always be labeled stable.
      This affects the labeling of array_to_string, which was unsafely marked
      immutable, and record_in, record_out, record_recv, record_send,
      domain_in, domain_recv, which were over-conservatively marked volatile.
      The array I/O functions were already marked stable, which is correct
      per this policy but would have been wrong if we maintained domain_in
      as volatile.
      
      Back-patch to 9.0, along with an earlier fix to correctly mark cash_in
      and cash_out as stable not immutable (since they depend on lc_monetary).
      
      No catversion bump --- the implications of this are not currently
      severe enough to justify a forced initdb.
    • Fix indentation of verbatim block elements · 66424a28
      Peter Eisentraut authored
      Block elements with verbatim formatting (literallayout, programlisting,
      screen, synopsis) should be aligned at column 0 independent of the surrounding
      SGML, because whitespace is significant, and indenting them creates erratic
      whitespace in the output.  The CSS stylesheets already take care of indenting
      the output.
      
      Assorted markup improvements to go along with it.
    • Fix another longstanding problem in copy_relation_data: it was blithely · 984d56b8
      Tom Lane authored
      assuming that a local char[] array would be aligned on at least a word
      boundary.  There are architectures on which that is pretty much guaranteed to
      NOT be the case ... and those arches also don't like non-aligned memory
      accesses, meaning that log_newpage() would crash if it ever got invoked.
      Even on Intel-ish machines there's a potential for a large performance penalty
      from doing I/O to an inadequately aligned buffer.  So palloc it instead.
      
      Backpatch to 8.0 --- 7.4 doesn't have this code.
    • Work around a documentation toolchain problem by replacing the "AIX-fixlevels" · 5b48e2ec
      Tom Lane authored
      table with a <variablelist> carrying the same information.  Previously the
      9.0 documentation was failing to build as a US-size PDF file.  It's quite
      obscure what the real problem is or why this avoids it, but we need a hack
      now so we can build docs for beta4.
      
      In passing do a bit of editing in the AIX installation docs, in particular
      remove a long-obsolete claim that the regression tests are likely to fail.
    • Fix possible page corruption by ALTER TABLE .. SET TABLESPACE. · 1a078629
      Robert Haas authored
      If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
      set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
      will do the same thing during replay, so the corruption propagates to any
      standby.  Note, however, that the bug can't be demonstrated unless archiving
      is enabled, since without archiving we skip WAL logging altogether, and the
      LSN/TLI are not set.
      
      Back-patch to 8.0; prior releases do not have tablespaces.
      
      Analysis and patch by Jeff Davis.  Adjustments for back-branches and minor
      wordsmithing by me.
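One way to picture the problem is a copy loop that stamps an LSN into every page it copies. The Python sketch below shows a guard that leaves all-zero pages untouched. The commit message does not spell out the actual change, so this is only an assumed shape of such a fix: `page_is_new`, `copy_page`, the 8192-byte page size, and the layout with the LSN in the first 8 bytes are hypothetical and chosen for illustration.

```python
import struct

PAGE_SIZE = 8192  # assumed page size for the example


def page_is_new(page: bytes) -> bool:
    """Treat a page consisting entirely of zero bytes as never initialized."""
    return not any(page)


def copy_page(src: bytes, lsn: int) -> bytearray:
    """Copy one page, stamping the LSN only if the page is initialized.

    Hypothetical layout: the first 8 bytes of an initialized page hold its LSN.
    Writing an LSN into a zeroed page would make it look partly initialized.
    """
    page = bytearray(src)
    if not page_is_new(page):
        struct.pack_into("<Q", page, 0, lsn)
    return page
```

With this guard, a zeroed source page is still all zeros after the copy, while an initialized page carries the new LSN.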
    • Add explicit regression tests for ALTER TABLE lock levels. · 04e17bae
      Simon Riggs authored
      Use this to catch a couple of lock level assignments that slipped
      through manual testing, per Peter Eisentraut.