1. 17 Dec, 2010 4 commits
  2. 16 Dec, 2010 11 commits
  3. 15 Dec, 2010 2 commits
  4. 14 Dec, 2010 4 commits
  5. 13 Dec, 2010 4 commits
  6. 12 Dec, 2010 2 commits
  7. 11 Dec, 2010 6 commits
  8. 10 Dec, 2010 2 commits
    • Use symbolic names not octal constants for file permission flags. · 04f4e10c
      Tom Lane authored
      Purely cosmetic patch to make our coding standards more consistent:
      we were using symbolic names in some places and octal constants in
      others.  This patch fixes all C-coded uses of mkdir, chmod, and umask.
      There might be some other calls I missed.  Inconsistency noted while
      researching tablespace directory permissions issue.
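As an illustration of the convention described in this commit, here is a minimal sketch using the POSIX mode macros from `<sys/stat.h>`. The macro names `DIR_MODE_OWNER_ONLY` and `FILE_MODE_OWNER_ONLY` and both helper functions are hypothetical and are not taken from the PostgreSQL source.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Owner may read, write, and search; same bits as octal 0700. */
#define DIR_MODE_OWNER_ONLY   S_IRWXU

/* Owner may read and write; same bits as octal 0600. */
#define FILE_MODE_OWNER_ONLY  (S_IRUSR | S_IWUSR)

/* Hypothetical helper: create a directory only its owner can use. */
int
make_private_dir(const char *path)
{
    return mkdir(path, DIR_MODE_OWNER_ONLY);
}

/* Hypothetical helper: mask off group and other bits, as umask(0077) would. */
mode_t
restrict_umask(void)
{
    return umask(S_IRWXG | S_IRWXO);
}
```

The symbolic form states which class of user gets which access, so a reader does not have to decode the octal digits.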
    • Fix efficiency problems in tuplestore_trim(). · 244407a7
      Tom Lane authored
      The original coding in tuplestore_trim() was only meant to work efficiently
      in cases where each trim call deleted most of the tuples in the store,
      which was in fact the pattern of the original usage: a Material node
      supporting mark/restore operations underneath a MergeJoin.  However,
      WindowAgg now uses tuplestores and it has considerably less friendly
      trimming behavior.  In particular it can attempt to trim one tuple at a
      time off a large tuplestore.  tuplestore_trim() had O(N^2) runtime in this
      situation because of repeatedly shifting its tuple pointer array.  Fix by
      avoiding shifting the array until a reasonably large number of tuples have
      been deleted.  This can waste some pointer space, but we do still reclaim
      the tuples themselves, so the percentage wastage should be pretty small.
      
      Per Jie Li's report of slow percent_rank() evaluation.  cume_dist() and
      ntile() would certainly be affected as well, along with any other window
      function that has a moving frame start and requires reading substantially
      ahead of the current row.
      
      Back-patch to 8.4, where window functions were introduced.  There's no
      need to tweak it before that.
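The deferred-shift idea in this commit can be sketched roughly as follows. This is a simplified illustration and not the tuplestore code: the `LazyStore` struct, the `lazy_trim` function, and the one-eighth threshold are assumptions chosen for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified store: an array of pointers to tuples. */
typedef struct LazyStore
{
    void  **tuples;   /* slots below 'deleted' have been freed and are NULL */
    size_t  deleted;  /* number of leading dead slots not yet shifted out */
    size_t  count;    /* slots in use, including the dead ones */
} LazyStore;

/* Discard tuples in slots [deleted, upto) of the current array. */
void
lazy_trim(LazyStore *s, size_t upto)
{
    size_t i;

    if (upto > s->count)
        upto = s->count;

    /* Always reclaim the tuples themselves right away. */
    for (i = s->deleted; i < upto; i++)
    {
        free(s->tuples[i]);
        s->tuples[i] = NULL;
    }
    if (upto > s->deleted)
        s->deleted = upto;

    /*
     * Shift the pointer array only once the dead prefix is at least one
     * eighth of the array (assumed threshold).  Each shift then moves at
     * most seven live pointers per dead slot, so the total shifting cost
     * stays linear in the number of tuples trimmed.
     */
    if (s->deleted > 0 && s->deleted * 8 >= s->count)
    {
        memmove(s->tuples, s->tuples + s->deleted,
                (s->count - s->deleted) * sizeof(void *));
        s->count -= s->deleted;
        s->deleted = 0;
    }
}
```

The cost of the delay is a few unused pointer slots at the front of the array; the tuple memory itself is released on every call.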
  9. 09 Dec, 2010 4 commits
    • Eliminate O(N^2) behavior in parallel restore with many blobs. · 663fc32e
      Tom Lane authored
      With hundreds of thousands of TOC entries, the repeated searches in
      reduce_dependencies() become the dominant cost.  Get rid of that searching
      by constructing reverse-dependency lists, which we can do in O(N) time
      during the fix_dependencies() preprocessing.  I chose to store the reverse
      dependencies as DumpId arrays for consistency with the forward-dependency
      representation, and keep the previously-transient tocsByDumpId[] array
      around to locate actual TOC entry structs quickly from dump IDs.
      
      While this fixes the slow case reported by Vlad Arkhipov, there is still
      a potential for O(N^2) behavior with sufficiently many tables:
      fix_dependencies itself, as well as mark_create_done and
      inhibit_data_for_failed_table, are doing repeated searches to deal with
      table-to-table-data dependencies.  Possibly this work could be extended
      to deal with that, although the latter two functions are also used in
      non-parallel restore where we currently don't run fix_dependencies.
      
      Another TODO is that we fail to parallelize restore of multiple blobs
      at all.  This appears to require changes in the archive format to fix.
      
      Back-patch to 9.0 where the problem was reported.  8.4 has potential issues
      as well; but since it doesn't create a separate TOC entry for each blob,
      it's at much less risk of having enough TOC entries to cause real problems.
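A rough sketch of building reverse-dependency arrays in linear time, in the spirit of the approach described above. The `TocEntry` struct here is a cut-down stand-in, and `build_reverse_deps` and `mark_done` are hypothetical names; the sketch assumes dump IDs run from 1 to `maxId` and that `byId[id]` is NULL for unused IDs.

```c
#include <stdlib.h>

typedef int DumpId;

/* Hypothetical, simplified TOC entry. */
typedef struct TocEntry
{
    DumpId  dumpId;
    DumpId *dependencies;   /* entries this one depends on */
    int     nDeps;
    DumpId *revDeps;        /* entries that depend on this one */
    int     nRevDeps;
    int     depCount;       /* dependencies not yet satisfied */
} TocEntry;

static TocEntry *
lookup(TocEntry **byId, DumpId maxId, DumpId id)
{
    return (id >= 1 && id <= maxId) ? byId[id] : NULL;
}

/* Build revDeps for every entry in three linear passes; 0 on success. */
int
build_reverse_deps(TocEntry **byId, DumpId maxId)
{
    DumpId id;
    int    i;

    for (id = 1; id <= maxId; id++)
        if (byId[id])
        {
            byId[id]->revDeps = NULL;
            byId[id]->nRevDeps = 0;
            byId[id]->depCount = 0;
        }

    /* Pass 1: count how many entries point at each entry. */
    for (id = 1; id <= maxId; id++)
        for (i = 0; byId[id] && i < byId[id]->nDeps; i++)
        {
            TocEntry *dep = lookup(byId, maxId, byId[id]->dependencies[i]);

            if (dep)
            {
                dep->nRevDeps++;
                byId[id]->depCount++;
            }
        }

    /* Pass 2: allocate each array, then reset the count for refilling. */
    for (id = 1; id <= maxId; id++)
        if (byId[id] && byId[id]->nRevDeps > 0)
        {
            byId[id]->revDeps = malloc(byId[id]->nRevDeps * sizeof(DumpId));
            if (byId[id]->revDeps == NULL)
                return -1;
            byId[id]->nRevDeps = 0;
        }

    /* Pass 3: fill in the reverse links. */
    for (id = 1; id <= maxId; id++)
        for (i = 0; byId[id] && i < byId[id]->nDeps; i++)
        {
            TocEntry *dep = lookup(byId, maxId, byId[id]->dependencies[i]);

            if (dep)
                dep->revDeps[dep->nRevDeps++] = id;
        }
    return 0;
}

/* After 'done' is restored, touch only its dependents; no searching. */
int
mark_done(TocEntry **byId, const TocEntry *done)
{
    int i, ready = 0;

    for (i = 0; i < done->nRevDeps; i++)
        if (--byId[done->revDeps[i]]->depCount == 0)
            ready++;
    return ready;   /* number of entries that just became runnable */
}
```

With the reverse lists in place, finishing one entry costs time proportional to its number of dependents instead of a scan over every pending entry.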
    • Reduce spurious Hot Standby conflicts from never-visible records. · b9075a6d
      Simon Riggs authored
      Hot Standby conflicts only with tuples that were visible at
      some point. So ignore tuples from aborted transactions, and
      tuples updated/deleted during the inserting transaction, when
      generating the conflict transaction ids.
      
      Following detailed analysis and test case by Noah Misch.
      The original report covered btree delete records; Heikki
      Linnakangas correctly observed that this applies to other cases
      also. The fix covers all sources of cleanup records via common code.
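The rule in this commit can be illustrated with a small sketch. It is not the PostgreSQL implementation: `Xid`, `RemovedTuple`, and `advance_latest_removed` are hypothetical, the committed flag is assumed to be known in advance, and transaction ID wraparound is ignored so that plain integer comparison can be used.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;   /* hypothetical; wraparound is ignored here */

/* Hypothetical summary of a tuple that cleanup is about to remove. */
typedef struct RemovedTuple
{
    Xid  xmin;            /* inserting transaction */
    Xid  xmax;            /* updating or deleting transaction */
    bool xmin_committed;  /* false if the inserter aborted */
} RemovedTuple;

/* Advance *latest only for tuples that some snapshot could have seen. */
void
advance_latest_removed(const RemovedTuple *t, Xid *latest)
{
    if (!t->xmin_committed)
        return;     /* inserter aborted: the tuple was never visible */
    if (t->xmax == t->xmin)
        return;     /* removed by its own inserter: never visible to others */
    if (t->xmax > *latest)
        *latest = t->xmax;
}
```

Only the third case raises the conflict horizon, so standby queries are not cancelled on account of tuples they could never have seen.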
    • Force default wal_sync_method to be fdatasync on Linux. · 576477e7
      Tom Lane authored
      Recent versions of the Linux system header files cause xlogdefs.h to
      believe that open_datasync should be the default sync method, whereas
      formerly fdatasync was the default on Linux.  open_datasync is a bad
      choice, first because it doesn't actually outperform fdatasync (in fact
      the reverse), and second because we try to use O_DIRECT with it, causing
      failures on certain filesystems (e.g., ext4 with data=journal option).
      This part of the patch is largely per a proposal from Marti Raudsepp.
      More extensive changes are likely to follow in HEAD, but this is as much
      change as we want to back-patch.
      
      Also clean up confusing code and incorrect documentation surrounding the
      fsync_writethrough option.  Those changes shouldn't result in any actual
      behavioral change, but I chose to back-patch them anyway to keep the
      branches looking similar in this area.
      
      In 9.0 and HEAD, also do some copy-editing on the WAL Reliability
      documentation section.
      
      Back-patch to all supported branches, since any of them might get used
      on modern Linux versions.
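For readers unfamiliar with the two sync methods, the fdatasync approach amounts to an ordinary write followed by an explicit flush call, whereas open_datasync opens the file with O_DSYNC so that each write is synchronous. A minimal sketch of the former, assuming a POSIX system; `write_and_flush` is a hypothetical helper and not PostgreSQL code.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Append a buffer to a file and force the data to stable storage with
 * fdatasync().  Returns 0 on success, -1 on any error.
 */
int
write_and_flush(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;

    /* Plain buffered write; no O_DSYNC or O_DIRECT on the descriptor. */
    if (write(fd, buf, len) != (ssize_t) len || fdatasync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Because the descriptor is opened without O_DIRECT, this path avoids the filesystem-specific failures the commit message mentions for open_datasync.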
  10. 08 Dec, 2010 1 commit