1. 08 Apr, 2020 3 commits
      snapshot scalability: Move delayChkpt from PGXACT to PGPROC. · 75848bc7
      Andres Freund authored
      The goal of separating hotly accessed per-backend data from PGPROC
      into PGXACT is to make accesses fast (GetSnapshotData() in
      particular). But delayChkpt is not actually read frequently; only
      when starting a checkpoint. As it is frequently modified (multiple
      times in the course of a single transaction), storing it in the same
      cacheline as hotly accessed data unnecessarily dirties a contended
      cacheline.
      
      Therefore move delayChkpt to PGPROC.
      
      This is part of a larger series of patches intending to improve
      GetSnapshotData() scalability. It is committed and pushed separately,
      as it is independently beneficial (small but measurable win, limited
      by the other frequent modifications of PGXACT).
      
      Author: Andres Freund
      Reviewed-By: Robert Haas, Thomas Munro, David Rowley
      Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
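
      A minimal illustrative sketch of the hot/cold split described above, using
      invented struct and field names rather than the real PGPROC/PGXACT
      definitions: the frequently written flag lives in the larger per-backend
      struct, so toggling it does not dirty the densely packed array that the
      snapshot scan reads.

        #include <stdbool.h>
        #include <stdint.h>

        typedef struct HotXactSlot    /* stands in for PGXACT: one densely packed
                                       * entry per backend, scanned by every
                                       * GetSnapshotData()-style snapshot */
        {
            uint32_t xid;
            uint32_t xmin;
            uint8_t  vacuumFlags;
            uint8_t  nxids;
        } HotXactSlot;

        typedef struct BackendSlot    /* stands in for PGPROC: larger, colder data */
        {
            int  pid;
            bool delayChkpt;          /* modified several times per transaction;
                                       * kept here so those writes stay off the
                                       * cachelines the snapshot scan reads */
            /* ... many other rarely scanned fields ... */
        } BackendSlot;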
      Track SLRU page hits in SimpleLruReadPage_ReadOnly · 2b88fdde
      Tomas Vondra authored
      SLRU page hits were tracked only in SimpleLruReadPage, but that's not
      enough because we may hit the page in SimpleLruReadPage_ReadOnly, in
      which case we don't call SimpleLruReadPage at all.
      
      Reported-by: Kuntal Ghosh
      Discussion: https://postgr.es/m/20200119143707.gyinppnigokesjok@development
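
      A simplified sketch of the counting gap, with hypothetical names and a
      trivial buffer search standing in for the real SimpleLruReadPage() and
      SimpleLruReadPage_ReadOnly() logic: the read-only fast path returns on a
      hit without ever reaching the general path, so it must count the hit
      itself.

        #define NBUFFERS 16

        static int  buffered_page[NBUFFERS];   /* page number held by each slot */
        static long page_hits = 0;             /* the statistic being fixed */

        static int
        read_page(int pageno)
        {
            /* general path: searches the buffers and counts a hit, else "reads" */
            for (int slot = 0; slot < NBUFFERS; slot++)
                if (buffered_page[slot] == pageno)
                {
                    page_hits++;
                    return slot;
                }
            buffered_page[0] = pageno;         /* pretend we read it into slot 0 */
            return 0;
        }

        static int
        read_page_readonly(int pageno)
        {
            /* fast path: on a hit it returns without ever calling read_page(),
             * so the hit must be counted here -- the increment this commit adds */
            for (int slot = 0; slot < NBUFFERS; slot++)
                if (buffered_page[slot] == pageno)
                {
                    page_hits++;
                    return slot;
                }
            return read_page(pageno);
        }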
      Fix XLogReader FD leak that makes backends unusable after 2PC usage. · 91c40548
      Andres Freund authored
      Before the fix every 2PC commit/abort leaked a file descriptor. As the
      files are opened using BasicOpenFile(), that quickly leads to the
      backend running out of file descriptors.
      
      Once enough 2PC aborts/commits have leaked enough FDs, any IO in the
      backend will fail with "Too many open files", because
      BasicOpenFilePerm() will by then have caused every file known to fd.c
      to be closed.
      
      The leak causing the problem at hand is not new with 0dc8ead4, but is
      exacerbated by it. Previously most XLogPageReadCB
      callbacks used static variables to cache one open file, but after the
      commit the cache is private to each XLogReader instance. There never
      was infrastructure to close FDs at the time of XLogReaderFree, but the
      way XLogReader was used limited the leak to one FD.
      
      This commit just closes the FD during XLogReaderFree() if one is still
      stored in XLogReaderState.seg.ws_file. This may not be the right way to
      solve this medium/long term, but it at least unbreaks 2PC.
      
      Discussion: https://postgr.es/m/20200406025651.fpzdb5yyb7qyhqko@alap3.anarazel.de
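
      A hedged sketch of the shape of the fix, with invented types standing in
      for XLogReaderState: the free routine closes any file descriptor still
      cached in the reader instead of leaking it along with the struct.

        #include <stdlib.h>
        #include <unistd.h>

        typedef struct WalSegment
        {
            int ws_file;              /* FD of the currently open segment, or -1 */
        } WalSegment;

        typedef struct Reader         /* stands in for XLogReaderState */
        {
            WalSegment seg;
            /* ... decoding state ... */
        } Reader;

        static void
        reader_free(Reader *reader)
        {
            if (reader->seg.ws_file >= 0)
                close(reader->seg.ws_file);  /* previously this FD was simply leaked */
            free(reader);
        }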
  2. 07 Apr, 2020 14 commits
  3. 06 Apr, 2020 13 commits
  4. 05 Apr, 2020 4 commits
  5. 04 Apr, 2020 6 commits
      Add perl2host call missing from a new test file. · 70de4e95
      Noah Misch authored
      Oversight in today's commit c6b92041.
      Per buildfarm member jacana.
      
      Discussion: http://postgr.es/m/20200404223212.GC3442685@rfd.leadboat.com
      Remove bogus Assert, add some regression test cases showing why. · 07871d40
      Tom Lane authored
      Commit 77ec5aff added an assertion to enforce_generic_type_consistency
      that boils down to "if the function result is polymorphic, there must be
      at least one polymorphic argument".  This should be true for user-created
      functions, but there are built-in functions for which it's not true, as
      pointed out by Jaime Casanova.  Hence, go back to the old behavior of
      leaving the return type alone.  There's only a limited amount of stuff
      you can do with such a function result, but it does work to some extent;
      add some regression test cases to ensure we don't break that again.
      
      Discussion: https://postgr.es/m/CAJGNTeMbhtsCUZgJJ8h8XxAJbK7U2ipsX8wkHRtZRz-NieT8RA@mail.gmail.com
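
      A minimal sketch, with hypothetical names, of the condition the removed
      Assert effectively enforced; it holds for user-created functions but, as
      noted above, not for every built-in, which is why asserting it was wrong.

        #include <assert.h>
        #include <stdbool.h>

        /* hypothetical stand-in for the check removed from
         * enforce_generic_type_consistency() */
        static void
        check_polymorphic_result(bool result_is_polymorphic,
                                 const bool arg_is_polymorphic[], int nargs)
        {
            bool have_poly_arg = false;

            for (int i = 0; i < nargs; i++)
                have_poly_arg |= arg_is_polymorphic[i];

            /* the over-eager assertion: a polymorphic result "must" be paired
             * with a polymorphic argument -- untrue for some built-ins */
            assert(!result_is_polymorphic || have_poly_arg);
        }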
      Skip WAL for new relfilenodes, under wal_level=minimal. · c6b92041
      Noah Misch authored
      Until now, only selected bulk operations (e.g. COPY) did this.  If a
      given relfilenode received both a WAL-skipping COPY and a WAL-logged
      operation (e.g. INSERT), recovery could lose tuples from the COPY.  See
      src/backend/access/transam/README section "Skipping WAL for New
      RelFileNode" for the new coding rules.  Maintainers of table access
      methods should examine that section.
      
      To maintain data durability, just before commit, we choose between an
      fsync of the relfilenode and copying its contents to WAL.  A new GUC,
      wal_skip_threshold, guides that choice.  If this change slows a workload
      that creates small, permanent relfilenodes under wal_level=minimal, try
      adjusting wal_skip_threshold.  Users setting a timeout on COMMIT may
      need to adjust that timeout, and log_min_duration_statement analysis
      will reflect time consumption moving to COMMIT from commands like COPY.
      
      Internally, this requires a reliable determination of whether
      RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
      current relfilenode.  Introduce rd_firstRelfilenodeSubid.  Amend the
      specification of rd_createSubid such that the field is zero when a new
      rel has an old rd_node.  Make relcache.c retain entries for certain
      dropped relations until end of transaction.
      
      Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN.
      Future servers accept older WAL, so this bump is discretionary.
      
      Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
      Haas.  Heikki Linnakangas and Michael Paquier implemented earlier
      designs that materially clarified the problem.  Reviewed, in earlier
      designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
      Fujii Masao, and Simon Riggs.  Reported by Martijn van Oosterhout.
      
      Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
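
      A hedged sketch of the commit-time durability choice described above; the
      names and structure are invented for illustration rather than taken from
      the storage layer, and interpreting wal_skip_threshold in kilobytes is an
      assumption made here for the sketch.

        #include <stdint.h>

        /* wal_skip_threshold GUC, assumed here to be configured in kilobytes */
        static uint64_t wal_skip_threshold_kb = 2048;

        typedef struct PendingRelSync
        {
            const char *path;         /* relfilenode created in this transaction */
            uint64_t    size_bytes;   /* its size when the transaction commits */
        } PendingRelSync;

        /* placeholders for the two durability strategies */
        static void copy_rel_contents_to_wal(const PendingRelSync *rel) { (void) rel; }
        static void fsync_rel(const PendingRelSync *rel)                { (void) rel; }

        static void
        make_new_rel_durable(const PendingRelSync *rel)
        {
            if (rel->size_bytes <= wal_skip_threshold_kb * 1024)
                copy_rel_contents_to_wal(rel);   /* cheap for small relations */
            else
                fsync_rel(rel);                  /* avoids WAL-logging bulk data */
        }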
    • 552fcebf
      Peter Eisentraut authored
      Add infrastructure to track WAL usage. · df3b1814
      Amit Kapila authored
      This allows gathering the WAL generation statistics for each statement
      execution.  The three statistics that we collect are the number of WAL
      records, the number of full page writes and the amount of WAL bytes
      generated.
      
      This helps users with write-intensive workloads see the impact of I/O
      due to WAL.  This further enables us to see approximately what
      percentage of overall WAL is due to full page writes.
      
      In the future, we can extend this functionality to allow us to compute
      the exact amount of WAL data due to full page writes.
      
      This patch in itself is just an infrastructure to compute WAL usage data.
      The upcoming patches will expose this data via explain, auto_explain,
      pg_stat_statements and verbose (auto)vacuum output.
      
      Author: Kirill Bychik, Julien Rouhaud
      Reviewed-by: Dilip Kumar, Fujii Masao and Amit Kapila
      Discussion: https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4ZLxWS_Cza3kQQ@mail.gmail.com
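
      An illustrative sketch of the kind of counters this infrastructure
      carries and how per-statement usage can be accumulated; the field names
      are approximations for illustration, not necessarily the committed ones.

        #include <stdint.h>

        typedef struct WalUsageCounters
        {
            int64_t  wal_records;   /* number of WAL records generated */
            int64_t  wal_fpw;       /* number of full page writes */
            uint64_t wal_bytes;     /* total amount of WAL generated, in bytes */
        } WalUsageCounters;

        /* fold one statement's usage into a running total, e.g. for later
         * reporting via EXPLAIN or pg_stat_statements-style views */
        static void
        wal_usage_accumulate(WalUsageCounters *dst, const WalUsageCounters *add)
        {
            dst->wal_records += add->wal_records;
            dst->wal_fpw     += add->wal_fpw;
            dst->wal_bytes   += add->wal_bytes;
        }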
      Include chunk overhead in hash table entry size estimate. · 0588ee63
      Jeff Davis authored
      Don't try to be precise about it; just use a constant 16 bytes of
      chunk overhead. Being smarter would require knowing the memory context
      where the chunk will be allocated, which is not known by all callers.
      
      Discussion: https://postgr.es/m/20200325220936.il3ni2fj2j2b45y5@alap3.anarazel.de
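
      A minimal sketch of the estimate, with hypothetical names: a flat 16-byte
      allocator chunk overhead is added per entry rather than computing the
      exact overhead of a memory context the caller may not know.

        #include <stddef.h>

        #define CHUNK_OVERHEAD_ESTIMATE 16   /* constant overhead assumed per chunk */

        /* hypothetical helper: rough per-entry size used when sizing a hash table */
        static size_t
        hash_entry_size_estimate(size_t key_width, size_t extra_width)
        {
            return key_width + extra_width + CHUNK_OVERHEAD_ESTIMATE;
        }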