1. 01 May, 2020 4 commits
    • Peter Geoghegan's avatar
      Fix undercounting in VACUUM VERBOSE output. · 73a076b0
      Peter Geoghegan authored
      The logic for determining how many nbtree pages in an index are deleted
      pages sometimes undercounted pages.  Pages that were deleted by the
      current VACUUM operation (as opposed to some previous VACUUM operation
      whose deleted pages have yet to be reused) were sometimes overlooked.
      The final count is exposed to users through VACUUM VERBOSE's "%u index
      pages have been deleted" output.
      
      btvacuumpage() avoided double-counting when _bt_pagedel() deleted more
      than one page by assuming that only one page was deleted, and that the
      additional deleted pages would get picked up during a future call to
      btvacuumpage() by the same VACUUM operation.  _bt_pagedel() can
      legitimately delete pages that the btvacuumscan() scan will not visit
      again, though, so that assumption was slightly faulty.
      
      Fix the accounting by teaching _bt_pagedel() about its caller's
      requirements.  It now only reports on pages that it knows btvacuumscan()
      won't visit again (including the current btvacuumpage() page), so
      everything works out in the end.
      
      This bug has been around forever.  Only backpatch to v11, though, to
      keep _bt_pagedel() is sync on the branches that have today's bugfix
      commit b0229f26.  Note that this commit changes the signature of
      _bt_pagedel(), just like commit b0229f26.
      
      Author: Peter Geoghegan
      Reviewed-By: Masahiko Sawada
      Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com
      Backpatch: 11-
      73a076b0
    • Peter Geoghegan's avatar
      Fix bug in nbtree VACUUM "skip full scan" feature. · b0229f26
      Peter Geoghegan authored
      Commit 857f9c36 (which taught nbtree VACUUM to skip a scan of the
      index from btcleanup in situations where it doesn't seem worth it) made
      VACUUM maintain the oldest btpo.xact among all deleted pages for the
      index as a whole.  It failed to handle all the details surrounding pages
      that are deleted by the current VACUUM operation correctly (though pages
      deleted by some previous VACUUM operation were processed correctly).
      
      The most immediate problem was that the special area of the page was
      examined without a buffer pin at one point.  More fundamentally, the
      handling failed to account for the full range of _bt_pagedel()
      behaviors.  For example, _bt_pagedel() sometimes deletes internal pages
      in passing, as part of deleting an entire subtree with btvacuumpage()
      caller's page as the leaf level page.  The original leaf page passed to
      _bt_pagedel() might not be the page that it deletes first in cases where
      deletion can take place.
      
      It's unclear how disruptive this bug may have been, or what symptoms
      users might want to look out for.  The issue was spotted during
      unrelated code review.
      
      To fix, push down the logic for maintaining the oldest btpo.xact to
      _bt_pagedel().  btvacuumpage() is now responsible for pages that were
      fully deleted by a previous VACUUM operation, while _bt_pagedel() is now
      responsible for pages that were deleted by the current VACUUM operation
      (this includes half-dead pages from a previous interrupted VACUUM
      operation that become fully deleted in _bt_pagedel()).  Note that
      _bt_pagedel() should never encounter an existing deleted page.
      
      This commit theoretically breaks the ABI of a stable release by changing
      the signature of _bt_pagedel().  However, if any third party extension
      is actually affected by this, then it must already be completely broken
      (since there are numerous assumptions made in _bt_pagedel() that cannot
      be met outside of VACUUM).  It seems highly unlikely that such an
      extension actually exists, in any case.
      
      Author: Peter Geoghegan
      Reviewed-By: Masahiko Sawada
      Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com
      Backpatch: 11-, where the "skip full scan" feature was introduced.
      b0229f26
    • Peter Eisentraut's avatar
    • Michael Paquier's avatar
      Improve various aspects of pg_rewind documentation · 78bad97f
      Michael Paquier authored
      The pg_rewind docs currently assert that the state of the target's
      data directory after rewind is equivalent to the source's data
      directory.  This clarifies the documentation to describe that the base
      state is further back in time and that the target's data directory will
      include the current state from the source of any copied blocks since the
      point of divergence.
      
      This commit also improves the section "How It Works":
      - Describe the update of the pg_control file.
      - Reorganize the list of files and directories ignored during the
      rewind.
      
      Author: James Coleman
      Discussion: https://postgr.es/m/CAAaqYe-sgqCos7MXF4XiY8rUPy3CEmaCY9EvfhX-DhPhPBF5_A@mail.gmail.com
      78bad97f
  2. 30 Apr, 2020 8 commits
  3. 29 Apr, 2020 4 commits
  4. 28 Apr, 2020 3 commits
  5. 27 Apr, 2020 6 commits
    • Michael Paquier's avatar
      Add more TAP coverage for archive status with crash recovery of standbys · ebf6de86
      Michael Paquier authored
      This part of the test was included originally in 4e87c483, but got
      reverted as of f9c1b8db because of timing issues in the test, where,
      after more analysis, we found that the standbys may not have recovered
      from the archives all the segments needed by the test.  This stabilizes
      the test by making sure that standbys replay up to the position returned
      by pg_switch_wal(), meaning that all segments are recovered before the
      manual checkpoint that tests WAL segment recycling and its interactions
      with archive status files.
      
      Author: Jehan-Guillaume de Rorthais
      Reviewed-by: Kyotaro Horiguchi, Michael Paquier
      Discussion: https://postgr.es/m/20200424034248.GL33034@paquier.xyz
      ebf6de86
    • Robert Haas's avatar
      Fix bogus tar-file padding logic for standby.signal. · 0278d3f7
      Robert Haas authored
      When pg_basebackup -R is used, we inject standby.signal into the
      tar file for the main tablespace. The proper thing to do is to pad
      each file injected into the tar file out to a 512-byte boundary
      by appending nulls, but here the file is of length 0 and we add
      511 zero bytes.  Since 0 is already a multiple of 512, we should
      not add any zero bytes. Do that instead.
      
      Patch by me, reviewed by Tom Lane.
      
      Discussion: http://postgr.es/m/CA+TgmobWbfReO9-XFk8urR1K4wTNwqoHx_v56t7=T8KaiEoKNw@mail.gmail.com
      0278d3f7
    • Tom Lane's avatar
      Fix full text search to handle NOT above a phrase search correctly. · e81e5741
      Tom Lane authored
      Queries such as '!(foo<->bar)' failed to find matching rows when
      implemented as a GiST or GIN index search.  That's because of
      failing to handle phrase searches as tri-valued when considering
      a query without any position information for the target tsvector.
      We can only say that the phrase operator might match, not that it
      does match; and therefore its NOT also might match.  The previous
      coding incorrectly inverted the approximate phrase result to
      decide that there was certainly no match.
      
      To fix, we need to make TS_phrase_execute return a real ternary result,
      and then bubble that up accurately in TS_execute.  As long as we have
      to do that anyway, we can simplify the baroque things TS_phrase_execute
      was doing internally to manage tri-valued searching with only a bool
      as explicit result.
      
      For now, I left the externally-visible result of TS_execute as a plain
      bool.  There do not appear to be any outside callers that need to
      distinguish a three-way result, given that they passed in a flag
      saying what to do in the absence of position data.  This might need
      to change someday, but we wouldn't want to back-patch such a change.
      
      Although tsginidx.c has its own TS_execute_ternary implementation for
      use at upper index levels, that sadly managed to get this case wrong
      as well :-(.  Fixing it is a lot easier fortunately.
      
      Per bug #16388 from Charles Offenbacher.  Back-patch to 9.6 where
      phrase search was introduced.
      
      Discussion: https://postgr.es/m/16388-98cffba38d0b7e6e@postgresql.org
      e81e5741
    • Tom Lane's avatar
      Doc: render &pi; more nicely in PDF output. · 5ac24755
      Tom Lane authored
      We need to select symbol font explicitly, or it comes out misaligned.
      
      Alexander Lakhin, Tom Lane
      
      Discussion: https://postgr.es/m/10598.1587928415@sss.pgh.pa.us
      5ac24755
    • Peter Eisentraut's avatar
      d51f704f
    • Michael Paquier's avatar
      Fix some typos · 641b76d9
      Michael Paquier authored
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com
      641b76d9
  6. 26 Apr, 2020 3 commits
  7. 25 Apr, 2020 5 commits
    • Peter Geoghegan's avatar
      Fix another minor page deletion buffer lock issue. · 7154aa16
      Peter Geoghegan authored
      Avoid accessing the leaf page's top parent tuple without a buffer lock
      held during the second phase of nbtree page deletion.  The old approach
      was safe, though only because VACUUM never drops its buffer pin (and
      because only VACUUM itself can modify a half-dead page).  Even still, it
      seems like a good idea to be strict here.  Tighten things up by copying
      the top parent page's block number to a local variable before releasing
      the buffer lock on the leaf page -- not after.
      
      This is a follow-up to commit fa7ff642, which fixed a similar issue in
      the first phase of nbtree page deletion.
      
      Update some related comments in passing.
      
      Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
      7154aa16
    • Peter Geoghegan's avatar
      Fix minor nbtree page deletion buffer lock issue. · fa7ff642
      Peter Geoghegan authored
      Avoid accessing the deletion target page's special area during nbtree
      page deletion at a point where there is no buffer lock held.  This issue
      was detected by a patch that extends Valgrind's memcheck tool to mark
      nbtree pages that are unsafe to access (due to not having a buffer lock
      or buffer pin) as NOACCESS.
      
      We do hold a buffer pin at this point, and only access the special area,
      so the old approach was safe.  Even still, it seems like a good idea to
      tighten up the rules in this area.  There is no reason to not simply
      insist on always holding a buffer lock (not just a pin) when accessing
      nbtree pages.
      
      Update some related comments in passing.
      
      Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com
      fa7ff642
    • Noah Misch's avatar
      In caught-up logical walsender, sleep only in WalSndWaitForWal(). · f246ea3b
      Noah Misch authored
      Before sleeping, WalSndWaitForWal() sends a keepalive if MyWalSnd->write
      < sentPtr.  When the latest physical LSN yields no logical replication
      messages (a common case), that keepalive elicits a reply.  Processing
      the reply updates pg_stat_replication.replay_lsn.  WalSndLoop() lacks
      that; when WalSndLoop() slept, replay_lsn advancement could stall until
      wal_receiver_status_interval elapsed.  This sometimes stalled
      src/test/subscription/t/001_rep_changes.pl for up to 10s.
      
      Reviewed by Fujii Masao and Michael Paquier.
      
      Discussion: https://postgr.es/m/20200418070142.GA1075445@rfd.leadboat.com
      f246ea3b
    • Noah Misch's avatar
      Revert "When WalSndCaughtUp, sleep only in WalSndWaitForWal()." · 72a3dc32
      Noah Misch authored
      This reverts commit 42168581.  It caused
      idle physical walsenders to busy-wait, as reported by Fujii Masao.
      
      Discussion: https://postgr.es/m/20200417054146.GA1061007@rfd.leadboat.com
      72a3dc32
    • Andrew Gierth's avatar
      Fix error case for CREATE ROLE ... IN ROLE. · d9a4cce2
      Andrew Gierth authored
      CreateRole() was passing a Value node, not a RoleSpec node, for the
      newly-created role name when adding the role as a member of existing
      roles for the IN ROLE syntax.
      
      This mistake went unnoticed because the node in question is used only
      for error messages and is not accessed on non-error paths.
      
      In older pg versions (such as 9.5 where this was found), this results
      in an "unexpected node type" error in place of the real error. That
      node type check was removed at some point, after which the code would
      accidentally fail to fail on 64-bit platforms (on which accessing the
      Value node as if it were a RoleSpec would be mostly harmless) or give
      an "unexpected role type" error on 32-bit platforms.
      
      Fix the code to pass the correct node type, and add an lfirst_node
      assertion just in case.
      
      Per report on irc from user m1chelangelo.
      
      Backpatch all the way, because this error has been around for a long
      time.
      d9a4cce2
  8. 24 Apr, 2020 7 commits
    • Tom Lane's avatar
      Update Windows timezone name list to include currently-known zones. · 6c5f9161
      Tom Lane authored
      Thanks to Juan José Santamaría Flecha.
      
      Discussion: https://postgr.es/m/5752.1587740484@sss.pgh.pa.us
      6c5f9161
    • Tom Lane's avatar
      Improve placement of "display name" comment in win32_tzmap[] entries. · bd8c5cee
      Tom Lane authored
      Sticking this comment at the end of the last line was a bad idea: it's
      not particularly readable, and it tempts pgindent to mess with line
      breaks within the comment, which in turn reveals that win32tzlist.pl's
      clean_displayname() does the wrong thing to clean up such line breaks.
      While that's not hard to fix, there's basically no excuse for this
      arrangement to begin with, especially since it makes the table layout
      needlessly vary across back branches with different pgindent rules.
      Let's just put the comment inside the braces, instead.
      
      This commit just moves and reformats the comments, and updates
      win32tzlist.pl to match; there's no actual data change.
      
      Per odd-looking results from Juan José Santamaría Flecha.
      Back-patch, since the point is to make win32_tzmap[] look the
      same in all supported branches again.
      
      Discussion: https://postgr.es/m/5752.1587740484@sss.pgh.pa.us
      bd8c5cee
    • Tom Lane's avatar
      Doc: update section 9.13 for new function table layout. · f8d3e2ab
      Tom Lane authored
      This includes the usual amount of editorial cleanup, such as
      correcting wrong or less-helpful-than-they-could-be examples.
      
      I moved the two tsvector-updating triggers into "9.28 Trigger
      Functions", which seems like a better home for them.  (I believe
      that section didn't exist when this text was originally written.)
      f8d3e2ab
    • Bruce Momjian's avatar
      git_changelog: use modern format for rel branch names in example · 395a9a12
      Bruce Momjian authored
      e.g., REL_12_STABLE
      395a9a12
    • Robert Haas's avatar
      Try to avoid compiler warnings in optimized builds. · 05021a2c
      Robert Haas authored
      Per report from Andres Freund, who also says that this fix
      works for him.
      
      Discussion: http://postgr.es/m/20200405193118.alprgmozhxcfabkw@alap3.anarazel.de
      05021a2c
    • Tom Lane's avatar
      Repair performance regression in information_schema.triggers view. · baf17ad9
      Tom Lane authored
      Commit 32ff2691 introduced use of rank() into the triggers view to
      calculate the spec-mandated action_order column.  As written, this
      prevents query constraints on the table-name column from being pushed
      below the window aggregate step.  That's bad for performance of this
      typical usage pattern, since the view now has to be evaluated for all
      tables not just the one(s) the user wants to see.  It's also the cause
      of some recent buildfarm failures, in which trying to evaluate the view
      rows for triggers in process of being dropped resulted in "cache lookup
      failed for function NNN" errors.  Those rows aren't of interest to the
      test script doing the query, but the filter that would eliminate them
      is being applied too late.  None of this happened before the rank()
      call was there, so it's a regression compared to v10 and before.
      
      We can improve matters by changing the rank() call so that instead of
      partitioning by OIDs, it partitions by nspname and relname, casting
      those to sql_identifier so that they match the respective view output
      columns exactly.  The planner has enough intelligence to know that
      constraints on partitioning columns are safe to push down, so this
      eliminates the performance problem and the regression test failure
      risk.  We could make the other partitioning columns match view outputs
      as well, but it'd be more complicated and the performance benefits
      are questionable.
      
      Side note: as this stands, the planner will push down constraints on
      event_object_table and trigger_schema, but not on event_object_schema,
      because it checks for ressortgroupref matches not expression
      equivalence.  That might be worth improving someday, but it's not
      necessary to fix the immediate concern.
      
      Back-patch to v11 where the rank() call was added.  Ordinarily we'd not
      change information_schema in released branches, but the test failure has
      been seen in v12 and presumably could happen in v11 as well, so we need
      to do this to keep the buildfarm happy.  The change is harmless so far
      as users are concerned.  Some might wish to apply it to existing
      installations if performance of this type of query is of concern,
      but those who don't are no worse off.
      
      I bumped catversion in HEAD as a pro forma matter (there's no
      catalog incompatibility that would really require a re-initdb).
      Obviously that can't be done in the back branches.
      
      Discussion: https://postgr.es/m/5891.1587594470@sss.pgh.pa.us
      baf17ad9
    • Tom Lane's avatar
      Update time zone data files to tzdata release 2020a. · 4cac3a49
      Tom Lane authored
      DST law changes in Morocco and the Canadian Yukon.
      Historical corrections for Shanghai.
      
      The America/Godthab zone is renamed to America/Nuuk to reflect
      current English usage; however, the old name remains available as a
      compatibility link.
      4cac3a49