1. 24 Mar, 2019 2 commits
    • Remove spurious return. · b2db2770
      Andres Freund authored
      Per buildfarm member anole.
      
      Author: Andres Freund
    • tableam: Add tuple_{insert, delete, update, lock} and use. · 5db6df0c
      Andres Freund authored
      This adds new, required, table AM callbacks for insert/delete/update
      and lock_tuple. To be able to reasonably use those, the EvalPlanQual
      mechanism had to be adapted, moving more logic into the AM.
      
      Previously both delete/update/lock call-sites and the EPQ mechanism had
      to have awareness of the specific tuple format to be able to fetch the
      latest version of a tuple. Obviously that needs to be abstracted
      away. To do so, move the logic that finds the latest row version into
      the AM. lock_tuple has a new flag argument,
      TUPLE_LOCK_FLAG_FIND_LAST_VERSION, that forces it to lock the last
      version, rather than the current one.  It'd have been possible to do
      so via a separate callback as well, but finding the last version
      usually also necessitates locking the newest version, making it
      sensible to combine the two. This replaces the previous use of
      EvalPlanQualFetch().  Additionally HeapTupleUpdated, which previously
      signaled either a concurrent update or delete, is now split into two,
      to avoid callers needing AM specific knowledge to differentiate.
      
      The move of finding the latest row version into tuple_lock means that
      encountering a row concurrently moved into another partition will now
      raise an error about "tuple to be locked" rather than "tuple to be
      updated/deleted" - which is accurate, as that always happens when
      locking rows. While possibly slightly less helpful for users, it seems
      like an acceptable trade-off.
      
      As part of this commit HTSU_Result has been renamed to TM_Result, and
      its members have been expanded to differentiate between updating and
      deleting. HeapUpdateFailureData has been renamed to TM_FailureData.
      
      The interface to speculative insertion is changed so nodeModifyTable.c
      does not have to set the speculative token itself anymore. Instead
      there's a version of tuple_insert, tuple_insert_speculative, that
      performs the speculative insertion (without requiring a flag to signal
      that fact), and the speculative insertion is either made permanent
      with table_complete_speculative(succeeded = true) or aborted with
      table_complete_speculative(succeeded = false).
      
      Note that multi_insert is not yet routed through tableam, nor is
      COPY. Changing multi_insert requires changes to copy.c that are large
      enough to better be done separately.
      
      Similarly, although simpler, CREATE TABLE AS and CREATE MATERIALIZED
      VIEW are also only going to be adjusted in a later commit.
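      The resulting division of labor can be illustrated with a toy model
      (Python; the names mirror the new C callbacks, but everything here is
      invented for illustration and is not the actual tableam code):

```python
from enum import Enum, auto

class TM_Result(Enum):
    # Toy analogue of the renamed HTSU_Result; concurrent update and
    # delete are now distinct outcomes, so callers need no AM-specific
    # knowledge to tell them apart.
    TM_Ok = auto()
    TM_Updated = auto()   # a newer version of the row exists
    TM_Deleted = auto()   # the row is gone; there is no newer version

TUPLE_LOCK_FLAG_FIND_LAST_VERSION = 0x01

def tuple_lock(table, tid, flags):
    """Lock the tuple at tid; with FIND_LAST_VERSION set, chase the
    update chain inside the AM and lock the newest version instead."""
    row = table[tid]
    if row["next"] is None and not row["deleted"]:
        return TM_Result.TM_Ok, tid
    if flags & TUPLE_LOCK_FLAG_FIND_LAST_VERSION:
        while row["next"] is not None:        # follow the update chain
            tid = row["next"]
            row = table[tid]
        if row["deleted"]:
            return TM_Result.TM_Deleted, None
        return TM_Result.TM_Ok, tid
    # Without the flag, just report what happened to this version.
    if row["deleted"]:
        return TM_Result.TM_Deleted, None
    return TM_Result.TM_Updated, row["next"]

# tid 0 was concurrently updated, producing version tid 1.
updated = {0: {"next": 1, "deleted": False},
           1: {"next": None, "deleted": False}}
# tid 0 was concurrently deleted.
deleted = {0: {"next": None, "deleted": True}}
```

      In this picture an EPQ recheck corresponds to locking with the flag
      set: the AM, not the executor, walks to the latest row version.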
      
      Author: Andres Freund and Haribabu Kommi
      Discussion:
          https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
          https://postgr.es/m/20190313003903.nwvrxi7rw3ywhdel@alap3.anarazel.de
          https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
  2. 23 Mar, 2019 8 commits
    • Remove inadequate check for duplicate "xml" PI. · f778e537
      Tom Lane authored
      I failed to think about PIs starting with "xml".  We don't really
      need this check at all, so just take it out.  Oversight in
      commit 8d1dadb2 et al.
    • Ensure xmloption = content while restoring pg_dump output. · 4870dce3
      Tom Lane authored
      In combination with the previous commit, this ensures that valid XML
      data can always be dumped and reloaded, whether it is "document"
      or "content".
      
      Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
    • Accept XML documents when xmloption = content, as required by SQL:2006+. · 8d1dadb2
      Tom Lane authored
      Previously we were using the SQL:2003 definition, which doesn't allow
      this, but that creates a serious dump/restore gotcha: there is no
      setting of xmloption that will allow all valid XML data.  Hence,
      switch to the 2006 definition.
      
      Since libxml doesn't accept <!DOCTYPE> directives in the mode we
      use for CONTENT parsing, the implementation is to detect <!DOCTYPE>
      in the input and switch to DOCUMENT parsing mode.  This should not
      cost much, because <!DOCTYPE> should be close to the front of the
      input if it's there at all.  It's possible that this causes the
      error messages for malformed input to be slightly different than
      they were before, if said input includes <!DOCTYPE>; but that does
      not seem like a big problem.
      
      In passing, buy back a few cycles in parsing of large XML documents
      by not doing strlen() of the whole input in parse_xml_decl().
      
      Back-patch because dump/restore failures are not nice.  This change
      shouldn't break any cases that worked before, so it seems safe to
      back-patch.
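      The switching logic can be sketched like so (Python; a rough
      illustration of the approach only, not the byte-level C
      implementation, which is more careful about what may legally
      precede a DOCTYPE):

```python
import re

def choose_parse_mode(xml_input: str) -> str:
    """Pick the libxml parse mode for xmloption = content: if a
    <!DOCTYPE appears near the front of the input, use DOCUMENT
    parsing, since the CONTENT parser rejects DOCTYPE declarations."""
    # Skip an optional XML declaration and surrounding whitespace;
    # a DOCTYPE must precede the root element if present at all.
    m = re.match(r'\s*(?:<\?xml\b.*?\?>)?\s*', xml_input, re.DOTALL)
    rest = xml_input[m.end():]
    return "DOCUMENT" if rest.upper().startswith("<!DOCTYPE") else "CONTENT"
```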
      
      Chapman Flack (revised a bit by me)
      
      Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
    • Suppress DETAIL output from an event_trigger test. · 05f110cc
      Peter Geoghegan authored
      Suppress 3 lines of unstable DETAIL output from a DROP ROLE statement in
      event_trigger.sql.  This is further cleanup for commit dd299df8.
      
      Note that the event_trigger test instability issue is very similar to
      the recently suppressed foreign_data test instability issue.  Both
      issues involve DETAIL output for a DROP ROLE statement that needed to be
      changed as part of dd299df8.
      
      Per buildfarm member macaque.
    • Add nbtree high key "continuescan" optimization. · 29b64d1d
      Peter Geoghegan authored
      Teach nbtree forward index scans to check the high key before moving to
      the right sibling page in the hope of finding that it isn't actually
      necessary to do so.  The new check may indicate that the scan definitely
      cannot find matching tuples to the right, ending the scan immediately.
      We already opportunistically force a similar "continuescan oriented"
      key check of the final non-pivot tuple when it's clear that it cannot be
      returned to the scan due to being dead-to-all.  The new high key check
      is complementary.
      
      The new approach for forward scans is more effective than checking the
      final non-pivot tuple, especially with composite indexes and non-unique
      indexes.  The improvements to the logic for picking a split point added
      by commit fab25024 make it likely that relatively dissimilar high keys
      will appear on a page.  A distinguishing key value that can only appear
      on non-pivot tuples on the right sibling page will often be present in
      leaf page high keys.
      
      Since forcing the final item to be key checked no longer makes any
      difference in the case of forward scans, the existing extra key check is
      now only used for backwards scans.  Backward scans continue to
      opportunistically check the final non-pivot tuple, which is actually the
      first non-pivot tuple on the page (not the last).
      
      Note that even pg_upgrade'd v3 indexes make use of this optimization.
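      A toy model of the new check (Python; real nbtree scans go through
      the _bt_checkkeys() machinery and real pages are byte-addressed, but
      the control flow is the same in spirit):

```python
def forward_scan(pages, high_keys, upper_bound):
    """Toy forward index scan.  pages[i] is a sorted list of keys;
    high_keys[i] is page i's high key (None on the rightmost page),
    an upper bound for the page that is <= every key to its right.
    Returns the matching keys and the index of the last page visited."""
    result = []
    for i, page in enumerate(pages):
        result.extend(k for k in page if k <= upper_bound)
        hk = high_keys[i]
        # The new check: if the high key already exceeds the scan's
        # upper bound, nothing to the right can match -- end the scan
        # without stepping to the right sibling.
        if hk is not None and hk > upper_bound:
            return result, i
    return result, len(pages) - 1

pages = [[1, 2, 3], [5, 6, 7], [9, 10]]
high_keys = [5, 9, None]
```

      With upper_bound = 4 the scan ends on the first page; without the
      high key check it would have had to visit the second page just to
      discover that nothing there matches.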
      
      Author: Peter Geoghegan, Heikki Linnakangas
      Reviewed-By: Heikki Linnakangas
      Discussion: https://postgr.es/m/CAH2-WzkOmUduME31QnuTFpimejuQoiZ-HOf0pOWeFZNhTMctvA@mail.gmail.com
    • Improve format of code and some error messages in pg_checksums · 4ba96d1b
      Michael Paquier authored
      This makes the code more consistent with the surroundings.
      
      Author: Fabrízio de Royes Mello
      Discussion: https://postgr.es/m/CAFcNs+pXb_35r5feMU3-dWsWxXU=Yjq+spUsthFyGFbT0QcaKg@mail.gmail.com
    • Add unreachable "break" to satisfy -Wimplicit-fallthrough. · fb50d3f0
      Tom Lane authored
      gcc is a bit pickier about this than perhaps it should be.
      
      Discussion: https://postgr.es/m/E1h6zzT-0003ft-DD@gemulon.postgresql.org
    • Expand EPQ tests for UPDATEs and DELETEs · cdcffe22
      Andres Freund authored
      Previously there was basically no coverage for UPDATEs encountering
      deleted rows, and no coverage for DELETE having to perform EPQ. That's
      problematic for an upcoming commit in which EPQ is taught to integrate
      with tableams.  Also, there was no test for UPDATE to encounter a row
      UPDATEd into another partition.
      
      Author: Andres Freund
  3. 22 Mar, 2019 18 commits
  4. 21 Mar, 2019 5 commits
    • Revert "Suppress DETAIL output from a foreign_data test." · fff518d0
      Peter Geoghegan authored
      This should be superseded by commit 8aa9dd74.
    • Fix dependency recording bug for partitioned PKs · 7e7c57bb
      Alvaro Herrera authored
      When DefineIndex recurses to create constraints on partitions, it needs
      to use the value returned by index_constraint_create to set up partition
      dependencies.  However, in the course of fixing the DEPENDENCY_INTERNAL_AUTO
      mess, commit 1d92a0c9 introduced some code to that function that
      clobbered the return value, causing the recorded OID to be of the wrong
      object.  Close examination of pg_depend after creating the tables leads
      to indescribable objects :-( My sin (in commit bdc3d7fa, while
      preparing for DDL deparsing in event triggers) was to use a variable
      name for the return value that's typically used for throwaway objects in
      dependency-setting calls ("referenced").  Fix by changing the variable
      names to match extended practice (the return value is "myself" rather
      than "referenced").
      
      The pg_upgrade test notices the problem (in an indirect way: the pg_dump
      outputs are in different order), but only if you create the objects in a
      specific way that wasn't being used in the existing tests.  Add a stanza
      to leave some objects around that shows the bug.
      
      Catversion bump because preexisting databases might have bogus pg_depend
      entries.
      
      Discussion: https://postgr.es/m/20190318204235.GA30360@alvherre.pgsql
    • Improve error reporting for DROP FUNCTION/PROCEDURE/AGGREGATE/ROUTINE. · bfb456c1
      Tom Lane authored
      These commands allow the argument type list to be omitted if there is
      just one object that matches by name.  However, if that syntax was
      used with DROP IF EXISTS and there was more than one match, you got
      a "function ... does not exist, skipping" notice message rather than a
      truthful complaint about the ambiguity.  This was basically due to
      poor factorization and a rat's nest of logic, so refactor the relevant
      lookup code to make it cleaner.
      
      Note that this amounts to narrowing the scope of which sorts of
      error conditions IF EXISTS will bypass.  Per discussion, we only
      intend it to skip no-such-object cases, not multiple-possible-matches
      cases.
      
      Per bug #15572 from Ash Marath.  Although this definitely seems like
      a bug, it's not clear that people would thank us for changing the
      behavior in minor releases, so no back-patch.
      
      David Rowley, reviewed by Julien Rouhaud and Pavel Stehule
      
      Discussion: https://postgr.es/m/15572-ed1b9ed09503de8a@postgresql.org
    • Add DNS SRV support for LDAP server discovery. · 0f086f84
      Thomas Munro authored
      LDAP servers can be advertised on a network with RFC 2782 DNS SRV
      records.  The OpenLDAP command-line tools automatically try to find
      servers that way, if no server name is provided by the user.  Teach
      PostgreSQL to do the same using OpenLDAP's support functions, when
      building with OpenLDAP.
      
      For now, we assume that HAVE_LDAP_INITIALIZE (an OpenLDAP extension
      available since OpenLDAP 2.0 and also present in Apple LDAP) implies
      that you also have ldap_domain2hostlist() (which arrived in the same
      OpenLDAP version and is also present in Apple LDAP).
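      The host-list construction can be approximated as follows (Python;
      the record tuples and function name are invented for illustration --
      the real code simply calls OpenLDAP's ldap_domain2hostlist(), and
      proper RFC 2782 selection randomizes among equal-priority records by
      weight rather than sorting deterministically):

```python
def srv_to_hostlist(records):
    """Turn DNS SRV records -- (priority, weight, port, target) tuples
    per RFC 2782 -- into a space-separated "host:port" string, lowest
    priority first, higher weight first within equal priorities."""
    ordered = sorted(records, key=lambda r: (r[0], -r[1]))
    return " ".join(f"{target}:{port}" for _, _, port, target in ordered)

records = [
    (10, 50, 389, "ldap2.example.com"),
    (5, 0, 636, "ldap1.example.com"),
]
```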
      
      Author: Thomas Munro
      Reviewed-by: Daniel Gustafsson
      Discussion: https://postgr.es/m/CAEepm=2hAnSfhdsd6vXsM6VZVN0br-FbAZ-O+Swk18S5HkCP=A@mail.gmail.com
  5. 20 Mar, 2019 7 commits
    • Sort the dependent objects before deletion in DROP OWNED BY. · 8aa9dd74
      Tom Lane authored
      This finishes a task we left undone in commit f1ad067f, by extending
      the delete-in-descending-OID-order rule to deletions triggered by
      DROP OWNED BY.  We've coped with machine-dependent deletion orders
      one time too many, and the new issues caused by Peter G's recent
      nbtree hacking seem like the last straw.
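      The rule itself is simple; a sketch with toy objects (Python --
      illustrative only, not the deletion machinery in dependency.c):

```python
def deletion_order(objects):
    """Sort dependent objects by descending OID before deletion.  OIDs
    are assigned in creation order, so this drops the newest objects
    first -- a machine-independent order in which dependents (created
    after the objects they depend on) tend to go before the objects
    they reference."""
    return sorted(objects, key=lambda obj: obj["oid"], reverse=True)

owned = [{"oid": 16384, "name": "tab"},
         {"oid": 16390, "name": "tab_idx"},
         {"oid": 16387, "name": "tab_seq"}]
```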
      
      Discussion: https://postgr.es/m/E1h6eep-0001Mw-Vd@gemulon.postgresql.org
    • Add index_get_partition convenience function · a6da0047
      Alvaro Herrera authored
      This new function simplifies some existing coding, as well as supports
      future patches.
      
      Discussion: https://postgr.es/m/201901222145.t6wws6t6vrcu@alvherre.pgsql
      Reviewed-by: Amit Langote, Jesper Pedersen
    • Fix spurious compiler warning in nbtxlog.c. · 3d0dcc5c
      Peter Geoghegan authored
      Cleanup from commit dd299df8.
      
      Per complaint from Tom Lane.
    • Suppress DETAIL output from a foreign_data test. · 7d3bf73a
      Peter Geoghegan authored
      Unstable sort order related to changes to nbtree from commit dd299df8
      can cause two lines of DETAIL output to be in opposite-of-expected
      order.  Suppress the output using the same VERBOSITY hack that is used
      elsewhere in the foreign_data tests.
      
      Note that the same foreign_data.out DETAIL output was mechanically
      updated by commit dd299df8.  Only a few such changes were required,
      though.
      
      Per buildfarm member batfish.
      
      Discussion: https://postgr.es/m/CAH2-WzkCQ_MtKeOpzozj7QhhgP1unXsK8o9DMAFvDqQFEPpkYQ@mail.gmail.com
    • Restore RI trigger sanity check · 815b20ae
      Alvaro Herrera authored
      I unnecessarily removed this check in 3de241db because I
      misunderstood what the final representation of constraints across a
      partitioning hierarchy was to be.  Put it back (in both branches).
      
      Discussion: https://postgr.es/m/201901222145.t6wws6t6vrcu@alvherre.pgsql
    • Allow amcheck to re-find tuples using new search. · c1afd175
      Peter Geoghegan authored
      Teach contrib/amcheck's bt_index_parent_check() function to take
      advantage of the uniqueness property of heapkeyspace indexes in support
      of a new verification option: non-pivot tuples (non-highkey tuples on
      the leaf level) can optionally be re-found using a new search for each,
      that starts from the root page.  If a tuple cannot be re-found, report
      that the index is corrupt.
      
      The new "rootdescend" verification option is exhaustive, and can
      therefore make a call to bt_index_parent_check() take a lot longer.
      Re-finding tuples during verification is mostly intended as an option
      for backend developers, since the corruption scenarios that it alone is
      uniquely capable of detecting seem fairly far-fetched.
      
      For example, "rootdescend" verification is much more likely to detect
      corruption of the least significant byte of a key from a pivot tuple in
      the root page of a B-Tree that already has at least three levels.
      Typically, only a few tuples on a cousin leaf page are at risk of
      "getting overlooked" by index scans in this scenario.  The corrupt key
      in the root page is only slightly corrupt: corrupt enough to give wrong
      answers to some queries, and yet not corrupt enough to allow the problem
      to be detected without verifying agreement between the leaf page and the
      root page, skipping at least one internal page level.  The existing
      bt_index_parent_check() checks never cross more than a single level.
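      A toy two-level illustration of the idea (Python; real amcheck
      descends with the nbtree search code and compares index tuples, but
      the corruption scenario has the same shape):

```python
import bisect

def descend(root, key):
    """Fresh root-to-leaf search: root["seps"] holds separator keys,
    root["children"] the leaf pages (sorted key lists)."""
    return root["children"][bisect.bisect_right(root["seps"], key)]

def rootdescend_check(root):
    """'rootdescend' pass: every leaf tuple must be re-findable by a
    new search starting at the root; report the ones that are not."""
    return [key
            for leaf in root["children"]
            for key in leaf
            if key not in descend(root, key)]

good = {"seps": [10, 20], "children": [[1, 5], [10, 15], [20, 25]]}
# A slightly corrupt separator (10 became 16) makes fresh searches
# overlook the keys 10 and 15 on the middle leaf, while many other
# queries still work -- the scenario described above.
bad = {"seps": [16, 20], "children": [[1, 5], [10, 15], [20, 25]]}
```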
      
      Author: Peter Geoghegan
      Reviewed-By: Heikki Linnakangas
      Discussion: https://postgr.es/m/CAH2-Wz=yTWnVu+HeHGKb2AGiADL9eprn-cKYAto4MkKOuiGtRQ@mail.gmail.com
    • Consider secondary factors during nbtree splits. · fab25024
      Peter Geoghegan authored
      Teach nbtree to give some consideration to how "distinguishing"
      candidate leaf page split points are.  This should not noticeably affect
      the balance of free space within each half of the split, while still
      making suffix truncation truncate away significantly more attributes on
      average.
      
      The logic for choosing a leaf split point now uses a fallback mode in
      the case where the page is full of duplicates and it isn't possible to
      find even a minimally distinguishing split point.  When the page is full
      of duplicates, the split should pack the left half very tightly, while
      leaving the right half mostly empty.  Our assumption is that logical
      duplicates will almost always be inserted in ascending heap TID order
      with v4 indexes.  This strategy leaves most of the free space on the
      half of the split that will likely be where future logical duplicates of
      the same value need to be placed.
      
      The number of cycles added is not very noticeable.  This is important
      because deciding on a split point takes place while at least one
      exclusive buffer lock is held.  We avoid using authoritative insertion
      scankey comparisons to save cycles, unlike suffix truncation proper.  We
      use a faster binary comparison instead.
      
      Note that even pg_upgrade'd v3 indexes make use of these optimizations.
      Benchmarking has shown that even v3 indexes benefit, despite the fact
      that suffix truncation will only truncate non-key attributes in INCLUDE
      indexes.  Grouping relatively similar tuples together is beneficial in
      and of itself, since it reduces the number of leaf pages that must be
      accessed by subsequent index scans.
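      The heuristic can be caricatured as follows (Python; the real split
      logic weighs space balance across many candidate intervals -- this
      toy only scores a small window around the midpoint):

```python
def choose_split(items, slack=2):
    """Toy split-point chooser for a sorted leaf page of composite
    keys.  Among candidate offsets within `slack` of the midpoint,
    pick the one whose adjoining tuples differ in the earliest
    attribute, so suffix truncation can strip more of the high key.
    If the page is all duplicates, fall back to packing the left
    half tightly, leaving the right half mostly empty."""
    if items[0] == items[-1]:                  # fallback: all duplicates
        return len(items) - 1
    mid = len(items) // 2
    candidates = range(max(1, mid - slack), min(len(items), mid + slack + 1))

    def first_diff(split):
        left, right = items[split - 1], items[split]
        for attr, (a, b) in enumerate(zip(left, right)):
            if a != b:
                return attr                    # earlier difference wins
        return len(left)

    return min(candidates, key=first_diff)

# (a, b) composite keys: offset 3 splits between the a=1 and a=2
# groups, so the new high key can be truncated after attribute a.
page = [(1, 7), (1, 8), (1, 9), (2, 1), (2, 2), (2, 3)]
```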
      
      Author: Peter Geoghegan
      Reviewed-By: Heikki Linnakangas
      Discussion: https://postgr.es/m/CAH2-WzmmoLNQOj9mAD78iQHfWLJDszHEDrAzGTUMG3mVh5xWPw@mail.gmail.com