1. 02 Apr, 2018 2 commits
    • Peter Eisentraut's avatar
      Make be-secure-common.c more consistent for future SSL implementations · 2764d5dc
      Peter Eisentraut authored
      Recent commit 8a3d9425 has introduced be-secure-common.c, which is aimed
      at including backend-side APIs that can be used by any SSL
      implementation.  The purpose is similar to fe-secure-common.c for the
      frontend-side APIs.
      
      However, this has forgotten to include check_ssl_key_file_permissions()
      in the move, which causes a double dependency between be-secure.c and
      be-secure-openssl.c.
      
      Refactor the code in a more logical way.  This also puts into light an
      API which is usable by future SSL implementations for permissions on SSL
      key files.
      
      Author: Michael Paquier <michael@paquier.xyz>
      2764d5dc
    • Robert Haas's avatar
      postgres_fdw: Push down partition-wise aggregation. · 7e0d64c7
      Robert Haas authored
      Since commit 7012b132, postgres_fdw
      has been able to push down the toplevel aggregation operation to the
      remote server.  Commit e2f1eb0e made
      it possible to break down the toplevel aggregation into one
      aggregate per partition.  This commit lets postgres_fdw push down
      aggregation in that case just as it does at the top level.
      
      In order to make this work, this commit adds an additional argument
      to the GetForeignUpperPaths FDW API.  A matching argument is added
      to the signature for create_upper_paths_hook.  Third-party code using
      either of these will need to be updated.
      
      Also adjust create_foreignscan_plan() so that it picks up the correct
      set of relids in this case.
      
      Jeevan Chalke, reviewed by Ashutosh Bapat and by me and with some
      adjustments by me.  The larger patch series of which this patch is a
      part was also reviewed and tested by Antonin Houska, Rajkumar
      Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal
      Legrand, and Rafia Sabih.
      
      Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
      Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
      7e0d64c7
  2. 01 Apr, 2018 4 commits
    • Tom Lane's avatar
      0b11a674
    • Andres Freund's avatar
      Fix non-portable use of round(). · 686d399f
      Andres Freund authored
      round() is from C99.  Use rint() instead.  There are behavioral
      differences between round() and rint(), but they should not matter to
      the Bloom filter optimal_k() function.  We already assume POSIX
      behavior for rint(), so there is no question of rint() not using
      "rounds towards nearest" as its rounding mode.
      
      Cleanup from commit 51bc2717.
      
      Per buildfarm member thrips.
      
      Author: Peter Geoghegan
      Discussion: https://postgr.es/m/CAH2-Wzn76eCGUonARy-wrVtMHsf+4cvbK_oJAWTLfORTU5ki0w@mail.gmail.com
      686d399f
    • Andres Freund's avatar
      Add amcheck verification of heap relations belonging to btree indexes. · 7f563c09
      Andres Freund authored
      Add a new, optional, capability to bt_index_check() and
      bt_index_parent_check():  check that each heap tuple that should have an
      index entry does in fact have one.  The extra checking is performed at
      the end of the existing nbtree checks.
      
      This is implemented by using a Bloom filter data structure.  The
      implementation performs set membership tests within a callback (the same
      type of callback that each index AM registers for CREATE INDEX).  The
      Bloom filter is populated during the initial index verification scan.
      
      Reusing the CREATE INDEX infrastructure allows the new verification
      option to automatically benefit from the heap consistency checks that
      CREATE INDEX already performs.  CREATE INDEX does thorough sanity
      checking of HOT chains, so the new check actually manages to detect
      problems in heap-only tuples.
      
      Author: Peter Geoghegan
      Reviewed-By: Pavan Deolasee, Andres Freund
      Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
      7f563c09
    • Andres Freund's avatar
      Add Bloom filter implementation. · 51bc2717
      Andres Freund authored
      A Bloom filter is a space-efficient, probabilistic data structure that
      can be used to test set membership.  Callers will sometimes incur false
      positives, but never false negatives.  The rate of false positives is a
      function of the total number of elements and the amount of memory
      available for the Bloom filter.
      
      Two classic applications of Bloom filters are cache filtering, and data
      synchronization testing.  Any user of Bloom filters must accept the
      possibility of false positives as a cost worth paying for the benefit in
      space efficiency.
      
      This commit adds a test harness extension module, test_bloomfilter.  It
      can be used to get a sense of how the Bloom filter implementation
      performs under varying conditions.
      
      This is infrastructure for the upcoming "heapallindexed" amcheck patch,
      which verifies the consistency of a heap relation against one of its
      indexes.
      
      Author: Peter Geoghegan
      Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund
      Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
      51bc2717
  3. 31 Mar, 2018 8 commits
    • Andrew Dunstan's avatar
      Small cleanups in fast default code. · ed698643
      Andrew Dunstan authored
      Problems identified by Andres Freund and Haribabu Kommi
      ed698643
    • Tom Lane's avatar
      Fix assorted issues in parallel vacuumdb. · 94173d3e
      Tom Lane authored
      Avoid storing the result of PQsocket() in a pgsocket variable; it's
      declared as int, and the no-socket test is properly written as "x < 0"
      not "x == PGINVALID_SOCKET".  This accidentally had no bad effect
      because we never got to init_slot() with a bad connection, but it's
      still wrong.
      
      Actually, it seems like we should avoid storing the result for a long
      period at all.  The function's not so expensive that it's worth avoiding,
      and the existing coding technique here would fail if anyone tried to
      PQreset the connection during the life of the program.  Hence, just
      re-call PQsocket every time we construct a select(2) mask.
      
      Speaking of select(), GetIdleSlot imagined that it could compute the
      select mask once and continue to use it over multiple calls to
      select_loop(), which is pretty bogus since that would stomp on the
      mask on return.  This could only matter if the function's outer loop
      iterated more than once, which is unlikely (it'd take some connection
      receiving data, but not enough to complete its command).  But if it
      did happen, we'd acquire "tunnel vision" and stop watching the other
      connections for query termination, with the effect of losing parallelism.
      
      Another way in which GetIdleSlot could lose parallelism is that once
      PQisBusy returns false, it would lock in on that connection and do
      PQgetResult until that returns NULL; in some cases that could result
      in blocking.  (Perhaps this can never happen in vacuumdb due to the
      limited set of commands that it can issue, but I'm not quite sure
      of that, and even if true today it's not a future-proof assumption.)
      Refactor the code to do that properly, so that it risks blocking in
      PQgetResult only in cases where we need to wait anyway.
      
      Another loss-of-parallelism problem, which *is* easily demonstrable,
      is that any setup queries issued during prepare_vacuum_command() were
      always issued on the last-to-be-created connection, whether or not
      that was idle.  Long-running operations on that connection thus
      prevented issuance of additional operations on the other ones, except
      in the limited cases where no preparatory query was needed.  Instead,
      wait till we've identified a free connection and use that one.
      
      Also, avoid core dump due to undersized malloc request in the case
      that no tables are identified to be vacuumed.
      
      The bogus no-socket test was noted by CharSyam, the other problems
      identified in my own code review.  Back-patch to 9.5 where parallel
      vacuumdb was introduced.
      
      Discussion: https://postgr.es/m/CAMrLSE6etb33-192DTEUGkV-TsvEcxtBDxGWG1tgNOMnQHwgDA@mail.gmail.com
      94173d3e
    • Tom Lane's avatar
      Fix portability and translatability issues in commit 64f85894. · 5635c7aa
      Tom Lane authored
      Compilation failed for lack of an #ifdef on builds without
      pg_strong_random().  Also fix relevant error messages to meet
      project style guidelines.
      
      Fabien Coelho, further adjusted by me
      
      Discussion: https://postgr.es/m/32390.1522464534@sss.pgh.pa.us
      5635c7aa
    • Tom Lane's avatar
      Portability fix for commit 9a895462. · b0c90c85
      Tom Lane authored
      So far as I can find, NI_MAXHOST isn't actually required anywhere by
      POSIX.  Nonetheless, commit 9a895462 supposed that it could rely on
      having that symbol without any ceremony at all.  We do have a hack
      for providing it if the platform doesn't, in getaddrinfo.h, so fix
      the problem by #including that file.  Per buildfarm.
      b0c90c85
    • Andres Freund's avatar
      Remove PARTIAL_LINKING build mode. · a4ebbd27
      Andres Freund authored
      In 9956ddc1, ten years ago, the
      current objfile.txt based linking model was introduced.  It's time to
      retire the old SUBSYS.o based model.
      
      This primarily is pertinent because the bitcode files for LLVM based
      inlining are not produced when using PARTIAL_LINKING. It does not seem
      worth to fix PARTIAL_LINKING to support that.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20180121204356.d5oeu34jetqhmdv2@alap3.anarazel.de
      a4ebbd27
    • Tatsuo Ishii's avatar
      Fix bug with view locking code. · 1b26bd40
      Tatsuo Ishii authored
      LockViewRecurese() obtains view relation using heap_open() and passes
      it to get_view_query() to get view info. It immediately closes the
      relation then uses the returned view info by calling
      LockViewRecurse_walker().  Since get_view_query() returns a pointer
      within the relcache, the relcache should be kept until
      LockViewRecurse_walker() returns. Otherwise the relation could point
      to a garbage memory area.
      
      Fix is moving the heap_close() call after LockViewRecurse_walker().
      
      Problem reported by Tom Lane (buildfarm is unhappy, especially prion
      since it enables -DRELCACHE_FORCE_RELEASE cpp flag), fix by me.
      1b26bd40
    • Andres Freund's avatar
      Add SKIP_LOCKED option to RangeVarGetRelidExtended(). · 3e256e55
      Andres Freund authored
      This will be used for VACUUM (SKIP LOCKED).
      
      Author: Nathan Bossart
      Reviewed-By: Michael Paquier and Andres Freund
      Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de
      3e256e55
    • Andres Freund's avatar
      Combine options for RangeVarGetRelidExtended() into a flags argument. · d87510a5
      Andres Freund authored
      A followup patch will add a SKIP_LOCKED option. To avoid introducing
      evermore arguments, breaking existing callers each time, introduce a
      flags argument. This'll no doubt break a few external users...
      
      Also change the MISSING_OK behaviour so a DEBUG1 debug message is
      emitted when a relation is not found.
      
      Author: Nathan Bossart
      Reviewed-By: Michael Paquier and Andres Freund
      Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de
      d87510a5
  4. 30 Mar, 2018 13 commits
  5. 29 Mar, 2018 13 commits
    • Andres Freund's avatar
      Improve JIT docs. · fb604780
      Andres Freund authored
      Author: John Naylor and Andres Freund
      Discussion: https://postgr.es/m/CAJVSVGUs-VcwSY7-Kx-GQe__8hvWuA4Uhyf3gxoMXeiZqebE9g@mail.gmail.com
      fb604780
    • Robert Haas's avatar
    • Robert Haas's avatar
      Rewrite the code that applies scan/join targets to paths. · 11cf92f6
      Robert Haas authored
      If the toplevel scan/join target list is parallel-safe, postpone
      generating Gather (or Gather Merge) paths until after the toplevel has
      been adjusted to return it.  This (correctly) makes queries with
      expensive functions in the target list more likely to choose a
      parallel plan, since the cost of the plan now reflects the fact that
      the evaluation will happen in the workers rather than the leader.
      The original complaint about this problem was from Jeff Janes.
      
      If the toplevel scan/join relation is partitioned, recursively apply
      the changes to all partitions.  This sometimes allows us to get rid of
      Result nodes, because Append is not projection-capable but its
      children may be.  It also cleans up what appears to be incorrect SRF
      handling from commit e2f1eb0e: the old
      code had no knowledge of SRFs for child scan/join rels.
      
      Because we now use create_projection_path() in some cases where we
      formerly used apply_projection_to_path(), this changes the ordering
      of columns in some queries generated by postgres_fdw.  Update
      regression outputs accordingly.
      
      Patch by me, reviewed by Amit Kapila and by Ashutosh Bapat.  Other
      fixes for this problem (substantially different from this version)
      were reviewed by Dilip Kumar, Amit Khandekar, and Marina Polyakova.
      
      Discussion: http://postgr.es/m/CAMkU=1ycXNipvhWuweUVpKuyu6SpNjF=yHWu4c4US5JgVGxtZQ@mail.gmail.com
      11cf92f6
    • Robert Haas's avatar
      Postpone generate_gather_paths for topmost scan/join rel. · 3f90ec85
      Robert Haas authored
      Don't call generate_gather_paths for the topmost scan/join relation
      when it is initially populated with paths.  Instead, do the work in
      grouping_planner.  By itself, this gains nothing; in fact it loses
      slightly because we end up calling set_cheapest() for the topmost
      scan/join rel twice rather than once.  However, it paves the way for
      a future commit which will postpone generate_gather_paths for the
      topmost scan/join relation even further, allowing more accurate
      costing of parallel paths.
      
      Amit Kapila and Robert Haas.  Earlier versions of this patch (which
      different substantially) were reviewed by Dilip Kumar, Amit
      Khandekar, Marina Polyakova, and Ashutosh Bapat.
      3f90ec85
    • Robert Haas's avatar
      Teach create_projection_plan to omit projection where possible. · d7c19e62
      Robert Haas authored
      We sometimes insert a ProjectionPath into a plan tree when projection
      is not strictly required. The existing code already arranges to avoid
      emitting a Result node when the ProjectionPath's subpath can perform
      the projection itself, but previously it didn't consider the
      possibility that the parent node might not actually require the
      projection to be performed at all.
      
      Skipping projection when it's not required can not only avoid Result
      nodes that aren't needed, but also avoid losing the "physical tlist"
      optimization unneccessarily.
      
      Patch by me, reviewed by Amit Kapila.
      
      Discussion: http://postgr.es/m/CA+TgmoakT5gmahbPWGqrR2nAdFOMAOnOXYoWHRdVfGWs34t6_A@mail.gmail.com
      d7c19e62
    • Bruce Momjian's avatar
      C comments: "a" <--> "an" corrections · 20b4323b
      Bruce Momjian authored
      Reported-by: Michael Paquier, Abhijit Menon-Sen
      
      Discussion: https://postgr.es/m/20180305045854.GB2266@paquier.xyz
      
      Author: Michael Paquier, Abhijit Menon-Sen, me
      20b4323b
    • Bruce Momjian's avatar
      3282c4c1
    • Magnus Hagander's avatar
      Fix incorrect copy/paste in comment · 8cdc8346
      Magnus Hagander authored
      Author: Alexander Korotkov <a.korotkov@postgrespro.ru>
      8cdc8346
    • Magnus Hagander's avatar
      Fix typo in comment · 9778d5c1
      Magnus Hagander authored
      Author: Daniel Gustafsson <daniel@yesql.se>
      9778d5c1
    • Tom Lane's avatar
      Remove unnecessary BufferGetPage() calls in fsm_vacuum_page(). · 2b1759e2
      Tom Lane authored
      Just noticed that these were quite redundant, since we're holding the
      page address in a local variable anyway, and we have pin on the buffer
      throughout.
      
      Also improve a comment.
      2b1759e2
    • Tom Lane's avatar
      Remove UpdateFreeSpaceMap(), use FreeSpaceMapVacuumRange() instead. · a063baac
      Tom Lane authored
      FreeSpaceMapVacuumRange has the same effect, is more efficient if many
      pages are involved, and makes fewer assumptions about how it's used.
      Notably, Claudio Freire pointed out that UpdateFreeSpaceMap could fail
      if the specified freespace value isn't the maximum possible.  This isn't
      a problem for the single existing user, but the function represents an
      attractive nuisance IMO, because it's named as though it were a
      general-purpose update function and its limitations are undocumented.
      In any case we don't need multiple ways to get the same result.
      
      In passing, do some code review and cleanup in RelationAddExtraBlocks.
      In particular, I see no excuse for it to omit the PageIsNew safety check
      that's done in the mainline extension path in RelationGetBufferForTuple.
      
      Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
      a063baac
    • Bruce Momjian's avatar
    • Tom Lane's avatar
      While vacuuming a large table, update upper-level FSM data every so often. · 851a26e2
      Tom Lane authored
      VACUUM updates leaf-level FSM entries immediately after cleaning the
      corresponding heap blocks.  fsmpage.c updates the intra-page search trees
      on the leaf-level FSM pages when this happens, but it does not touch the
      upper-level FSM pages, so that the released space might not actually be
      findable by searchers.  Previously, updating the upper-level pages happened
      only at the conclusion of the VACUUM run, in a single FreeSpaceMapVacuum()
      call.  This is bad because the VACUUM might get canceled before ever
      reaching that point, so that from the point of view of searchers no space
      has been freed at all, leading to table bloat.
      
      We can improve matters by updating the upper pages immediately after each
      cycle of index-cleaning and heap-cleaning, processing just the FSM pages
      corresponding to the range of heap blocks we have now fully cleaned.
      This adds a small amount of extra work, since the FSM pages leading down
      to each range boundary will be touched twice, but it's pretty negligible
      compared to everything else going on in a large VACUUM.
      
      If there are no indexes, VACUUM doesn't work in cycles but just cleans
      each heap page on first visit.  In that case we just arbitrarily update
      upper FSM pages after each 8GB of heap.  That maintains the goal of not
      letting all this work slide until the very end, and it doesn't seem worth
      expending extra complexity on a case that so seldom occurs in practice.
      
      In either case, the FSM is fully up to date before any attempt is made
      to truncate the relation, so that the most likely scenario for VACUUM
      cancellation no longer results in out-of-date upper FSM pages.  When
      we do successfully truncate, adjusting the FSM to reflect that is now
      fully handled within FreeSpaceMapTruncateRel.
      
      Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additional
      tweaks by me
      
      Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
      851a26e2