1. 15 Mar, 2019 2 commits
    • Enable parallel query with SERIALIZABLE isolation. · bb16aba5
      Thomas Munro authored
      Previously, the SERIALIZABLE isolation level prevented parallel query
      from being used.  Allow the two features to be used together by
      sharing the leader's SERIALIZABLEXACT with parallel workers.
      
      An extra per-SERIALIZABLEXACT LWLock is introduced to make it safe to
      share, and new logic is introduced to coordinate the early release
      of the SERIALIZABLEXACT required for the SXACT_FLAG_RO_SAFE
      optimization, as follows:
      
      The first backend to observe the SXACT_FLAG_RO_SAFE flag (set by
      some other transaction) will 'partially release' the SERIALIZABLEXACT,
      meaning that the conflicts and locks it holds are released, but the
      SERIALIZABLEXACT itself will remain active because other backends
      might still have a pointer to it.
      
      Whenever any backend notices the SXACT_FLAG_RO_SAFE flag, it clears
      its own MySerializableXact variable and frees local resources so that
      it can skip SSI checks for the rest of the transaction.  In the
      special case of the leader process, it transfers the SERIALIZABLEXACT
      to a new variable SavedSerializableXact, so that it can be completely
      released at the end of the transaction after all workers have exited.
      
      Remove the serializable_okay flag added to CreateParallelContext() by
      commit 9da0cc35, because it's now redundant.
      
      Author: Thomas Munro
      Reviewed-by: Haribabu Kommi, Robert Haas, Masahiko Sawada, Kevin Grittner
      Discussion: https://postgr.es/m/CAEepm=0gXGYhtrVDWOTHS8SQQy_=S9xo+8oCxGLWZAOoeJ=yzQ@mail.gmail.com
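      The early-release protocol above can be sketched as a toy model (hypothetical Python names for illustration only; the real logic is C code in predicate.c):

```python
# Simplified model of the SXACT_FLAG_RO_SAFE release protocol described
# above: any backend that sees the flag drops its own reference and skips
# further SSI checks; the leader parks the shared SERIALIZABLEXACT so it
# can be fully released only after all workers have exited.

SXACT_FLAG_RO_SAFE = 1 << 0

class SerializableXact:
    def __init__(self):
        self.flags = 0
        self.partially_released = False  # conflicts/locks freed, struct kept

class Backend:
    def __init__(self, sxact, is_leader):
        self.my_serializable_xact = sxact
        self.is_leader = is_leader
        self.saved_serializable_xact = None

    def check_ro_safe(self):
        sxact = self.my_serializable_xact
        if sxact is None or not (sxact.flags & SXACT_FLAG_RO_SAFE):
            return False
        # First observer "partially releases": conflicts and locks go away,
        # but the struct stays alive because others may still point at it.
        if not sxact.partially_released:
            sxact.partially_released = True
        if self.is_leader:
            # Leader keeps a handle for the final release at end of xact.
            self.saved_serializable_xact = sxact
        # Clear the local pointer; SSI checks are skipped from here on.
        self.my_serializable_xact = None
        return True

sx = SerializableXact()
leader = Backend(sx, is_leader=True)
worker = Backend(sx, is_leader=False)
sx.flags |= SXACT_FLAG_RO_SAFE   # set by some other transaction
worker.check_ro_safe()           # worker detaches
leader.check_ro_safe()           # leader detaches but saves the pointer
```
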
    • During pg_upgrade, conditionally skip transfer of FSMs. · 13e8643b
      Amit Kapila authored
      If a heap on the old cluster has 4 pages or fewer, and the old cluster
      was PG v11 or earlier, don't copy or link the FSM. This will shrink
      space usage for installations with large numbers of small tables.
      
      This will allow pg_upgrade to take advantage of commit b0eaa4c5 where
      we have avoided creation of the free space map for small heap relations.
      
      Author: John Naylor
      Reviewed-by: Amit Kapila
      Discussion: https://postgr.es/m/CACPNZCu4cOdm3uGnNEGXivy7Gz8UWyQjynDpdkPGabQ18_zK6g%40mail.gmail.com
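      The skip condition is simple; a minimal sketch of it (hypothetical helper name, not pg_upgrade's actual code):

```python
# Threshold from the commit: heaps of 4 pages or fewer on a pre-v12 old
# cluster get no FSM on the new cluster, so there is nothing to transfer.
HEAP_FSM_CREATION_THRESHOLD = 4  # pages (assumed name, mirroring b0eaa4c5)

def should_transfer_fsm(old_cluster_version: int, heap_pages: int) -> bool:
    """Return True if the relation's FSM should be copied/linked."""
    if old_cluster_version <= 11 and heap_pages <= HEAP_FSM_CREATION_THRESHOLD:
        return False  # small heap: the new cluster will not create an FSM
    return True

print(should_transfer_fsm(11, 4))  # False: small table, old cluster is v11
print(should_transfer_fsm(11, 5))  # True: above the threshold
```
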
  2. 14 Mar, 2019 13 commits
  3. 13 Mar, 2019 9 commits
  4. 12 Mar, 2019 7 commits
    • Correct obsolete nbtree page split comment. · 3f342839
      Peter Geoghegan authored
      Commit 40dae7ec, which made the nbtree page split algorithm more
      robust, made _bt_insert_parent() only unlock the right child of the
      parent page before inserting a new downlink into the parent.  Update a
      comment from the Berkeley days claiming that both left and right child
      pages are unlocked before the new downlink actually gets inserted.
      
      The claim that it is okay to release both locks early based on Lehman
      and Yao's say-so never made much sense.  Lehman and Yao must sometimes
      "couple" buffer locks across a pair of internal pages when relocating a
      downlink, unlike the corresponding code within _bt_getstack().
    • Add support for hyperbolic functions, as well as log10(). · f1d85aa9
      Tom Lane authored
      The SQL:2016 standard adds support for the hyperbolic functions
      sinh(), cosh(), and tanh().  POSIX has long required libm to
      provide those functions as well as their inverses asinh(),
      acosh(), atanh().  Hence, let's just expose the libm functions
      to the SQL level.  As with the trig functions, we only implement
      versions for float8, not numeric.
      
      For the moment, we'll assume that all platforms actually do have
      these functions; if experience teaches otherwise, some autoconf
      effort may be needed.
      
      SQL:2016 also adds support for base-10 logarithm, but with the
      function name log10(), whereas the name we've long used is log().
      Add aliases named log10() for the float8 and numeric versions.
      
      Lætitia Avrot
      
      Discussion: https://postgr.es/m/CAB_COdguG22LO=rnxDQ2DW1uzv8aQoUzyDQNJjrR4k00XSgm5w@mail.gmail.com
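      Since the new SQL functions are thin wrappers over libm, their behavior can be previewed with any libm binding, for instance Python's math module:

```python
import math

# The six functions exposed at SQL level map directly onto libm:
# sinh(), cosh(), tanh() and their inverses asinh(), acosh(), atanh().
x = 1.0
assert math.isclose(math.asinh(math.sinh(x)), x)
assert math.isclose(math.acosh(math.cosh(x)), x)
assert math.isclose(math.atanh(math.tanh(x)), x)

# cosh^2 - sinh^2 == 1, the hyperbolic analogue of the Pythagorean identity.
assert math.isclose(math.cosh(x)**2 - math.sinh(x)**2, 1.0)

# log10() is just the base-10 logarithm, now also available under that
# SQL:2016 name alongside the traditional log().
print(math.log10(1000.0))  # 3.0
```
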
    • Remove remaining hard-wired OID references in the initial catalog data. · 3aa0395d
      Tom Lane authored
      In the v11-era commits that taught genbki.pl to resolve symbolic
      OID references in the initial catalog data, we didn't bother to
      make every last reference symbolic; some of the catalogs have so
      few initial rows that it didn't seem worthwhile.
      
      However, the new project policy that OIDs assigned by new patches
      should be automatically renumberable changes this calculus.
      A patch that wants to add a row in one of these catalogs would have
      a problem when the OID it assigns gets renumbered.  Hence, do the
      mop-up work needed to make all OID references in initial data be
      symbolic, and establish an associated project policy that we'll
      never again write a hard-wired OID reference there.
      
      No catversion bump since the contents of postgres.bki aren't
      actually changed by this commit.
      
      Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
    • Create a script that can renumber manually-assigned OIDs. · a6417078
      Tom Lane authored
      This commit adds a Perl script renumber_oids.pl, which can reassign a
      range of manually-assigned OIDs to someplace else by modifying OID
      fields of the catalog *.dat files and OID-assigning macros in the
      catalog *.h files.
      
      Up to now, we've encouraged new patches that need manually-assigned
      OIDs to use OIDs just above the range of existing OIDs.  Predictably,
      this leads to patches stepping on each other's toes, as whichever
      one gets committed first creates an OID conflict that other patch
      author(s) have to resolve manually.  With the availability of
      renumber_oids.pl, we can eliminate a lot of this hassle.
      The new project policy, therefore, is:
      
      * Encourage new patches to use high OIDs (the documentation suggests
      choosing a block of OIDs at random in 8000..9999).
      
      * After feature freeze in each development cycle, run renumber_oids.pl
      to move all such OIDs down to lower numbers, thus freeing the high OID
      range for the next development cycle.
      
      This plan should greatly reduce the risk of OID collisions between
      concurrently-developed patches.  Also, if such a collision happens
      anyway, we have the option to resolve it without much effort by doing
      an off-schedule OID renumbering to get the first-committed patch out
      of the way.  Or a patch author could use renumber_oids.pl to change
      their patch's assignments without much pain.
      
      This approach does put a premium on not hard-wiring any OID values
      in places where renumber_oids.pl and genbki.pl can't fix them.
      Project practice in that respect seems to be pretty good already,
      but a follow-on patch will sand down some rough edges.
      
      John Naylor and Tom Lane, per an idea of Peter Geoghegan's
      
      Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
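      The renumbering itself is just a mapping from the high development range down to free low OIDs; a toy sketch of the idea (hypothetical, far simpler than what renumber_oids.pl actually does to the *.dat and *.h files):

```python
def build_oid_mapping(used_oids, high_range=range(8000, 10000)):
    """Map each manually assigned OID in the high development range to the
    lowest OID that is still unused, in the spirit of renumber_oids.pl."""
    used = set(used_oids)
    high = sorted(o for o in used if o in high_range)
    mapping = {}
    candidate = 1
    for oid in high:
        # Skip OIDs already taken, either originally or by this mapping.
        while candidate in used or candidate in mapping.values():
            candidate += 1
        mapping[oid] = candidate
        candidate += 1
    return mapping

# Two patches picked random high blocks during the development cycle;
# after feature freeze they are squeezed down below existing assignments.
print(build_oid_mapping({1, 2, 3, 8432, 9107}))  # {8432: 4, 9107: 5}
```
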
    • Fix testing of parallel-safety of scan/join target. · b5afdde6
      Etsuro Fujita authored
      In commit 960df2a9 ("Correctly assess parallel-safety of tlists when
      SRFs are used."), the testing of scan/join target was done incorrectly,
      which caused a plan-quality problem.  Backpatch through to v11 where
      the aforementioned commit went in, since this is a regression from v10.
      
      Author: Etsuro Fujita
      Reviewed-by: Robert Haas and Tom Lane
      Discussion: https://postgr.es/m/5C75303E.8020303@lab.ntt.co.jp
    • Add more tests for FSM. · 6f918159
      Amit Kapila authored
      In commit b0eaa4c5, we left out a test that used a vacuum to remove dead
      rows, as the test's behavior was not predictable.  This test has been
      rewritten to use fillfactor instead to control free space.  Since we no
      longer need to remove dead rows as part of the test, put the fsm regression
      test in a parallel group.
      
      Author: John Naylor
      Reviewed-by: Amit Kapila
      Discussion: https://postgr.es/m/CAA4eK1L=qWp_bJ5aTc9+fy4Ewx2LPaLWY-RbR4a60g_rupCKnQ@mail.gmail.com
    • Add routine able to update the control file to src/common/ · ce6afc68
      Michael Paquier authored
      This adds a new routine to src/common/ which is compatible with both the
      frontend and backend code, able to update the control file's contents.
      For now this is used only by pg_rewind, but some upcoming patches
      that add more control over checksums for offline instances will make
      use of it.  The backend could also use it more, as xlog.c has its
      own flavor of the same logic with some wait events and an additional
      flush phase before closing the opened file descriptor, but that is
      left as separate work.
      
      Author: Michael Banck, Michael Paquier
      Reviewed-by: Fabien Coelho, Sergei Kornilov
      Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
  5. 11 Mar, 2019 9 commits
    • Allow fractional input values for integer GUCs, and improve rounding logic. · 1a83a80a
      Tom Lane authored
      Historically guc.c has just refused examples like set work_mem = '30.1GB',
      but it seems more useful for it to take that and round off the value to
      some reasonable approximation of what the user said.  Just rounding to
      the parameter's native unit would work, but it would lead to rather
      silly-looking settings, such as 31562138kB for this example.  Instead
      let's round to the nearest multiple of the next smaller unit (if any),
      producing 30822MB.
      
      Also, do the units conversion math in floating point and round to integer
      (if needed) only at the end.  This produces saner results for inputs that
      aren't exact multiples of the parameter's native unit, and removes another
      difference in the behavior for integer vs. float parameters.
      
      In passing, document the ability to use hex or octal input where it
      ought to be documented.
      
      Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
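      The worked example rounds out as follows; a small sketch of the described rule (assumed helper names and unit tables, not guc.c itself):

```python
# The rule described above: do the conversion in floating point, then
# round to the nearest multiple of the unit one step below the one the
# user wrote, producing 30822MB for '30.1GB' instead of 31562138kB.
UNITS_KB = {"kB": 1, "MB": 1024, "GB": 1024**2, "TB": 1024**3}
NEXT_SMALLER = {"TB": "GB", "GB": "MB", "MB": "kB", "kB": "kB"}

def convert_memory_setting(value: float, unit: str) -> str:
    kb = value * UNITS_KB[unit]       # math in floating point throughout
    smaller = NEXT_SMALLER[unit]      # round only at the very end
    return f"{round(kb / UNITS_KB[smaller])}{smaller}"

print(convert_memory_setting(30.1, "GB"))  # 30822MB
```
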
    • Tweak wording on VARIADIC array doc patch. · fe0b2c12
      Andrew Dunstan authored
      Per suggestion from Tom Lane.
    • Document incompatibility of comparison expressions with VARIADIC array arguments · 5e74a427
      Andrew Dunstan authored
      COALESCE, GREATEST and LEAST all look like functions taking variable
      numbers of arguments, but in fact they are not functions, and so
      VARIADIC array arguments don't work with them. Add a note to the docs
      explaining this fact.
      
      The consensus is not to try to make this work, but just to document the
      limitation.
      
      Discussion: https://postgr.es/m/CAFj8pRCaAtuXuRtvXf5GmPbAVriUQrNMo7-=TXUFN025S31R_w@mail.gmail.com
    • Remove spurious return. · 32b8f0b0
      Andres Freund authored
      Per buildfarm member anole.
      
      Author: Andres Freund
    • Give up on testing guc.c's behavior for "infinity" inputs. · d9c5e962
      Tom Lane authored
      Further buildfarm testing shows that on the machines that are failing
      ac75959c's test case, what we're actually getting from strtod("-infinity")
      is a syntax error (endptr == value) not ERANGE at all.  This test case
      is not worth carrying two sets of expected output for, so just remove it,
      and revert commit b212245f's misguided attempt to work around the platform
      dependency.
      
      Discussion: https://postgr.es/m/E1h33xk-0001Og-Gs@gemulon.postgresql.org
    • Ensure sufficient alignment for ParallelTableScanDescData in BTShared. · 8cacea7a
      Andres Freund authored
      Previously ParallelTableScanDescData was just a member in BTShared,
      but after c2fe139c that no longer guarantees sufficient alignment,
      as specific AMs might (and are likely to) need atomic variables in
      the struct.
      
      One might think that MAXALIGNing would be sufficient, but as a
      comment in shm_toc_allocate() explains, that's not enough. For now,
      copy the hack described there.
      
      For parallel sequential scans no such change is needed, as its
      allocations go through shm_toc_allocate().
      
      An alternative approach would have been to allocate the parallel scan
      descriptor in a separate TOC entry, but there seems little benefit in
      doing so.
      
      Per buildfarm member dromedary.
      
      Author: Andres Freund
      Discussion: https://postgr.es/m/20190311203126.ty5gbfz42gjbm6i6@alap3.anarazel.de
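      The hack referenced above amounts to padding allocations to cacheline size so that members needing atomics never straddle a boundary; the arithmetic is the usual round-up-to-a-power-of-two (a sketch with an assumed 64-byte cacheline; the real value is the platform's cacheline size):

```python
CACHELINESIZE = 64  # assumed; plain MAXALIGN (typically 8) is too weak here

def cacheline_align(offset: int) -> int:
    """Round offset up to the next multiple of the cacheline size."""
    return (offset + CACHELINESIZE - 1) & ~(CACHELINESIZE - 1)

print(cacheline_align(0))    # 0
print(cacheline_align(1))    # 64
print(cacheline_align(200))  # 256
```
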
    • tableam: Add and use scan APIs. · c2fe139c
      Andres Freund authored
      To allow table accesses to not depend directly on heap, several
      new abstractions are needed. Specifically:
      
      1) Heap scans need to be generalized into table scans. Do this by
         introducing TableScanDesc, which will be the "base class" for
         individual AMs. This contains the AM independent fields from
         HeapScanDesc.
      
         The previous heap_{beginscan,rescan,endscan} et al. have been
         replaced with a table_ version.
      
         There's no direct replacement for heap_getnext(), as that returned
         a HeapTuple, which is undesirable for other AMs.  Instead there's
         table_scan_getnextslot().  But note that heap_getnext() lives on;
         it's still widely used to access catalog tables.
      
         This is achieved by new scan_begin, scan_end, scan_rescan,
         scan_getnextslot callbacks.
      
      2) The portion of a parallel scan that is shared between backends
         needs to be set up without per-AM work by the caller. To achieve
         that new parallelscan_{estimate, initialize, reinitialize}
         callbacks are introduced, which operate on a new
         ParallelTableScanDesc, which again can be subclassed by AMs.
      
         As several AMs are likely to be block oriented, block oriented
         callbacks that can be shared between such AMs are provided and
         used by heap: table_block_parallelscan_{estimate, initialize,
         reinitialize} as callbacks, and table_block_parallelscan_{nextpage,
         init} for use in AMs. These operate on a
         ParallelBlockTableScanDesc.
      
      3) Index scans need to be able to access tables to return a tuple, and
         there needs to be state kept across individual accesses to the
         heap, such as buffers.  That's now handled by introducing a
         sort-of-scan IndexFetchTable, which again is intended to be
         subclassed by individual AMs (for heap IndexFetchHeap).
      
         The relevant callbacks for an AM are index_fetch_{end, begin,
         reset} to create the necessary state, and index_fetch_tuple to
         retrieve an indexed tuple.  Note that index_fetch_tuple
         implementations need to be smarter than just blindly fetching the
         tuples for AMs that have optimizations similar to heap's HOT - the
         currently alive tuple in the update chain needs to be fetched if
         appropriate.
      
         Similar to table_scan_getnextslot(), it's undesirable to continue
         to return HeapTuples. Thus index_fetch_heap (might want to rename
         that later) now accepts a slot as an argument. Core code doesn't
         have a lot of call sites performing index scans without going
         through the systable_* API (in contrast to loads of heap_getnext
         calls and working directly with HeapTuples).
      
         Index scans now store the result of a search in
         IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
         target is not generally a HeapTuple anymore that seems cleaner.
      
      To be able to sensibly adapt code to use the above, two further
      callbacks have been introduced:
      
      a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
         slots capable of holding a tuple of the AMs
         type. table_slot_callbacks() and table_slot_create() are based
         upon that, but have additional logic to deal with views, foreign
         tables, etc.
      
         While this change could have been done separately, nearly all the
         call sites that needed to be adapted for the rest of this commit
         would also have needed to be adapted for
         table_slot_callbacks(), making separation not worthwhile.
      
      b) tuple_satisfies_snapshot checks whether the tuple in a slot is
         currently visible according to a snapshot. That's required as a few
         places now don't have a buffer + HeapTuple around, but a
         slot (which in heap's case internally has that information).
      
      Additionally a few infrastructure changes were needed:
      
      I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
         internally uses a slot to keep track of tuples. While
         systable_getnext() still returns HeapTuples, and will do so for the
         foreseeable future, the index API (see 1) above) now only deals with
         slots.
      
      The remainder, and largest part, of this commit is then adjusting all
      scans in postgres to use the new APIs.
      
      Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
      Discussion:
          https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
          https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
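      The shape of the new API, with per-AM callbacks behind generic table_* entry points, can be illustrated abstractly (a Python analogy with assumed names, not the actual C structs in tableam.h):

```python
# Analogy for tableam callback dispatch: TableScanDesc is the "base class"
# each AM subclasses, and generic table_* wrappers talk only to the
# callbacks (scan_begin, scan_getnextslot, scan_end, ...), never to heap.
class TableScanDesc:
    def __init__(self, rel):
        self.rel = rel

class TableAmRoutine:
    def scan_begin(self, rel): raise NotImplementedError
    def scan_getnextslot(self, scan, slot): raise NotImplementedError
    def scan_end(self, scan): raise NotImplementedError

class HeapScanDesc(TableScanDesc):  # AM-specific subclass
    def __init__(self, rel):
        super().__init__(rel)
        self.block = 0              # heap-only scan state

class HeapAm(TableAmRoutine):
    def scan_begin(self, rel):
        return HeapScanDesc(rel)
    def scan_getnextslot(self, scan, slot):
        if scan.block >= len(scan.rel):
            return False            # no more tuples
        slot.clear()
        slot.extend(scan.rel[scan.block])  # return the tuple via the slot
        scan.block += 1
        return True
    def scan_end(self, scan):
        pass

# Generic wrappers: callers never mention heap directly.
def table_beginscan(am, rel): return am.scan_begin(rel)
def table_scan_getnextslot(am, scan, slot): return am.scan_getnextslot(scan, slot)
def table_endscan(am, scan): return am.scan_end(scan)

am, rel, slot = HeapAm(), [["t1"], ["t2"]], []
scan, out = table_beginscan(am, rel), []
while table_scan_getnextslot(am, scan, slot):
    out.append(list(slot))
table_endscan(am, scan)
print(out)  # [['t1'], ['t2']]
```
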
    • pgbench: increase the maximum number of variables/arguments · a4784152
      Andrew Dunstan authored
      pgbench's arbitrary limit of 10 arguments for SQL statements or
      metacommands is far too low. Increase it to 256.
      
      This results in a very modest increase in memory usage, not enough to
      worry about.
      
      The maximum includes the SQL statement or metacommand. This is reflected
      in the comments and revised TAP tests.
      
      Simon Riggs and Dagfinn Ilmari Mannsåker with some light editing by me.
      Reviewed by: David Rowley and Fabien Coelho
      
      Discussion: https://postgr.es/m/CANP8+jJiMJOAf-dLoHuR-8GENiK+eHTY=Omw38Qx7j2g0NDTXA@mail.gmail.com