1. 30 Oct, 2015 4 commits
    • Implement lookbehind constraints in our regular-expression engine. · 12c9a040
      Tom Lane authored
      A lookbehind constraint is like a lookahead constraint in that it consumes
      no text; but it checks for existence (or nonexistence) of a match *ending*
      at the current point in the string, rather than one *starting* at the
      current point.  This is a long-requested feature since it exists in many
      other regex libraries, but Henry Spencer had never got around to
      implementing it in the code we use.
      
      Just making it work is actually pretty trivial; but naive copying of the
      logic for lookahead constraints leads to code that often spends O(N^2) time
      to scan an N-character string, because we have to run the match engine
      from string start to the current probe point each time the constraint is
      checked.  In typical use-cases a lookbehind constraint will be written at
      the start of the regex and hence will need to be checked at every character
      --- so O(N^2) work overall.  To fix that, I introduced a third copy of the
      core DFA matching loop, paralleling the existing longest() and shortest()
      loops.  This version, matchuntil(), can suspend and resume matching given
      a couple of pointers' worth of storage space.  So we need only run it
      across the string once, stopping at each interesting probe point and then
      resuming to advance to the next one.
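      The suspend-and-resume idea behind matchuntil() can be sketched in Python (a toy DFA driver with invented names, nothing like the engine's actual code):

```python
# Sketch of the suspend-and-resume idea behind matchuntil(): keep the DFA
# state between probe points instead of rerunning the automaton from the
# start of the string for every probe (which is what costs O(N^2)).
def make_resumable_matcher(transition, start_state, accepting, text):
    state = start_state
    pos = 0

    def match_ends_at(probe):
        """Does some match end exactly at offset `probe`?

        Probes must be asked in non-decreasing order; we only ever
        advance from where the previous probe left off."""
        nonlocal state, pos
        while pos < probe:
            state = transition(state, text[pos])
            pos += 1
        return state in accepting

    return match_ends_at

# Toy DFA: accepting iff the text read so far ends with "ab".
def ends_with_ab(state, ch):
    if ch == "a":
        return 1
    if ch == "b" and state == 1:
        return 2
    return 0

m = make_resumable_matcher(ends_with_ab, 0, {2}, "xabab")
print(m(3), m(5))  # True True -- each probe only advances, never rescans
```

      Checking a lookbehind this way at every character of an N-character string stays O(N) overall, which is the point of the optimization.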
      
      I also put in an optimization that simplifies one-character lookahead and
      lookbehind constraints, such as "(?=x)" or "(?<!\w)", into AHEAD and BEHIND
      constraints, which already existed in the engine.  This avoids the overhead
      of the LACON machinery entirely for these rather common cases.
      
      The net result is that lookbehind constraints run a factor of three or so
      slower than Perl's for multi-character constraints, but faster than Perl's
      for one-character constraints ... and they work fine for variable-length
      constraints, which Perl gives up on entirely.  So that's not bad from a
      competitive perspective, and there's room for further optimization if
      anyone cares.  (In reality, raw scan rate across a large input string is
      probably not that big a deal for Postgres usage anyway; so I'm happy if
      it's linear.)
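      The constraint semantics can be seen in any engine that supports them; a quick illustration with Python's re module (which, like Perl, allows only fixed-length lookbehind, unlike the variable-length support described above):

```python
import re

text = "cat catalog dog"

# Lookahead (?=...): zero-width test on what *starts* at the current point.
# Matches the "cat" inside "catalog" but not the standalone word.
print(re.findall(r"\bcat(?=alog)", text))          # ['cat']

# Lookbehind (?<=...): zero-width test on what *ends* at the current point.
# Matches digits only when a "$" ends just before them.
print(re.findall(r"(?<=\$)\d+", "$5, 7 and $42"))  # ['5', '42']

# Negative lookbehind (?<!...), as in the "(?<!\w)" example above:
# digits not immediately preceded by a word character.
print(re.findall(r"(?<!\w)\d+", "a1 2 b34 56"))    # ['2', '56']
```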
    • doc: security_barrier option is a Boolean, not a string. · c5057b2b
      Robert Haas authored
      Mistake introduced by commit 5bd91e3a.
      
      Hari Babu
    • Update parallel executor support to reuse the same DSM. · 3a1f8611
      Robert Haas authored
      Commit b0b0d84b purported to make it
      possible to relaunch workers using the same parallel context, but it had
      an unpleasant race condition: we might reinitialize after the workers
      have sent their last control message but before they have detached the
      DSM, leading to crashes.  Repair by introducing a new ParallelContext
      operation, ReinitializeParallelDSM.
      
      Adjust execParallel.c to use this new support, so that we can rescan a
      Gather node by relaunching workers but without needing to recreate the
      DSM.
      
      Amit Kapila, with some adjustments by me.  Extracted from latest parallel
      sequential scan patch.
    • Fix typo in bgworker.c · c6baec92
      Robert Haas authored
  2. 29 Oct, 2015 3 commits
  3. 28 Oct, 2015 3 commits
  4. 27 Oct, 2015 5 commits
    • Make Gather node projection-capable. · 8538a630
      Robert Haas authored
      The original Gather code failed to mark a Gather node as not able
      to do projection even though it could not actually project, despite
      initializing its projection info via ExecAssignProjectionInfo.
      There doesn't seem to
      be any good reason for this node not to have projection capability,
      so clean things up so that it does.  Without this, plans using Gather
      nodes might need to carry extra Result nodes to do projection.
    • Document BRIN's inclusion opclass framework · c15898c1
      Alvaro Herrera authored
      Backpatch to 9.5 -- this should have been part of b0b7be61, but we
      didn't have 38b03caebc5de either at the time.
      
      Author: Emre Hasegeli
      Revised by: Ian Barwick
      Discussion:
       http://www.postgresql.org/message-id/CAE2gYzyB39Q9up_-TO6FKhH44pcAM1x6n_Cuj15qKoLoFihUVg@mail.gmail.com
       http://www.postgresql.org/message-id/562DA711.3020305@2ndquadrant.com
    • Fix BRIN free space computations · 21a4e4a4
      Alvaro Herrera authored
      A bug in the original free space computation made it possible to
      return a page which wasn't actually able to fit the item.  Since the
      insertion code isn't prepared to deal with PageAddItem failing, a PANIC
      resulted ("failed to add BRIN tuple [to new page]").  Add a macro to
      encapsulate the correct computation, and use it in
      brin_getinsertbuffer's callers before calling that routine, to raise an
      early error.
      
      I became aware of the possibility of a problem in this area while working
      on ccc4c074.  There's no archived discussion about it, but it's
      easy to reproduce a problem in the unpatched code with something like
      
      CREATE TABLE t (a text);
      CREATE INDEX ti ON t USING brin (a) WITH (pages_per_range=1);
      
      for length in `seq 8000 8196`
      do
      	psql -f - <<EOF
      TRUNCATE TABLE t;
      INSERT INTO t VALUES ('z'), (repeat('a', $length));
      EOF
      done
      
      Backpatch to 9.5, where BRIN was introduced.
    • Cleanup commit timestamp module activation, again · 531d21b7
      Alvaro Herrera authored
      Further tweak commit_ts.c so that on a standby the state is completely
      consistent with that in the master, rather than behaving differently
      in cases where the settings differ.  Now the module should always be
      active or inactive in lockstep on standby and master.
      
      Author: Petr Jelínek, with some further tweaks by Álvaro Herrera.
      
      Backpatch to 9.5, where commit timestamps were introduced.
      
      Discussion: http://www.postgresql.org/message-id/5622BF9D.2010409@2ndquadrant.com
    • Measure string lengths only once · 0cd836a4
      Alvaro Herrera authored
      Bernd Helmle complained that CreateReplicationSlot() was assigning the
      same value to the same variable twice, so we could remove one of them.
      Code inspection reveals that we can actually remove both assignments:
      according to the author, the assignment was there only for the beauty
      of the strlen line; another possible fix is to put the strlen on its
      own line, so do that.
      
      To be consistent within the file, refactor all duplicated strlen()
      calls, which is what we do elsewhere in the backend anyway.  In
      basebackup.c, snprintf already returns the right length; no need for
      strlen afterwards.
      
      Backpatch to 9.4, where replication slots were introduced, to keep code
      identical.  Some of this is older, but the patch doesn't apply cleanly
      and it's only of cosmetic value anyway.
      
      Discussion: http://www.postgresql.org/message-id/BE2FD71DEA35A2287EA5F018@eje.credativ.lan
  5. 23 Oct, 2015 1 commit
  6. 22 Oct, 2015 8 commits
  7. 20 Oct, 2015 9 commits
    • Fix incorrect translation of minus-infinity datetimes for json/jsonb. · d4355425
      Tom Lane authored
      Commit bda76c1c caused both plus and
      minus infinity to be rendered as "infinity", which is not only wrong
      but inconsistent with the pre-9.4 behavior of to_json().  Fix that by
      duplicating the coding in date_out/timestamp_out/timestamptz_out more
      closely.  Per bug #13687 from Stepan Perlov.  Back-patch to 9.4, like
      the previous commit.
      
      In passing, also re-pgindent json.c, since it had gotten a bit messed up by
      recent patches (and I was already annoyed by indentation-related problems
      in back-patching this fix ...)
    • Fix incorrect comment in plannodes.h · a1c466c5
      Robert Haas authored
      Etsuro Fujita
    • Remove duplicate word. · dc486fb9
      Robert Haas authored
      Amit Langote
    • Tab complete CREATE EXTENSION .. VERSION. · 7c0b49cd
      Robert Haas authored
      Jeff Janes
    • Put back ssl_renegotiation_limit parameter, but only allow 0. · 84ef9c59
      Robert Haas authored
      Per a report from Shay Rojansky, Npgsql sends ssl_renegotiation_limit=0
      in the startup packet because it does not support renegotiation; other
      clients which have not attempted to support renegotiation might well
      behave similarly.  The recent removal of this parameter forces them to
      break compatibility with either current PostgreSQL versions, or
      previous ones.  Per discussion, the best solution is to accept the
      parameter but only allow a value of 0.
      
      Shay Rojansky, edited a little by me.
    • Be a bit more rigorous about how we cache strcoll and strxfrm results. · 5be94a9e
      Robert Haas authored
      Commit 0e57b4d8 contained some clever
      logic that attempted to make sure that we couldn't get confused about
      whether the last thing we cached was a strcoll() result or a strxfrm()
      result, but it wasn't quite clever enough, because we can perform
      further abbreviations after having already performed some comparisons.
      Introduce an explicit flag in the hopes of making this watertight.
      
      Peter Geoghegan, reviewed by me.
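      A minimal sketch of the pattern (invented names, not the datum-sort code itself): the cache slot records which *kind* of result it holds, so a cached comparison can never be misread as a cached transform after interleaved calls:

```python
# Sketch: one cache slot shared by two kinds of results (a la strcoll vs
# strxfrm).  The explicit `is_transform` flag is the fix: without it,
# interleaved calls could misread one kind of cached value as the other.
class LastResultCache:
    def __init__(self):
        self.key = None
        self.value = None
        self.is_transform = False

    def store(self, key, value, is_transform):
        self.key = key
        self.value = value
        self.is_transform = is_transform

    def lookup(self, key, want_transform):
        # A hit requires both the key AND the kind to match.
        if self.key == key and self.is_transform == want_transform:
            return self.value
        return None

cache = LastResultCache()
cache.store("abc", -1, is_transform=False)        # cached comparison result
print(cache.lookup("abc", want_transform=True))   # None -- wrong kind
print(cache.lookup("abc", want_transform=False))  # -1
```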
    • Remove obsolete comment. · d53f808e
      Robert Haas authored
      Peter Geoghegan
    • Eschew "RESET statement_timeout" in tests. · 8e3b4d9d
      Noah Misch authored
      Instead, use transaction abort.  Given an unlucky bout of latency, the
      timeout would cancel the RESET itself.  Buildfarm members gharial,
      lapwing, mereswine, shearwater, and sungazer witness that.  Back-patch
      to 9.1 (all supported versions).  The query_canceled test could still
      time out before entering its subtransaction; for whatever reason, that
      has yet to happen on the buildfarm.
  8. 19 Oct, 2015 1 commit
    • Fix incorrect handling of lookahead constraints in pg_regprefix(). · 9f1e642d
      Tom Lane authored
      pg_regprefix was doing nothing with lookahead constraints, which would
      be fine if it were the right kind of nothing, but it isn't: we have to
      terminate our search for a fixed prefix, not just pretend the LACON arc
      isn't there.  Otherwise, if the current state has both a LACON outarc and a
      single plain-color outarc, we'd falsely conclude that the color represents
      an addition to the fixed prefix, and generate an extracted index condition
      that restricts the indexscan too much.  (See added regression test case.)
      
      Terminating the search is conservative: we could traverse the LACON arc
      (thus assuming that the constraint can be satisfied at runtime) and then
      examine the outarcs of the linked-to state.  But that would be a lot more
      work than it seems worth, because writing a LACON followed by a single
      plain character is a pretty silly thing to do.
      
      This makes a difference only in rather contrived cases, but it's a bug,
      so back-patch to all supported branches.
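      The conservative rule is easiest to see with a simplified prefix extractor (a toy character-level walk, nothing like pg_regprefix's actual NFA traversal):

```python
# Sketch: extract the fixed prefix of a pattern for index-range purposes.
# The conservative rule from the commit: stop at the first lookahead
# constraint instead of pretending it isn't there.
def fixed_prefix(pattern):
    prefix = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(("(?=", "(?!"), i):
            break  # LACON: terminate the search for a fixed prefix here
        ch = pattern[i]
        if ch in ".*+?[](){}|^$\\":
            break  # not a plain literal character either
        prefix.append(ch)
        i += 1
    return "".join(prefix)

# Stopping yields "abc".  Skipping the lookahead as if it weren't there
# would have yielded "abce": an index condition that is too restrictive,
# wrongly excluding strings like "abcde" that the regex can match.
print(fixed_prefix("abc(?=d)e"))  # abc
```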
  9. 16 Oct, 2015 6 commits
    • Add a C API for parallel heap scans. · ee7ca559
      Robert Haas authored
      Using this API, one backend can set up a ParallelHeapScanDesc to
      which multiple backends can then attach.  Each tuple in the relation
      will be returned to exactly one of the scanning backends.  Only
      forward scans are supported, and rescans must be carefully
      coordinated.
      
      This is not exposed to the planner or executor yet.
      
      The original version of this code was written by me.  Amit Kapila
      reviewed it, tested it, and improved it, including adding support for
      synchronized scans, per review comments from Jeff Davis.  Extensive
      testing of this and related patches was performed by Haribabu Kommi.
      Final cleanup of this patch by me.
    • Allow a parallel context to relaunch workers. · b0b0d84b
      Robert Haas authored
      This may allow some callers to avoid the overhead involved in tearing
      down a parallel context and then setting up a new one, which means
      releasing the DSM and then allocating and populating a new one.  I
      suspect we'll want to revise the Gather node to make use of this new
      capability, but even if not it may be useful elsewhere and requires
      very little additional code.
    • Miscellaneous cleanup of regular-expression compiler. · afdfcd3f
      Tom Lane authored
      Revert our previous addition of "all" flags to copyins() and copyouts();
      they're no longer needed, and were never anything but an unsightly hack.
      
      Improve a couple of infelicities in the REG_DEBUG code for dumping
      the NFA data structure, including adding code to count the total
      number of states and arcs.
      
      Add a couple of missed error checks.
      
      Add some more documentation in the README file, and some regression tests
      illustrating cases that exceeded the state-count limit and/or took
      unreasonable amounts of time before this set of patches.
      
      Back-patch to all supported branches.
    • Improve memory-usage accounting in regular-expression compiler. · 538b3b8b
      Tom Lane authored
      This code previously counted the number of NFA states it created, and
      complained if a limit was exceeded, so as to prevent bizarre regex patterns
      from consuming unreasonable time or memory.  That's fine as far as it went,
      but the code paid no attention to how many arcs linked those states.  Since
      regexes can be contrived that have O(N) states but will need O(N^2) arcs
      after fixempties() processing, it was still possible to blow out memory,
      and take a long time doing it too.  To fix, modify the bookkeeping to count
      space used by both states and arcs.
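      The shape of the bookkeeping change, as a sketch (illustrative costs and limit, not the engine's real struct sizes or error handling):

```python
# Sketch: charge one complexity budget for both NFA states and arcs, so a
# pattern with few states but O(N^2) arcs still hits the limit in time.
STATE_COST = 1
ARC_COST = 1

class NfaBudget:
    def __init__(self, limit):
        self.used = 0
        self.limit = limit

    def charge(self, amount):
        self.used += amount
        if self.used > self.limit:
            # analogous to REG_ETOOBIG: "regular expression is too complex"
            raise OverflowError("regular expression is too complex")

budget = NfaBudget(limit=10)
budget.charge(STATE_COST * 3)     # a handful of states is fine...
try:
    budget.charge(ARC_COST * 50)  # ...but the arcs they spawn count too
except OverflowError as e:
    print(e)  # regular expression is too complex
```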
      
      I did not bother with including the "color map" in the accounting; it
      can only grow to a few megabytes, which is not a lot in comparison to
      what we're allowing for states+arcs (about 150MB on 64-bit machines
      or half that on 32-bit machines).
      
      Looking at some of the larger real-world regexes captured in the Tcl
      regression test suite suggests that the most that is likely to be needed
      for regexes found in the wild is under 10MB, so I believe that the current
      limit has enough headroom to make it okay to keep it as a hard-wired limit.
      
      In connection with this, redefine REG_ETOOBIG as meaning "regular
      expression is too complex"; the previous wording of "nfa has too many
      states" was already somewhat inapropos because of the error code's use
      for stack depth overrun, and it was not very user-friendly either.
      
      Back-patch to all supported branches.
    • Improve performance of pullback/pushfwd in regular-expression compiler. · 6a715366
      Tom Lane authored
      The previous coding would create a new intermediate state every time it
      wanted to interchange the ordering of two constraint arcs.  Certain regex
      features such as \Y can generate large numbers of parallel constraint arcs,
      and if we needed to reorder the results of that, we created unreasonable
      numbers of intermediate states.  To improve matters, keep a list of
      already-created intermediate states associated with the state currently
      being considered by the outer loop; we can re-use such states to place all
      the new arcs leading to the same destination or source.
      
      I also took the trouble to redefine push() and pull() to have a less risky
      API: they no longer delete any state or arc that the caller might possibly
      have a pointer to, except for the specifically-passed constraint arc.
      This reduces the risk of re-introducing the same type of error seen in
      the failed patch for CVE-2007-4772.
      
      Back-patch to all supported branches.
    • Improve performance of fixempties() pass in regular-expression compiler. · f5b7d103
      Tom Lane authored
      The previous coding took something like O(N^4) time to fully process a
      chain of N EMPTY arcs.  We can't really do much better than O(N^2) because
      we have to insert about that many arcs, but we can do lots better than
      what's there now.  The win comes partly from using mergeins() to amortize
      de-duplication of arcs across multiple source states, and partly from
      exploiting knowledge of the ordering of arcs for each state to avoid
      looking at arcs we don't need to consider during the scan.  We do have
      to be a bit careful of the possible reordering of arcs introduced by
      the sort-merge coding of the previous commit, but that's not hard to
      deal with.
      
      Back-patch to all supported branches.