  1. 24 Nov, 2013 1 commit
    • Fix array slicing of int2vector and oidvector values. · 45e02e32
      Tom Lane authored
      The previous coding labeled expressions such as pg_index.indkey[1:3] as
      being of int2vector type, which is not right because the subscript bounds
      of such a result don't, in general, satisfy the restrictions of int2vector.
      To fix, implicitly promote the result of slicing int2vector to int2[],
      or oidvector to oid[].  This is similar to what we've done with domains
      over arrays, which is a good analogy because these types are very much
      like restricted domains of the corresponding regular-array types.
      
      A side-effect is that we now also forbid array-element updates on such
      columns; e.g., while "update pg_index set indkey[4] = 42" would have worked
      before if you were superuser (and corrupted your catalogs irretrievably,
      no doubt), it's now disallowed.  This seems like a good thing since, again,
      some choices of subscripting would've led to results not satisfying the
      restrictions of int2vector.  The case of an array-slice update was
      rejected before, though with a different error message than you get now.
      We could make these cases work in future if we added a cast from int2[]
      to int2vector (with a cast function checking the subscript restrictions)
      but it seems unlikely that there's any value in that.
      
      Per report from Ronan Dunklau.  Back-patch to all supported branches
      because of the crash risks involved.
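
      For illustration, the type promotion is visible with pg_typeof() against
      any database's catalogs (int2vector subscripts start at 0):

        SELECT pg_typeof(indkey)      AS column_type,  -- int2vector
               pg_typeof(indkey[0:1]) AS slice_type    -- now smallint[], not int2vector
        FROM pg_index
        LIMIT 1;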
      45e02e32
  2. 23 Nov, 2013 3 commits
    • Ensure _dosmaperr() actually sets errno correctly. · f145454d
      Tom Lane authored
      If logging is enabled, either ereport() or fprintf() might stomp on errno
      internally, causing this function to return the wrong result.  That might
      only end in a misleading error report, but in any code that's examining
      errno to decide what to do next, the consequences could be far graver.
      
      This has been broken since the very first version of this file in 2006
      ... it's a bit astonishing that we didn't identify this long ago.
      
      Reported by Amit Kapila, though this isn't his proposed fix.
      f145454d
    • Fix thinko in SPI_execute_plan() calls · b7212c97
      Peter Eisentraut authored
      Two call sites were apparently thinking that the last argument of
      SPI_execute_plan() is the number of query parameters, but it is actually
      the row limit.  Change those calls to pass 0, since we don't care about
      the limit there.  The previous code didn't break anything, but it was
      still wrong.
      b7212c97
    • Avoid potential buffer overflow crash · 4053189d
      Peter Eisentraut authored
      A pointer to a C string was treated as a pointer to a "name" datum and
      passed to SPI_execute_plan().  This pointer would then end up being
      passed through datumCopy(), which would try to copy the entire 64 bytes
      of name data, thus running past the end of the C string.  Fix by
      converting the string to a proper name structure.
      
      Found by LLVM AddressSanitizer.
      4053189d
  3. 22 Nov, 2013 6 commits
    • Flatten join alias Vars before pulling up targetlist items from a subquery. · f19e92ed
      Tom Lane authored
      pullup_replace_vars()'s decisions about whether a pulled-up replacement
      expression needs to be wrapped in a PlaceHolderVar depend on the assumption
      that what looks like a Var behaves like a Var.  However, if the Var is a
      join alias reference, later flattening of join aliases might replace the
      Var with something that's not a Var at all, and should have been wrapped.
      
      To fix, do a forcible pass of flatten_join_alias_vars() on the subquery
      targetlist before we start to copy items out of it.  We'll re-run that
      processing on the pulled-up expressions later, but that's harmless.
      
      Per report from Ken Tanzer; the added regression test case is based on his
      example.  This bug has been there since the PlaceHolderVar mechanism was
      invented, but has escaped detection because the circumstances that trigger
      it are fairly narrow.  You need a flattenable query underneath an outer
      join, which contains another flattenable query inside a join of its own,
      with a dangerous expression (a constant or something else non-strict)
      in that one's targetlist.
      
      Having seen this, I'm wondering if it wouldn't be prudent to do all
      alias-variable flattening earlier, perhaps even in the rewriter.
      But that would probably not be a back-patchable change.
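
      A sketch of the triggering shape only (t1, t2, t3 are hypothetical
      tables, not the regression test): a flattenable subquery under an outer
      join, containing another flattenable subquery whose target list carries
      a non-strict constant.

        SELECT t1.id, j.flag
        FROM t1
        LEFT JOIN (
            SELECT t2.id, sub.flag
            FROM t2
            JOIN (SELECT t3.id, true AS flag FROM t3) AS sub ON sub.id = t2.id
        ) AS j ON j.id = t1.id;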
      f19e92ed
    • Fix quoting in help messages in uuid-ossp extension scripts. · f29baf92
      Tom Lane authored
      The command we're telling people to type needs to include double-quoting
      around the unfortunately-chosen extension name.  Twiddle the textual
      quoting so that it looks somewhat sane.  Per gripe from roadrunner6.
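
      The command the hint recommends boils down to the following; the double
      quotes are mandatory because of the hyphen in the name:

        CREATE EXTENSION "uuid-ossp";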
      f29baf92
    • Fix Hot-Standby initialization of clog and subtrans. · 98f58a30
      Heikki Linnakangas authored
      These bugs can cause data loss on standbys started with hot_standby=on at
      the moment they start to accept read only queries, by marking committed
      transactions as uncommitted. The likelihood of such corruptions is small
      unless the primary has a high transaction rate.
      
      Commit 5a031a55 fixed bugs in HS's startup logic by maintaining less
      state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing
      the fact that both clog and subtrans are written to before that.  This
      only failed to fail in common cases because the usage of ExtendCLOG in
      procarray.c was superfluous, since clog extensions are actually WAL
      logged.
      
      In commit f44eedc3f0f347a856eea8590730769125964597, I then tried to fix
      the missing extensions of pg_subtrans caused by the former commit's
      changes - pg_subtrans extensions are not WAL logged - by performing the
      extensions when switching to a state > STANDBY_INITIALIZED and not
      performing xid assignments before that, again missing the fact that
      ExtendCLOG is unnecessary.  But I screwed up twice: once because
      latestObservedXid wasn't updated anymore in that state due to the
      earlier commit, and once by having an off-by-one error in the loop
      performing extensions.  This means that whenever a CLOG_XACTS_PER_PAGE
      (32768 with default settings) boundary was crossed between the start of
      the checkpoint recovery started from and the first xl_running_xact
      record, old transactions' commit bits in pg_clog could be overwritten if
      they started and committed in that window.
      
      Fix this mess by not performing ExtendCLOG() in HS at all anymore, since
      it's unneeded and evidently dangerous, and by performing subtrans
      extensions even before reaching STANDBY_SNAPSHOT_PENDING.
      
      Analysis and patch by Andres Freund. Reported by Christophe Pettus.
      Backpatch down to 9.0, like the previous commit that caused this.
      98f58a30
    • Avoid acquiring spinlock when checking if recovery has finished, for speed. · 1a3d1044
      Heikki Linnakangas authored
      RecoveryIsInProgress() can be called very frequently. During normal
      operation, it just checks a backend-local variable and returns quickly,
      but during hot standby, it checks a spinlock-protected shared variable.
      Those spinlock acquisitions can become a point of contention on a busy
      hot standby system.
      
      Replace the spinlock acquisition with a memory barrier.
      
      Per discussion with Andres Freund, Ants Aasma and Merlin Moncure.
      1a3d1044
    • Tweak streamutil.c further to avoid scan-build warning · f4482a54
      Peter Eisentraut authored
      The previous change added a new scan-build warning about need_password
      being assigned but not read.
      f4482a54
    • Support multi-argument UNNEST(), and TABLE() syntax for multiple functions. · 784e762e
      Tom Lane authored
      This patch adds the ability to write TABLE( function1(), function2(), ...)
      as a single FROM-clause entry.  The result is the concatenation of the
      first row from each function, followed by the second row from each
      function, etc; with NULLs inserted if any function produces fewer rows than
      others.  This is believed to be a much more useful behavior than what
      Postgres currently does with multiple SRFs in a SELECT list.
      
      This syntax also provides a reasonable way to combine use of column
      definition lists with WITH ORDINALITY: put the column definition list
      inside TABLE(), where it's clear that it doesn't control the ordinality
      column as well.
      
      Also implement SQL-compliant multiple-argument UNNEST(), by turning
      UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
      
      The SQL standard specifies TABLE() with only a single function, not
      multiple functions, and it seems to require an implicit UNNEST() which is
      not what this patch does.  There may be something wrong with that reading
      of the spec, though, because if it's right then the spec's TABLE() is just
      a pointless alternative spelling of UNNEST().  After further review of
      that, we might choose to adopt a different syntax for what this patch does,
      but in any case this functionality seems clearly worthwhile.
      
      Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
      significantly revised by me
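
      A sketch of the multi-argument UNNEST() behavior (arrays chosen for
      illustration); per this commit it is equivalent to
      TABLE(unnest(ARRAY[1,2,3]), unnest(ARRAY['a','b'])), and the shorter
      array is padded with NULLs:

        SELECT *
        FROM unnest(ARRAY[1,2,3], ARRAY['a','b']) WITH ORDINALITY AS t(n, s, ord);
        --  n |  s  | ord
        --  1 |  a  |   1
        --  2 |  b  |   2
        --  3 |     |   3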
      784e762e
  4. 21 Nov, 2013 1 commit
    • Fix pg_isready to handle -d option properly. · 38f43289
      Fujii Masao authored
      Previously, the -d option of pg_isready was broken.  When the name of
      the database was specified with -d, pg_isready failed with an error.
      When the conninfo string specified with -d contained a host name but no
      numeric IP address (i.e., hostaddr), pg_isready displayed the wrong
      connection message.  The -d option could not handle a valid URI prefix
      at all.  This commit fixes these bugs in pg_isready.
      
      Backpatch to 9.3, where pg_isready was introduced.
      
      Per report from Josh Berkus and Robert Haas.
      Original patch by Fabrízio de Royes Mello, heavily modified by me.
      38f43289
  5. 20 Nov, 2013 4 commits
    • More GIN refactoring. · 04eee1fa
      Heikki Linnakangas authored
      Split off the portion of ginInsertValue that inserts the tuple to the
      current level into a separate function, ginPlaceToPage.  ginInsertValue's
      charter is now to recurse up the tree to insert the downlink when a page
      split is required.
      
      This is in preparation for a patch to change the way incomplete splits are
      handled, which will need to do these operations separately. And IMHO makes
      the code more readable anyway.
      04eee1fa
    • Refactor the internal GIN B-tree interface for forming a downlink. · 50101263
      Heikki Linnakangas authored
      This creates a new gin-btree callback function for creating a downlink for
      a page. Previously, ginxlog.c duplicated the logic used during normal
      operation.
      50101263
    • Further GIN refactoring. · 04965ad4
      Heikki Linnakangas authored
      Merge some functions that were always called together. Makes the code a
      little bit more readable.
      04965ad4
    • ecpg: Split off mmfatal() from mmerror() · b21de4e7
      Peter Eisentraut authored
      This allows decorating mmfatal() with noreturn compiler hints, leading
      to better diagnostics.
      b21de4e7
  6. 19 Nov, 2013 4 commits
  7. 18 Nov, 2013 5 commits
  8. 17 Nov, 2013 1 commit
  9. 16 Nov, 2013 4 commits
    • Improve performance of numeric sum(), avg(), stddev(), variance(), etc. · 69c8fbac
      Tom Lane authored
      This patch improves performance of most built-in aggregates that formerly
      used a NUMERIC or NUMERIC array as their transition type; this includes
      not only aggregates on numeric inputs, but some aggregates on integer
      inputs where overflow of an int8 value is a possibility.  The code now
      uses a special-purpose data structure to avoid array construction and
      deconstruction overhead, as well as packing and unpacking overhead for
      numeric values.
      
      These aggregates' transition type is now declared as INTERNAL, since
      it doesn't correspond to any SQL data type.  To keep the planner from
      thinking that that means a lot of storage will be used, we make use
      of the just-added pg_aggregate.aggtransspace feature.  The space estimate
      is set to 128 bytes, which is at least in the right ballpark.
      
      Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
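
      The user-visible trace of this change is in the catalog; the values
      shown are those set by this commit:

        SELECT aggtranstype::regtype, aggtransspace
        FROM pg_aggregate
        WHERE aggfnoid = 'pg_catalog.avg(numeric)'::regprocedure;
        -- aggtranstype | aggtransspace
        -- internal     | 128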
      69c8fbac
    • Allow aggregates to provide estimates of their transition state data size. · 6cb86143
      Tom Lane authored
      Formerly the planner had a hard-wired rule of thumb for guessing the amount
      of space consumed by an aggregate function's transition state data.  This
      estimate is critical to deciding whether it's OK to use hash aggregation,
      and in many situations the built-in estimate isn't very good.  This patch
      adds a column to pg_aggregate wherein a per-aggregate estimate can be
      provided, overriding the planner's default, and infrastructure for setting
      the column via CREATE AGGREGATE.
      
      It may be that additional smarts will be required in future, perhaps even
      a per-aggregate estimation function.  But this is already a step forward.
      
      This is extracted from a larger patch to improve the performance of numeric
      and int8 aggregates.  I (tgl) thought it was worth reviewing and committing
      this infrastructure separately.  In this commit, all built-in aggregates
      are given aggtransspace = 0, so no behavior should change.
      
      Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
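
      A sketch of the new knob at the SQL level; the aggregate itself is a
      made-up array collector, and only the SSPACE clause matters here:

        CREATE AGGREGATE my_collect (anyelement) (
            SFUNC  = array_append,
            STYPE  = anyarray,
            SSPACE = 8192   -- planner should assume ~8 kB of transition state
        );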
      6cb86143
    • Peter Eisentraut's avatar
      55c3d86a
    • Remove pgbench's hardwired limit on line length in custom script files. · 61a07bae
      Tom Lane authored
      pgbench formerly failed on lines longer than BUFSIZ, unexpectedly
      splitting them into multiple commands.  Allow it to work with any
      length of input line.
      
      Sawada Masahiko
      61a07bae
  10. 15 Nov, 2013 10 commits
    • Fix incorrect loop counts in tidbitmap.c. · f1f21b2d
      Tom Lane authored
      A couple of places that should have been iterating over WORDS_PER_CHUNK
      words were iterating over WORDS_PER_PAGE words instead.  This thinko
      accidentally failed to fail, because (at least on common architectures
      with default BLCKSZ) WORDS_PER_CHUNK is a bit less than WORDS_PER_PAGE,
      and the extra words being looked at were always zero so nothing happened.
      Still, it's a bug waiting to happen if anybody ever fools with the
      parameters affecting TIDBitmap sizes, and it's a small waste of cycles
      too.  So back-patch to all active branches.
      
      Etsuro Fujita
      f1f21b2d
    • Speed up printing of INSERT statements in pg_dump. · 97e1ec46
      Tom Lane authored
      In --inserts and especially --column-inserts mode, we can get a useful
      speedup by generating the common prefix of all a table's INSERT commands
      just once, and then printing the prebuilt string for each row.  This avoids
      multiple invocations of fmtId() and other minor fooling around.
      
      David Rowley
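
      For reference, every row of a given table shares the same prefix in
      --column-inserts output (the table and columns here are hypothetical),
      and that prefix is now built only once per table:

        INSERT INTO tab (id, name) VALUES (1, 'one');
        INSERT INTO tab (id, name) VALUES (2, 'two');
        -- common prefix, built once:  INSERT INTO tab (id, name) VALUES (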
      97e1ec46
    • Clean up password prompting logic in streamutil.c. · 3172eea0
      Tom Lane authored
      The previous coding was fairly unreadable and drew double-free warnings
      from clang.  I believe the double free was actually not reachable, because
      PQconnectionNeedsPassword is coded to not return true if a password was
      provided, so that the loop can't iterate more than twice.  Nonetheless
      it seems worth rewriting.  No back-patch since this is just cosmetic.
      3172eea0
    • Compute correct em_nullable_relids in get_eclass_for_sort_expr(). · f3b3b8d5
      Tom Lane authored
      Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr
      must be able to compute valid em_nullable_relids for any new equivalence
      class members it creates.  I'd worried about this in the commit message
      for db9f0e1d, but claimed that it wasn't a
      problem because multi-member ECs should already exist when it runs.  That
      is transparently wrong, though, because this function is also called by
      initialize_mergeclause_eclasses, which runs during deconstruct_jointree.
      The example given in the bug report (which the new regression test item
      is based upon) fails because the COALESCE() expression is first seen by
      initialize_mergeclause_eclasses rather than process_equivalence.
      
      Fixing this requires passing the appropriate nullable_relids set to
      get_eclass_for_sort_expr, and it requires new code to compute that set
      for top-level expressions such as ORDER BY, GROUP BY, etc.  We store
      the top-level nullable_relids in a new field in PlannerInfo to avoid
      computing it many times.  In the back branches, I've added the new
      field at the end of the struct to minimize ABI breakage for planner
      plugins.  There doesn't seem to be a good alternative to changing
      get_eclass_for_sort_expr's API signature, though.  There probably aren't
      any third-party extensions calling that function directly; moreover,
      if there are, they probably need to think about what to pass for
      nullable_relids anyway.
      
      Back-patch to 9.2, like the previous patch in this area.
      f3b3b8d5
    • Prevent leakage of cached plans and execution trees in plpgsql DO blocks. · c7b849a8
      Tom Lane authored
      plpgsql likes to cache query plans and simple-expression execution state
      trees across calls.  This is a considerable win for multiple executions
      of the same function.  However, it's useless for DO blocks, since by
      definition those are executed only once and discarded.  Nonetheless,
      we were allowing a DO block's expression execution trees to survive
      until end of transaction, resulting in a significant intra-transaction
      memory leak, as reported by Yeb Havinga.  Worse, if the DO block exited
      with an error, the compiled form of the block's code was leaked till
      end of session --- along with subsidiary plancache entries.
      
      To fix, make DO blocks keep their expression execution trees in a private
      EState that's deleted at exit from the block, and add a PG_TRY block
      to plpgsql_inline_handler to make sure that memory cleanup happens
      even on error exits.  Also add a regression test covering error handling
      in a DO block, because my first try at this broke that.  (The test is
      not meant to prove that we don't leak memory anymore, though it could
      be used for that with a much larger loop count.)
      
      Ideally we'd back-patch this into all versions supporting DO blocks;
      but the patch needs to add a field to struct PLpgSQL_execstate, and that
      would break ABI compatibility for third-party plugins such as the plpgsql
      debugger.  Given the small number of complaints so far, fixing this in
      HEAD only seems like an acceptable choice.
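
      A minimal sketch of a DO block with error handling (not the actual
      regression test); its plans and expression execution state are now
      released when the block exits rather than lingering until end of
      transaction:

        DO $$
        BEGIN
            PERFORM 1 / 0;
        EXCEPTION
            WHEN division_by_zero THEN
                RAISE NOTICE 'caught division by zero';
        END
        $$;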
      c7b849a8
    • Minor comment corrections for sequence hashtable patch. · 80e3a470
      Tom Lane authored
      There were enough typos in the comments to annoy me ...
      80e3a470
    • Fix buffer overrun in isolation test program. · 7cb964ac
      Kevin Grittner authored
      Commit 061b88c7 saved argv0 to a
      global buffer without ensuring that it was zero terminated,
      allowing references to it to overrun the buffer and access other
      memory.  This probably would not have presented any security risk,
      but could have resulted in very confusing failures if the path to
      the executable was very long.
      
      Reported by David Rowley
      7cb964ac
    • doc: Restore proper alphabetical order. · 71dd54ad
      Robert Haas authored
      Colin 't Hart
      71dd54ad
    • Fix bogus hash table creation. · 5cb719be
      Heikki Linnakangas authored
      Andres Freund
      5cb719be
    • Use a hash table to store current sequence values. · 21025d4a
      Heikki Linnakangas authored
      This speeds up nextval() and currval(), when you touch a lot of different
      sequences in the same backend.
      
      David Rowley
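
      The usage pattern that benefits, sketched with hypothetical sequences;
      each sequence touched in a backend gets an entry in the new
      backend-local hash table, so repeated lookups stay fast:

        CREATE SEQUENCE seq_a;
        CREATE SEQUENCE seq_b;
        SELECT nextval('seq_a'), nextval('seq_b');
        SELECT currval('seq_a'), currval('seq_b');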
      21025d4a
  11. 14 Nov, 2013 1 commit