1. 04 Apr, 2012 4 commits
    • Add a "row processor" API to libpq for better handling of large results. · 92785dac
      Tom Lane authored
      Traditionally libpq has collected an entire query result before passing
      it back to the application.  That provides a simple and transactional API,
      but it's pretty inefficient for large result sets.  This patch allows the
      application to process each row on-the-fly instead of accumulating the
      rows into the PGresult.  Error recovery becomes a bit more complex, but
      often that tradeoff is well worth making.
      
      Kyotaro Horiguchi, reviewed by Marko Kreen and Tom Lane
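The idea of handling rows on the fly, rather than accumulating them first, can be sketched in a few lines of C. This is a toy illustration only: `row_callback`, `stream_rows` and `sum_second_column` are hypothetical names made up for this sketch, not the libpq API added by the commit, and the "result stream" is just an in-memory array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callback type: return 0 to keep going, nonzero to stop early. */
typedef int (*row_callback)(const char *const *cols, int ncols, void *arg);

/*
 * Stand-in for a result stream. Each row is handed to the callback as soon
 * as it is "received" and is never stored. Returns the number of rows
 * delivered, or -1 if the callback asked to stop.
 */
static int
stream_rows(const char *const rows[][2], int nrows, row_callback cb, void *arg)
{
    for (int i = 0; i < nrows; i++)
    {
        if (cb(rows[i], 2, arg) != 0)
            return -1;          /* rows before i have already been consumed */
    }
    return nrows;
}

/* Example callback: add up the second column without keeping any row. */
static int
sum_second_column(const char *const *cols, int ncols, void *arg)
{
    assert(ncols >= 2);
    *(long *) arg += strtol(cols[1], NULL, 10);
    return 0;
}
```

Calling `stream_rows(data, n, sum_second_column, &total)` keeps memory use constant regardless of the row count. The early-return path hints at why error recovery gets harder: by the time a failure is noticed, earlier rows have already been acted upon.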
    • Remove useless PGRES_COPY_BOTH "support" in psql. · cb917e15
      Tom Lane authored
      There is no existing or foreseeable case in which psql should see a
      PGRES_COPY_BOTH PQresultStatus; and if such a case ever emerges, it's a
      pretty good bet that these code fragments wouldn't do the right thing
      anyway.  Remove them, and let the existing default cases do the appropriate
      thing, namely emit an "unexpected PQresultStatus" bleat.
      
      Noted while working on libpq row processor patch, for which I was
      considering adding a PGRES_SUSPENDED status code --- the same default-case
      treatment would be appropriate for that.
    • Fix syslogger to not lose log coherency under high load. · c17e863b
      Tom Lane authored
      The original coding of the syslogger had an arbitrary limit of 20 large
      messages concurrently in progress, after which it would just punt and dump
      message fragments to the output file separately.  Our ambitions are a bit
      higher than that now, so allow the data structure to expand as necessary.
      
      Reported and patched by Andrew Dunstan; some editing by Tom
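A minimal sketch of replacing a fixed-size slot table with one that expands on demand. It uses a plain doubling array, which is an assumption chosen for brevity and not necessarily the data structure the syslogger uses; `slot`, `slot_table` and `slot_for_pid` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* One in-progress multi-chunk message, keyed by the sending process. */
typedef struct
{
    int pid;
} slot;

typedef struct
{
    slot   *items;
    size_t  used;
    size_t  cap;
} slot_table;

/*
 * Find the slot for pid, creating one if needed. Instead of giving up when
 * the table is full, grow it (20 entries at first, then doubling).
 * Returns NULL only if memory allocation fails.
 */
static slot *
slot_for_pid(slot_table *t, int pid)
{
    for (size_t i = 0; i < t->used; i++)
    {
        if (t->items[i].pid == pid)
            return &t->items[i];
    }

    if (t->used == t->cap)
    {
        size_t  new_cap = (t->cap != 0) ? t->cap * 2 : 20;
        slot   *grown = realloc(t->items, new_cap * sizeof(slot));

        if (grown == NULL)
            return NULL;
        t->items = grown;
        t->cap = new_cap;
    }

    t->items[t->used].pid = pid;
    return &t->items[t->used++];
}
```

Note that a returned pointer is only valid until the next call that grows the table, since `realloc` may move the array.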
    • Fix a couple of contrib/dblink bugs. · d843ed21
      Tom Lane authored
      dblink_exec leaked temporary database connections if any error occurred
      after connection setup, for example
      	SELECT dblink_exec('...connect string...', 'select 1/0');
      Add a PG_TRY block to ensure PQfinish gets done when it is needed.
      (dblink_record_internal is on the hairy edge of needing similar treatment,
      but seems not to be actively broken at the moment.)
      
      Also, in 9.0 and up, only one of the three functions using tuplestore
      return mode was properly checking that the query context would allow
      a tuplestore result.
      
      Noted while reviewing dblink patch.  Back-patch to all supported branches.
  2. 03 Apr, 2012 2 commits
  3. 01 Apr, 2012 2 commits
  4. 31 Mar, 2012 5 commits
    • Fix O(N^2) behavior in pg_dump when many objects are in dependency loops. · d5881c03
      Tom Lane authored
      Combining the loop workspace with the record of already-processed objects
      might have been a cute trick, but it behaves horridly if there are many
      dependency loops to repair: the time spent in the first step of findLoop()
      grows as O(N^2).  Instead use a separate flag array indexed by dump ID,
      which we can check in constant time.  The length of the workspace array
      is now never more than the actual length of a dependency chain, which
      should be reasonably short in all cases of practical interest.  The code
      is noticeably easier to understand this way, too.
      
      Per gripe from Mike Roest.  Since this is a longstanding performance bug,
      backpatch to all supported versions.
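The "separate flag array indexed by dump ID" can be sketched as follows. This is an illustrative stand-alone version, not pg_dump's code: `processed_set` and its helper functions are hypothetical names, and dump IDs are assumed to be small non-negative integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One flag per dump ID; processed[id] is true once the object is handled. */
typedef struct
{
    bool *processed;
    int   max_id;
} processed_set;

/* Allocate a zeroed flag array covering IDs 0..max_id. */
static bool
processed_init(processed_set *s, int max_id)
{
    s->processed = calloc((size_t) max_id + 1, sizeof(bool));
    s->max_id = max_id;
    return s->processed != NULL;
}

/* O(1): record that an object has been processed. */
static void
mark_processed(processed_set *s, int id)
{
    assert(id >= 0 && id <= s->max_id);
    s->processed[id] = true;
}

/* O(1): membership test, replacing a linear scan of a workspace array. */
static bool
is_processed(const processed_set *s, int id)
{
    assert(id >= 0 && id <= s->max_id);
    return s->processed[id];
}
```

Because the lookup no longer scans a list that grows with each repaired loop, the per-check cost stays constant however many objects have been processed.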
    • Fix O(N^2) behavior in pg_dump for large numbers of owned sequences. · 0d8117ab
      Tom Lane authored
      The loop that matched owned sequences to their owning tables required time
      proportional to the number of owned sequences times the number of tables,
      although this work was only expended in selective-dump situations, which is
      probably why the issue wasn't recognized long ago.  Refactor slightly so
      that we can perform this work after the index array for findTableByOid has
      been set up, reducing the time to O(M log N) (M owned sequences, N tables).
      
      Per gripe from Mike Roest.  Since this is a longstanding performance bug,
      backpatch to all supported versions.
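A sketch of why a sorted index turns the nested loop into O(M log N): each of the M sequences does one binary search over N tables instead of a linear scan. The struct and function below are hypothetical stand-ins and are not pg_dump's findTableByOid; the array is assumed to be sorted by OID already.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    unsigned int oid;
    const char  *name;
} table_info;

/* Comparator on OID, usable with both qsort() and bsearch(). */
static int
cmp_table_oid(const void *a, const void *b)
{
    const table_info *x = a;
    const table_info *y = b;

    return (x->oid > y->oid) - (x->oid < y->oid);
}

/* O(log N) lookup in an array sorted by OID; returns NULL if absent. */
static const table_info *
find_table_by_oid(const table_info *sorted, size_t n, unsigned int oid)
{
    table_info key = {oid, NULL};

    return bsearch(&key, sorted, n, sizeof(*sorted), cmp_table_oid);
}
```

Matching every owned sequence to its table is then a single loop that calls `find_table_by_oid` once per sequence.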
    • Rename frontend keyword arrays to avoid conflict with backend. · c252a17d
      Tom Lane authored
      ecpg and pg_dump each contain keyword arrays with structure similar
      to the backend's keyword array.  Up to now, we actually named those
      arrays the same as the backend's and relied on parser/keywords.h
      to declare them.  This seems a tad too cute, though, and it breaks
      now that we need to PGDLLIMPORT-decorate the backend symbols.
      Rename to avoid the problem.  Per buildfarm.
      
      (It strikes me that maybe we should get rid of the separate keywords.c
      files altogether, and just define these arrays in the modules that use
      them, but that's a rather more invasive change.)
    • Fix glitch recently introduced in psql tab completion. · a52e6fe7
      Tom Lane authored
      Over-optimization (by me, looks like :-() broke the case of recognizing
      a word boundary just before a quoted identifier.  Reported and diagnosed
      by Dean Rasheed.
    • Add PGDLLIMPORT to ScanKeywords and NumScanKeywords. · 5e83854d
      Tom Lane authored
      Per buildfarm, this is now needed by contrib/pg_stat_statements.
  5. 30 Mar, 2012 4 commits
  6. 29 Mar, 2012 9 commits
    • Fix dblink's failure to report correct connection name in error messages. · b75fbe91
      Tom Lane authored
      The DBLINK_GET_CONN and DBLINK_GET_NAMED_CONN macros did not set the
      surrounding function's conname variable, causing errors to be incorrectly
      reported as having occurred on the "unnamed" connection in some cases.
      This bug was actually visible in two cases in the regression tests,
      but apparently whoever added those cases wasn't paying attention.
      
      Noted by Kyotaro Horiguchi, though this is different from his proposed
      patch.
      
      Back-patch to 8.4; 8.3 does not have the same type of error reporting
      so the patch is not relevant.
    • Improve contrib/pg_stat_statements' handling of PREPARE/EXECUTE statements. · 566a1d43
      Tom Lane authored
      It's actually more useful for the module to ignore these.  Ignoring
      EXECUTE (and not incrementing the nesting level) allows the executor
      hooks to charge the time to the underlying prepared query, which
      shows up as a stats entry with the original PREPARE as query string
      (possibly modified by suppression of constants, which might not be
      terribly useful here but it's not worth avoiding).  This is much more
      useful than cluttering the stats table with a distinct entry for each
      textually distinct EXECUTE.
      
      Experimentation with this idea shows that it's also preferable to ignore
      PREPARE.  If we don't, we get two stats table entries, one with the query
      string hash and one with the jumble-derived hash, but with the same visible
      query string (modulo those constants).  This is confusing and not very
      helpful, since the first entry will only receive costs associated with
      initial planning of the query, which is not something counted at all
      normally by pg_stat_statements.  (And if we do start tracking planning
      costs, we'd want them blamed on the other hash table entry anyway.)
    • Improve handling of utility statements containing plannable statements. · e0e4ebe3
      Tom Lane authored
      When tracking nested statements, contrib/pg_stat_statements formerly
      double-counted the execution costs of utility statements that directly
      contain an executable statement, such as EXPLAIN and DECLARE CURSOR.
      This was not obvious since the ProcessUtility and Executor hooks
      would each add their measured costs to the same stats table entry.
      However, with the new implementation that hashes utility and plannable
      statements differently, this showed up as seemingly-duplicate stats
      entries.  Fix that by disabling the Executor hooks when the query has a
      queryId of zero, which was the case already for such statements but is now
      more clearly specified in the code.  (The zero queryId was causing problems
      anyway because all such statements would add to a single bogus entry.)
      
      The PREPARE/EXECUTE case still results in counting the same execution
      in two different stats table entries, but it should be much less surprising
      to users that there are two entries in such cases.
      
      In passing, include a CommonTableExpr's ctename in the query hash.
      I had left it out originally on the grounds that we wanted to omit all
      inessential aliases, but since RTE_CTE RTEs are hashing their referenced
      names, we'd better hash the CTE names too to make sure we don't hash
      semantically different queries the same.
    • initdb: Mark more messages for translation · 2005b77b
      Peter Eisentraut authored
      Some Windows-only messages had apparently been forgotten so far.
      
      Also make the wording of the messages more consistent with similar
      messages in other parts, such as pg_ctl and pg_regress.
    • Correct epoch of txid_current() when executed on a Hot Standby server. · 68219aaf
      Simon Riggs authored
      Initialise ckptXidEpoch from starting checkpoint and maintain the correct
      value as we roll forwards. This allows GetNextXidAndEpoch() to return the
      correct epoch when executed during recovery. Backpatch to 9.0, where the
      problem is first observable by a user.
      
      Bug report from Daniel Farina
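To see why the epoch matters, recall that the value reported by txid_current() is a 64-bit number built from the 32-bit transaction ID extended with an epoch counter. The helper below is a hypothetical illustration of that composition, not the backend's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine a wraparound epoch and a 32-bit xid into a 64-bit value:
 * the epoch occupies the high 32 bits, the xid the low 32 bits.
 */
static uint64_t
make_txid(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}
```

With xid 100, epoch 0 gives 100 while epoch 1 gives 4294967396, so a standby that tracks the wrong epoch would report a value off by a multiple of 2^32.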
    • Andrew Dunstan's avatar
      aeca6502
    • Inherit max_safe_fds to child processes in EXEC_BACKEND mode. · 5762a4d9
      Heikki Linnakangas authored
      Postmaster sets max_safe_fds by testing how many file descriptors it
      can open, and that is normally inherited by all child processes at fork().
      Not so on EXEC_BACKEND, i.e. Windows, however. Because of that, we
      effectively ignored max_files_per_process on Windows, and always assumed
      a conservative default of 32 simultaneous open files. That could have an
      impact on performance, if you need to access a lot of different files
      in a query. After this patch, the value is passed to child processes by
      save/restore_backend_variables() among many other global variables.
      
      It has been like this forever, but given the lack of complaints about it,
      I'm not backpatching this.
    • Remove now redundant pgpipe code. · d2c1740d
      Andrew Dunstan authored
    • Improve contrib/pg_stat_statements to lump "similar" queries together. · 7313cc01
      Tom Lane authored
      pg_stat_statements now hashes selected fields of the analyzed parse tree
      to assign a "fingerprint" to each query, and groups all queries with the
      same fingerprint into a single entry in the pg_stat_statements view.
      In practice it is expected that queries with the same fingerprint will be
      equivalent except for values of literal constants.  To make the display
      more useful, such constants are replaced by "?" in the displayed query
      strings.
      
      This mechanism currently supports only optimizable queries (SELECT,
      INSERT, UPDATE, DELETE).  Utility commands are still matched on the
      basis of their literal query strings.
      
      There remain some open questions about how to deal with utility statements
      that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how
      to deal with expiring speculative hashtable entries that are made to save
      the normalized form of a query string.  However, fixing these issues should
      require only localized changes, and since there are other open patches
      involving contrib/pg_stat_statements, it seems best to go ahead and commit
      what we've got.
      
      Peter Geoghegan, reviewed by Daniel Farina
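The visible effect, queries that differ only in literal constants collapsing into one entry with "?" placeholders, can be imitated with a toy text scanner. This is a deliberately simplified stand-in: the commit hashes fields of the analyzed parse tree, whereas the hypothetical `normalize_literals` and `same_shape` below only scan the query text for numeric and single-quoted literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy src to dst, replacing each numeric or single-quoted literal with '?'.
 * dst must hold at least strlen(src) + 1 bytes; the output never grows.
 */
static void
normalize_literals(const char *src, char *dst)
{
    size_t i = 0, o = 0;

    while (src[i] != '\0')
    {
        unsigned char c = (unsigned char) src[i];
        int in_word = i > 0 &&
            (isalnum((unsigned char) src[i - 1]) || src[i - 1] == '_');

        if (c == '\'')
        {
            i++;                            /* skip opening quote */
            while (src[i] != '\0')
            {
                if (src[i] == '\'' && src[i + 1] == '\'')
                    i += 2;                 /* doubled quote inside literal */
                else if (src[i++] == '\'')
                    break;                  /* closing quote consumed */
            }
            dst[o++] = '?';
        }
        else if (isdigit(c) && !in_word)
        {
            /* digits that start a token, e.g. 42 or 1.5 but not the 1 in t1 */
            while (isdigit((unsigned char) src[i]) || src[i] == '.')
                i++;
            dst[o++] = '?';
        }
        else
            dst[o++] = src[i++];
    }
    dst[o] = '\0';
}

/* Two queries "look the same" if their normalized texts are identical. */
static int
same_shape(const char *a, const char *b)
{
    char *na = malloc(strlen(a) + 1);
    char *nb = malloc(strlen(b) + 1);
    int   same = 0;

    if (na != NULL && nb != NULL)
    {
        normalize_literals(a, na);
        normalize_literals(b, nb);
        same = (strcmp(na, nb) == 0);
    }
    free(na);
    free(nb);
    return same;
}
```

A text-level approach like this would treat queries that differ only in whitespace or alias spelling as distinct, which is one reason fingerprinting the parse tree is the more robust design.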
  7. 28 Mar, 2012 6 commits
  8. 27 Mar, 2012 7 commits
    • pg_test_timing utility, to measure clock monotonicity and timing cost. · cee52386
      Robert Haas authored
      Ants Aasma, Greg Smith
    • Expose track_iotiming information via pg_stat_statements. · 5b4f3466
      Robert Haas authored
      Ants Aasma, reviewed by Greg Smith, with very minor tweaks by me.
    • Bend parse location rules for the convenience of pg_stat_statements. · 5d3fcc4c
      Tom Lane authored
      Generally, the parse location assigned to a multiple-token construct is
      the location of its leftmost token.  This commit breaks that rule for
      the syntaxes TYPENAME 'LITERAL' and CAST(CONSTANT AS TYPENAME) --- the
      resulting Const will have the location of the literal string, not the
      typename or CAST keyword.  The cases where this matters are pretty thin on
      the ground (no error messages in the regression tests change, for example),
      and it's unlikely that any user would be confused anyway by an error cursor
      pointing at the literal.  But still it's less than consistent.  The reason
      for changing it is that contrib/pg_stat_statements wants to know the parse
      location of the original literal, and it was agreed that this is the least
      unpleasant way to preserve that information through parse analysis.
      
      Peter Geoghegan
    • Add some infrastructure for contrib/pg_stat_statements. · a40fa613
      Tom Lane authored
      Add a queryId field to Query and PlannedStmt.  This is not used by the
      core backend, except for being copied around at appropriate times.
      It's meant to allow plug-ins to track a particular query forward from
      parse analysis to execution.
      
      The queryId is intentionally not dumped into stored rules (and hence this
      commit doesn't bump catversion).  You could argue that choice either way,
      but it seems better that stored rule strings not have any dependency
      on plug-ins that might or might not be present.
      
      Also, add a post_parse_analyze_hook that gets invoked at the end of
      parse analysis (but only for top-level analysis of complete queries,
      not cases such as analyzing a domain's default-value expression).
      This is mainly meant to be used to compute and assign a queryId,
      but it could have other applications.
      
      Peter Geoghegan
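The hook-plus-identifier arrangement can be illustrated with a self-contained toy. Everything here is hypothetical (`toy_query`, `toy_analyze_hook`, `toy_parse_analyze`, `toy_assign_query_id`); it mimics the shape of a post-analysis hook that assigns a query ID, using an FNV-1a hash of the query text as a placeholder for whatever a real plug-in would compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t    query_id;   /* 0 means "no plug-in assigned an ID" */
    const char *text;
} toy_query;

/* Hook variable: NULL by default, set by a plug-in when it is loaded. */
typedef void (*toy_analyze_hook_type)(toy_query *query);
static toy_analyze_hook_type toy_analyze_hook = NULL;

/* Stand-in for parse analysis: the core only initializes and carries the ID. */
static void
toy_parse_analyze(toy_query *query, const char *text)
{
    query->query_id = 0;
    query->text = text;
    if (toy_analyze_hook != NULL)
        toy_analyze_hook(query);    /* invoked at the end of analysis */
}

/* Example plug-in hook: derive a nonzero ID from the text (FNV-1a, 32 bit). */
static void
toy_assign_query_id(toy_query *query)
{
    uint32_t h = 2166136261u;

    assert(query->text != NULL);
    for (const char *p = query->text; *p != '\0'; p++)
    {
        h ^= (unsigned char) *p;
        h *= 16777619u;
    }
    query->query_id = (h != 0) ? h : 1;
}
```

Installing the hook is a single assignment, `toy_analyze_hook = toy_assign_query_id;`, and with no plug-in loaded the ID simply stays zero, so the core never depends on a plug-in being present.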
    • New GUC, track_iotiming, to track I/O timings. · 40b9b957
      Robert Haas authored
      Currently, the only way to see the numbers this gathers is via
      EXPLAIN (ANALYZE, BUFFERS), but the plan is to add visibility through
      the stats collector and pg_stat_statements in subsequent patches.
      
      Ants Aasma, reviewed by Greg Smith, with some further changes by me.
    • Tom Lane's avatar
      98316e21
    • Peter Eisentraut's avatar
      dd024c22
  9. 26 Mar, 2012 1 commit