Commit 6f922ef8 authored by Tom Lane

Improve efficiency of dblink by using libpq's new row processor API.

This patch provides a test case for libpq's row processor API.
contrib/dblink can deal with very large result sets by dumping them into
a tuplestore (which can spill to disk) --- but until now, the intermediate
storage of the query result in a PGresult meant memory bloat for any large
result.  Now we use a row processor to convert the data to tuple form and
dump it directly into the tuplestore.
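
The difference between the two data paths can be illustrated with a rough Python analogy. This is not dblink's C code and does not use libpq. `tempfile.SpooledTemporaryFile` stands in for the tuplestore (it stays in memory until it exceeds `max_size`, then spills to disk), a generator stands in for rows arriving from the remote server, and every function name here is hypothetical.

```python
import pickle
import tempfile


def remote_rows(n):
    """Stand-in for rows arriving one at a time from a remote query."""
    for i in range(n):
        yield (i, f"row-{i}")


def fetch_buffered(rows, max_size=64 * 1024):
    """Old path: build the whole result in memory, then copy it to the store."""
    result = list(rows)  # plays the role of a fully materialized PGresult
    store = tempfile.SpooledTemporaryFile(max_size=max_size)
    for row in result:
        pickle.dump(row, store)
    return store


def fetch_streaming(rows, max_size=64 * 1024):
    """New path: hand each row to the store as soon as it arrives."""
    store = tempfile.SpooledTemporaryFile(max_size=max_size)
    for row in rows:  # no intermediate list is ever built
        pickle.dump(row, store)
    return store


def read_back(store):
    """Yield the stored rows in insertion order."""
    store.seek(0)
    while True:
        try:
            yield pickle.load(store)
        except EOFError:
            return


if __name__ == "__main__":
    store = fetch_streaming(remote_rows(100_000), max_size=1024)
    print(sum(1 for _ in read_back(store)))
```

Both functions end with the same store contents. Only `fetch_buffered` holds every row in memory at once before writing, which is the bloat the commit message describes.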

A limitation is that this only works for plain dblink() queries, not
dblink_send_query() followed by dblink_get_result().  In the latter
case we don't know the desired tuple rowtype soon enough.  While hack
solutions to that are possible, a different user-level API would
probably be a better answer.

Kyotaro Horiguchi, reviewed by Marko Kreen and Tom Lane
parent 92785dac
@@ -425,14 +425,6 @@ SELECT *
 <refsect1>
 <title>Notes</title>
-<para>
-<function>dblink</> fetches the entire remote query result before
-returning any of it to the local system. If the query is expected
-to return a large number of rows, it's better to open it as a cursor
-with <function>dblink_open</> and then fetch a manageable number
-of rows at a time.
-</para>
 <para>
 A convenient way to use <function>dblink</> with predetermined
 queries is to create a view.
@@ -1432,6 +1424,18 @@ dblink_get_result(text connname [, bool fail_on_error]) returns setof record
 sent, and one additional time to obtain an empty set result,
 before the connection can be used again.
 </para>
+<para>
+When using <function>dblink_send_query</> and
+<function>dblink_get_result</>, <application>dblink</> fetches the entire
+remote query result before returning any of it to the local query
+processor. If the query returns a large number of rows, this can result
+in transient memory bloat in the local session. It may be better to open
+such a query as a cursor with <function>dblink_open</> and then fetch a
+manageable number of rows at a time. Alternatively, use plain
+<function>dblink()</>, which avoids memory bloat by spooling large result
+sets to disk.
+</para>
 </refsect1>
 <refsect1>