Commit 8a504a36 authored by Tom Lane

Make pg_dump emit more accurate dependency information.

While pg_dump has included dependency information in archive-format output
ever since 7.3, it never made any large effort to ensure that that
information was actually useful.  In particular, in common situations where
dependency chains include objects that aren't separately emitted in the
dump, the dependencies shown for objects that were emitted would reference
the dump IDs of these un-dumped objects, leaving no clue about which other
objects the visible objects indirectly depend on.  So far, parallel
pg_restore has managed to avoid tripping over this misfeature, but only
by dint of some crude hacks like not trusting dependency information in
the pre-data section of the archive.

It seems prudent to do something about this before it rises up to bite us,
so instead of emitting the "raw" dependencies of each dumped object,
recursively search for its actual dependencies among the subset of objects
that are being dumped.
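
As an illustration only, the recursive search described above could be sketched as follows. This is a minimal sketch under simplifying assumptions, not the code added by this commit: the `Obj` struct, `collect_dumped_deps()`, `effective_deps()`, and the fixed-size arrays are all hypothetical names invented here, and objects are identified by plain array indexes rather than real dump IDs.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OBJS 16
#define MAX_DEPS 8

/* Hypothetical, simplified object record; not pg_dump's real data structure. */
typedef struct
{
    bool dumped;            /* does this object get its own archive entry? */
    int  nDeps;             /* number of raw dependencies */
    int  deps[MAX_DEPS];    /* raw dependencies, as indexes into objs[] */
} Obj;

/*
 * Append to out[] every dumped object that objs[id] depends on, looking
 * through un-dumped intermediate objects.  visited[] suppresses duplicates
 * and guards against dependency loops.  Returns the new length of out[].
 */
static int
collect_dumped_deps(const Obj *objs, int id, bool *visited, int *out, int nOut)
{
    for (int i = 0; i < objs[id].nDeps; i++)
    {
        int dep = objs[id].deps[i];

        assert(dep >= 0 && dep < MAX_OBJS);
        if (visited[dep])
            continue;
        visited[dep] = true;

        if (objs[dep].dumped)
        {
            /* Visible in the archive: record it and stop descending here. */
            assert(nOut < MAX_OBJS);
            out[nOut++] = dep;
        }
        else
        {
            /* Not emitted: substitute whatever it depends on instead. */
            nOut = collect_dumped_deps(objs, dep, visited, out, nOut);
        }
    }
    return nOut;
}

/* Compute the dependencies to emit for objs[id]; out[] needs MAX_OBJS slots. */
static int
effective_deps(const Obj *objs, int id, int *out)
{
    bool visited[MAX_OBJS] = {false};

    assert(id >= 0 && id < MAX_OBJS);
    visited[id] = true;     /* never report an object as depending on itself */
    return collect_dumped_deps(objs, id, visited, out, 0);
}
```

In this sketch the descent stops at the first dumped object on each path, on the assumption that the dumped object's own entry carries its further dependencies.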

Back-patch to 9.2, since that code hasn't yet diverged materially from
HEAD.  At some point we might need to back-patch further, but right now
there are no known cases where this is actively necessary.  (The one known
case, bug #6699, is fixed in a different way by my previous patch.)  Since
this patch depends on 9.2 changes that made TOC entries be marked before
output commences as to whether they'll be dumped, back-patching further
would require additional surgery; and as of now there's no evidence that
it's worth the risk.
parent a1ef01fe
@@ -3498,9 +3498,14 @@ restore_toc_entries_parallel(ArchiveHandle *AH)
      * Do all the early stuff in a single connection in the parent. There's no
      * great point in running it in parallel, in fact it will actually run
      * faster in a single connection because we avoid all the connection and
-     * setup overhead. Also, pg_dump is not currently very good about showing
-     * all the dependencies of SECTION_PRE_DATA items, so we do not risk
-     * trying to process them out-of-order.
+     * setup overhead. Also, pre-9.2 pg_dump versions were not very good
+     * about showing all the dependencies of SECTION_PRE_DATA items, so we do
+     * not risk trying to process them out-of-order.
+     *
+     * Note: as of 9.2, it should be guaranteed that all PRE_DATA items appear
+     * before DATA items, and all DATA items before POST_DATA items. That is
+     * not certain to be true in older archives, though, so this loop is coded
+     * to not assume it.
      */
     skipped_some = false;
     for (next_work_item = AH->toc->next; next_work_item != AH->toc; next_work_item = next_work_item->next)
@@ -4162,8 +4167,9 @@ fix_dependencies(ArchiveHandle *AH)
     /*
      * Count the incoming dependencies for each item. Also, it is possible
-     * that the dependencies list items that are not in the archive at all.
-     * Subtract such items from the depCounts.
+     * that the dependencies list items that are not in the archive at all
+     * (that should not happen in 9.2 and later, but is highly likely in
+     * older archives). Subtract such items from the depCounts.
      */
     for (te = AH->toc->next; te != AH->toc; te = te->next)
     {
......
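
For readers without the rest of the file at hand, the counting step that the comment in `fix_dependencies` describes could look roughly like the sketch below. It is an assumption-laden illustration, not the function's actual body: the `Entry` struct, `count_dependencies()`, and the `MAX_ID` bound are hypothetical, and the real code walks a linked list of TOC entries rather than an array.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ID 32

/* Hypothetical, simplified stand-in for an archive TOC entry. */
typedef struct
{
    int dumpId;     /* this entry's ID, in the range 1..MAX_ID */
    int nDeps;      /* number of dependencies listed in the archive */
    int deps[4];    /* dump IDs this entry claims to depend on */
    int depCount;   /* dependencies that actually exist in the archive */
} Entry;

/*
 * Set each entry's depCount to the number of its listed dependencies that
 * are present in the archive; references to absent IDs are subtracted.
 */
static void
count_dependencies(Entry *entries, int nEntries)
{
    bool present[MAX_ID + 1] = {false};

    /* First pass: note which dump IDs really appear in the archive. */
    for (int i = 0; i < nEntries; i++)
    {
        assert(entries[i].dumpId >= 1 && entries[i].dumpId <= MAX_ID);
        present[entries[i].dumpId] = true;
    }

    /* Second pass: start from nDeps and subtract dangling references. */
    for (int i = 0; i < nEntries; i++)
    {
        entries[i].depCount = entries[i].nDeps;
        for (int j = 0; j < entries[i].nDeps; j++)
        {
            int dep = entries[i].deps[j];

            if (dep < 1 || dep > MAX_ID || !present[dep])
                entries[i].depCount--;
        }
    }
}
```

An entry whose `depCount` reaches zero in this sketch has nothing left to wait for, which is the property a parallel restore scheduler would rely on.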