Thread: docs: warn about post-data-only schema dumps with parallel restore.

docs: warn about post-data-only schema dumps with parallel restore.

From
vaibhave postgres
Date:
Hi hackers,

Following up on the discussion in [1] about pg_restore failing to restore post-data items due to circular foreign key deadlocks.  

I’m attaching a doc patch that adds a warning about using post-data-only schema dumps together with parallel restore.

Thanks.

Attachments

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
"David G. Johnston"
Date:
On Sun, Jan 25, 2026 at 10:23 PM vaibhave postgres <postgresvaibhave@gmail.com> wrote:
> Hi hackers,
>
> Following up on the discussion in [1] about pg_restore failing to restore post-data items due to circular foreign key deadlocks.
>
> I’m attaching a doc patch that adds a warning about using post-data-only schema dumps together with parallel restore.
>
> Thanks.


The note element would align with the sibling para element.

Not a fan of the patch overall though.  I'd want to add something to pg_restore noting that use of --jobs for constraint restoration needs schema information to compute the restoration order.

There is also a lot of detail here, when something like the following would do:

<para>Consider always combining pre-data and post-data in the same command so that parallel restores have the necessary dependency information to create constraints in parallel.</para>

Any other content related to this probably belongs in the Notes section.

David J.

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Sun, Jan 25, 2026 at 10:23 PM vaibhave postgres <
> postgresvaibhave@gmail.com> wrote:
>> Following up on the discussion in [1] about pg_restore failing to restore
>> post-data items due to circular foreign key deadlocks.
>> I’m attaching a doc patch that adds a warning about using post-data-only
>> schema dumps together with parallel restore.

> Not a fan of the patch overall though.  I'd want to add something to
> pg_restore noting that use of --jobs for constraint restoration needs
> schema information to compute the restoration order.

Yeah, dropping this into the list of options is bad.  We put caveats
like that into the Notes section usually.

I also tend to think that it'd be better to document this under
pg_restore: when people run into this type of failure, they are going
to go to the pg_restore docs not the pg_dump docs to understand it.
I guess there could be a case for repeating the info in both the
pg_dump and pg_restore pages, but that feels a bit verbose.

So maybe like the attached?

            regards, tom lane

diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
index 9d91c365214..5d5dbe9c0d2 100644
--- a/doc/src/sgml/ref/pg_restore.sgml
+++ b/doc/src/sgml/ref/pg_restore.sgml
@@ -1215,6 +1215,21 @@ CREATE DATABASE foo WITH TEMPLATE template0;
      </para>
     </listitem>

+    <listitem>
+     <para>
+      Parallel restore (<option>--jobs</option> greater than 1) requires
+      applying dependency information from the archive file to ensure
+      that an object is not restored before other objects it depends on.
+      This information will be incomplete, leading to unexpected restore
+      failures, if the archive does not include object definitions
+      (the <option>pre-data</option> section).  Therefore, avoid
+      using <application>pg_dump</application> options such
+      as <option>--no-schema</option> or <option>-a/--data-only</option>
+      when creating an archive you wish to restore in parallel.  Instead,
+      provide such options to <application>pg_restore</application>.
+     </para>
+    </listitem>
+
     <listitem>
      <para><application>pg_restore</application> cannot restore large objects
       selectively;  for instance, only those for a specific table.  If
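
The proposed note says that parallel restore relies on dependency information to avoid restoring an object before the objects it depends on. The following is only a toy model of that idea, not pg_restore's actual scheduler: the entry names and the dependency maps are hypothetical, and the "lost" edges in the second map are an assumption made for illustration. It groups entries into batches that could run concurrently and shows how missing edges let a foreign key land in the same batch as the primary key it needs.

```python
from graphlib import TopologicalSorter


def restore_batches(deps):
    """Group entries into batches; entries within one batch could run in parallel.

    `deps` maps each entry to the set of entries it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its known prerequisites done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical archive that includes the table definitions (pre-data).
full_deps = {
    "TABLE customers": set(),
    "TABLE orders": set(),
    "PRIMARY KEY customers_pkey": {"TABLE customers"},
    "FK orders_customer_fkey": {"TABLE orders", "PRIMARY KEY customers_pkey"},
}

# Hypothetical archive without the table entries; in this toy model the
# edges that ran through them are assumed to be lost as well.
partial_deps = {
    "PRIMARY KEY customers_pkey": set(),
    "FK orders_customer_fkey": set(),
}

for label, deps in (("full", full_deps), ("partial", partial_deps)):
    print(label, restore_batches(deps))
```

With the full map the foreign key is scheduled only after the primary key; with the partial map both appear in the first batch, which is the kind of ordering hazard the note warns about.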

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
"David G. Johnston"
Date:
On Sun, Mar 29, 2026 at 11:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Sun, Jan 25, 2026 at 10:23 PM vaibhave postgres <
> > postgresvaibhave@gmail.com> wrote:
> >> Following up on the discussion in [1] about pg_restore failing to restore
> >> post-data items due to circular foreign key deadlocks.
> >> I’m attaching a doc patch that adds a warning about using post-data-only
> >> schema dumps together with parallel restore.
>
> > Not a fan of the patch overall though.  I'd want to add something to
> > pg_restore noting that use of --jobs for constraint restoration needs
> > schema information to compute the restoration order.
>
> Yeah, dropping this into the list of options is bad.  We put caveats
> like that into the Notes section usually.
>
> I also tend to think that it'd be better to document this under
> pg_restore: when people run into this type of failure, they are going
> to go to the pg_restore docs not the pg_dump docs to understand it.
> I guess there could be a case for repeating the info in both the
> pg_dump and pg_restore pages, but that feels a bit verbose.
>
> So maybe like the attached?


Works for me.

But how about adding something like the following to the pg_dump notes?  We already have the corresponding link going to pg_dump in the pg_restore notes.

"If producing a non-plain-text format output see also the pg_restore documentation for details on how the restore process uses the different sections."

David J.

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> But how about adding something like the following to the pg_dump notes?  We
> already have the corresponding link going to pg_dump in the pg_restore
> notes.

> "If producing a non-plain-text format output see also the pg_restore
> documentation for details on how the restore process uses the different
> sections."

Hmm, I think we could be a bit more definite than that.  What do you
think of this advice:

  <para>
   When creating an archive (non-text) output file, it is advisable not to
   restrict the set of database objects dumped, but instead plan to apply
   any desired object filtering when reading the archive
   with <application>pg_restore</application>.  This will preserve
   flexibility and possibly avoid problems at restore time; for details
   see the <xref linkend="app-pgrestore"/> documentation.  However,
   omitting table data (<option>--no-data</option>) or large objects
   (<option>--no-large-objects</option>) does not have any surprising
   consequences.
  </para>

            regards, tom lane
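
As a rough illustration of the advice above (dump everything, filter at restore time), here is a minimal sketch that only builds the command lines and prints them; it does not run anything. The database names, the archive path, and the helper functions `dump_command` and `restore_command` are hypothetical. The flags used (`--format=directory`, `--file`, `--jobs`, `--dbname`, `--data-only`, `--section=post-data`) are standard pg_dump and pg_restore options.

```python
import shlex


def dump_command(dbname, archive_dir):
    """Build a pg_dump call that writes an unfiltered directory-format archive."""
    return ["pg_dump", "--format=directory", f"--file={archive_dir}", dbname]


def restore_command(dbname, archive_dir, jobs, filters=()):
    """Build a pg_restore call; any object filtering is applied here, not at dump time."""
    return [
        "pg_restore",
        f"--jobs={jobs}",
        f"--dbname={dbname}",
        *filters,
        archive_dir,
    ]


# Hypothetical names, for illustration only.
archive = "/tmp/full.dump"

commands = [
    # 1. Dump schema and data together so the archive keeps its dependency info.
    dump_command("sourcedb", archive),
    # 2. Restore only the table data, in parallel, from the full archive.
    restore_command("targetdb", archive, jobs=4, filters=["--data-only"]),
    # 3. Or restore only the post-data section (indexes, constraints, ...).
    restore_command("targetdb", archive, jobs=4, filters=["--section=post-data"]),
]

for cmd in commands:
    print(shlex.join(cmd))
```

The point of the design is that the filtering options appear only on the pg_restore side, so the archive itself always contains the object definitions.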



Re: docs: warn about post-data-only schema dumps with parallel restore.

From
"David G. Johnston"
Date:
On Tuesday, March 31, 2026, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > But how about adding something like the following to the pg_dump notes?  We
> > already have the corresponding link going to pg_dump in the pg_restore
> > notes.
>
> > "If producing a non-plain-text format output see also the pg_restore
> > documentation for details on how the restore process uses the different
> > sections."
>
> Hmm, I think we could be a bit more definite than that.  What do you
> think of this advice:
>
>   <para>
>    When creating an archive (non-text) output file, it is advisable not to
>    restrict the set of database objects dumped, but instead plan to apply
>    any desired object filtering when reading the archive
>    with <application>pg_restore</application>.  This will preserve
>    flexibility and possibly avoid problems at restore time; for details
>    see the <xref linkend="app-pgrestore"/> documentation.  However,
>    omitting table data (<option>--no-data</option>) or large objects
>    (<option>--no-large-objects</option>) does not have any surprising
>    consequences.
>   </para>


I’m against including that final sentence.  The rest seems ok but I’d suggest going with an explicit mention that “--no-schema is risky” (or otherwise omitting the entire section).

I have a nagging suspicion we could be a bit more precise; e.g., it’s advisable to include the schema objects for the data that is being exported only, not the entire schema always.  But we already mention that dealing in subsets can introduce dependency issues so people have already been given the alert there.  But data/no-schema seems like it should just work and this just needs to warn them that it may not.

David J.

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tuesday, March 31, 2026, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, I think we could be a bit more definite than that.  What do you
>> think of this advice:
>> 
>> <para>
>> When creating an archive (non-text) output file, it is advisable not to
>> restrict the set of database objects dumped, but instead plan to apply
>> any desired object filtering when reading the archive
>> with <application>pg_restore</application>.  This will preserve
>> flexibility and possibly avoid problems at restore time; for details
>> see the <xref linkend="app-pgrestore"/> documentation.  However,
>> omitting table data (<option>--no-data</option>) or large objects
>> (<option>--no-large-objects</option>) does not have any surprising
>> consequences.
>> </para>

> I’m against including that final sentence.  The rest seems ok but I’d
> suggest going with an explicit mention that “--no-schema is risky” (or
> otherwise omitting the entire section)

How about replacing that sentence with "In particular, dumping table
data without the corresponding table definition (via --no-schema and
related options) is not recommended."

            regards, tom lane



Re: docs: warn about post-data-only schema dumps with parallel restore.

From
"David G. Johnston"
Date:
On Tuesday, March 31, 2026, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Tuesday, March 31, 2026, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Hmm, I think we could be a bit more definite than that.  What do you
> >> think of this advice:
> >>
> >> <para>
> >> When creating an archive (non-text) output file, it is advisable not to
> >> restrict the set of database objects dumped, but instead plan to apply
> >> any desired object filtering when reading the archive
> >> with <application>pg_restore</application>.  This will preserve
> >> flexibility and possibly avoid problems at restore time; for details
> >> see the <xref linkend="app-pgrestore"/> documentation.  However,
> >> omitting table data (<option>--no-data</option>) or large objects
> >> (<option>--no-large-objects</option>) does not have any surprising
> >> consequences.
> >> </para>
>
> > I’m against including that final sentence.  The rest seems ok but I’d
> > suggest going with an explicit mention that “--no-schema is risky” (or
> > otherwise omitting the entire section)
>
> How about replacing that sentence with "In particular, dumping table
> data without the corresponding table definition (via --no-schema and
> related options) is not recommended."


That should work.

David J.
 

Re: docs: warn about post-data-only schema dumps with parallel restore.

From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tuesday, March 31, 2026, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> How about replacing that sentence with "In particular, dumping table
>> data without the corresponding table definition (via --no-schema and
>> related options) is not recommended."

> That should work.

Done that way, then.

            regards, tom lane