Re: Parallel pg_dump's error reporting doesn't work worth squat

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Parallel pg_dump's error reporting doesn't work worth squat
Дата
Msg-id 20160525.130137.175160263.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Parallel pg_dump's error reporting doesn't work worth squat  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Parallel pg_dump's error reporting doesn't work worth squat  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hello,

# Note for convenience for others: The commit fixing the original
# issue is 1aa41e3eae3746e05d0e23286ac740a9a6cee7df.

At Mon, 23 May 2016 13:40:30 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <1336.1464025230@sss.pgh.pa.us>
> I wrote:
> >> ... the pg_dump process has crashed with a SIGPIPE without printing
> >> any message whatsoever, and without coming anywhere near finishing the
> >> dump.
> 
> > Attached is a proposed patch for this.  It reverts exit_horribly() to
> > what it used to be pre-9.3, ie just print the message on stderr and die.
> > The master now notices child failure by observing EOF on the status pipe.
> > While that works automatically on Unix, we have to explicitly close the
> > child side of the pipe on Windows (could someone check that that works?).
> > I also fixed the parent side to ignore SIGPIPE earlier, so that it won't
> > just die if it was in the midst of sending to the child.
> 
> Now that we're all back from PGCon ... does anyone wish to review this?
> Or have an objection to treating it as a bug fix and patching all branches
> back to 9.3?

FWIW, it seems to me a bug which should be fixed to back
branches.

I tried this only on Linux (I'll try it on Windows afterward) and
got something like this.

pg_dump: [archiver (db)] pg_dump: [archiver (db)] connection to database "postgres" failed: fe_sendauth: no password
supplied
connection to database "postgres" failed: fe_sendauth: no password supplied
pg_dump: [archiver (db)] connection to database "postgres" failed: fe_sendauth: no password supplied
<some repeats>
pg_dump: [parallel archiver] a worker process died unexpectedly
pg_dump: [archiver (db)] connection to database "postgres" failed: fe_sendauth: no password supplied
<some repeats>


Although the error messages from multiple children are cluttered,
it would be inevitable and better than vanishing.

It seems hard to distinguish the meaning among the module names
enclosed by '[]' but it is another issue.


In archive_close_connection, the following change means that
si->AHX could be NULL there, as the existing code for
non-parallel does. But it seems to be set once for
si(=shutdown_info)'s lifetime, before reaching there, to a valid
value.

-            DisconnectDatabase(si->AHX);
+            if (si->AHX)
+                DisconnectDatabase(si->AHX);

Shouldn't archive_close_connection have an assertion (si->AHX !=
NULL) instead of checking "if (si->AHX)" in each path?

> > BTW, after having spent the afternoon staring at it, I will assert with
> > confidence that pg_dump/parallel.c is an embarrassment to the project.
> 
> I've done a bit of work on a cosmetic cleanup patch, but don't have
> anything to show yet.
> 
>             regards, tom lane

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Error during restore - dump taken with pg_dumpall -c option
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Parallel pg_dump's error reporting doesn't work worth squat