Re: libpq copy error handling busted

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: libpq copy error handling busted
Дата
Msg-id 1153694.1591224088@sss.pgh.pa.us
обсуждение исходный текст
Ответ на libpq copy error handling busted  (Andres Freund <andres@anarazel.de>)
Ответы Re: libpq copy error handling busted  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: libpq copy error handling busted  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: libpq copy error handling busted  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> When libpq is used to COPY data to the server, it doesn't properly
> handle errors.
> This is partially an old problem, and partially got recently
> worse. Before the below commit we detected terminated connections, but
> we didn't handle copy failing.

Yeah.  After poking at that for a little bit, there are at least three
problems:

* pqSendSome() is responsible not only for pushing out data, but for
calling pqReadData in any situation where it can't get rid of the data
promptly.  1f39a1c06 overlooked that requirement, and the upshot is
that we don't necessarily notice that the connection is broken (it's
pqReadData's job to detect that).  Putting a pqReadData call into
the early-exit path helps, but doesn't fix things completely.

* The more longstanding problem is that the PQputCopyData code path
doesn't have any mechanism for consuming an 'E' (error) message
once pqReadData has collected it.  AFAICS that's ancient.  (It does
not affect the behavior of this case if you use an immediate-mode
shutdown, because then the backend never issues an 'E' message;
but it does matter in a fast-mode shutdown.)  I think that the
idea was to let the client dump all its copy data and then report
the error message when PQendCopy is called, but as you say, that's
none too friendly when there's gigabytes of data involved.  I'm
not sure we can do anything about this without effectively changing
the client API for copy errors, though.

* As for control-C not getting out of it: there is

        if (CancelRequested)
            break;

in pgbench's loop, but this does nothing in this scenario because
fe-utils/cancel.c only sets that flag when it successfully sends a
Cancel ... which it certainly cannot if the postmaster is gone.
I suspect that this may be relatively recent breakage.  It doesn't look
too well thought out, in any case.  The places that are testing this
flag look like they'd rather not be bothered with the fine point of
whether the cancel request actually went anywhere.  (And aside from this
issue, I see no mechanism for that flag to become unset once it's set.
Current users of cancel.c probably don't care, but we'd have noticed if
we tried to make psql use it.)

            regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical ()at walsender.c:2762
Следующее
От: Soumyadeep Chakraborty
Дата:
Сообщение: Re: Parallel Seq Scan vs kernel read ahead