Re: termination of backend waiting for sync rep generates a junk log message

From Tom Lane
Subject Re: termination of backend waiting for sync rep generates a junk log message
Date
Msg-id 27332.1319398399@sss.pgh.pa.us
In reply to Re: termination of backend waiting for sync rep generates a junk log message  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 18, 2011 at 11:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One thing worth asking is why we're willing to violate half a dozen
>> different coding rules if we see ProcDiePending, yet we're perfectly
>> happy to rely on the client understanding a WARNING for the
>> QueryCancelPending case.  Another is whether this whole function isn't
>> complete BS in the first place, since it appears to be coded on the
>> obviously-false assumption that nothing it calls can throw elog(ERROR)
>> --- and of course, if any of those functions do throw ERROR, all the
>> argumentation here goes out the window.

> Well, there is a general problem that anything which throws an ERROR
> too late in the commit path is Evil; and sync rep makes that worse to
> the extent that it adds more stuff late in the commit path, but it
> didn't invent the problem.  What it did do is add stuff late in the
> commit path that can block for a potentially unbounded period of time,
> and I don't see that there are any solutions to that problem that
> aren't somewhat grotty.

After further reflection, you're right that all sync rep is really doing
is extending the time duration of the interval wherein clients will have
a hard time telling whether the commit occurred or not.  It's always
been the case that if a cancel/die interrupt occurs during
CommitTransaction, that will get serviced at the RESUME_INTERRUPTS call
at the end, and the client will see an apparent failure even though the
transaction was committed.  Even without that, an interrupt occurring
just after this code sequence, but before we reach the point of sending
a command-complete response message, is going to result in client
confusion, and there's very little we can do about that.
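
To make the timing concrete, here is a minimal toy model of that window. It is a sketch under stated assumptions, not PostgreSQL's code: the ToyBackend struct and the toy_* functions are hypothetical stand-ins for the backend state and for the HOLD_INTERRUPTS / RESUME_INTERRUPTS / interrupt-check pattern. It only illustrates that an interrupt arriving mid-commit is deferred, and is serviced after the commit has already happened.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL's own. */
typedef struct {
    int  holdoff_count;      /* > 0 means interrupts are deferred */
    bool interrupt_pending;  /* set when a cancel/die request arrives */
    bool committed;          /* the transaction has committed */
    bool client_saw_error;   /* the client was told the command failed */
} ToyBackend;

static void toy_hold_interrupts(ToyBackend *b)
{
    b->holdoff_count++;
}

static void toy_resume_interrupts(ToyBackend *b)
{
    assert(b->holdoff_count > 0);   /* resume must pair with a hold */
    b->holdoff_count--;
}

/* Service a pending interrupt unless it is deferred; true if serviced. */
static bool toy_check_for_interrupts(ToyBackend *b)
{
    if (!b->interrupt_pending || b->holdoff_count > 0)
        return false;
    b->interrupt_pending = false;
    b->client_saw_error = true;     /* client sees an apparent failure */
    return true;
}

/* Commit, optionally with an interrupt arriving in the middle. */
static void toy_commit(ToyBackend *b, bool interrupt_arrives_mid_commit)
{
    toy_hold_interrupts(b);
    b->committed = true;            /* the point of no return */
    if (interrupt_arrives_mid_commit)
        b->interrupt_pending = true;
    (void) toy_check_for_interrupts(b);   /* deferred: holdoff is active */
    toy_resume_interrupts(b);
    (void) toy_check_for_interrupts(b);   /* first chance after the holdoff */
}
```

In this model, calling toy_commit with the interrupt flag set leaves both committed and client_saw_error true, which is exactly the ambiguous outcome described above; anything that lengthens the held section only widens the window.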

I think what we should do in SyncRepWaitForLSN is just send a warning
and abandon waiting.  Trying to fool with the interrupt response
behavior beyond that is simply broken, and it doesn't help any that we
chose to break it in two different, but equally indefensible, ways for
cancel versus die interrupts.
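
As a rough sketch of one reading of that proposal (not the actual SyncRepWaitForLSN code), the wait loop would treat cancel and die identically: emit a warning, stop waiting, and leave the pending flags alone so the normal interrupt machinery deals with them later. The ToyWaiter struct, the toy_wait_for_standby function, and the warning text are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model only: every name here is hypothetical, not PostgreSQL's API. */
typedef enum { TOY_WAIT_ACKED, TOY_WAIT_ABANDONED } ToyWaitResult;

typedef struct {
    bool cancel_pending;    /* stands in for QueryCancelPending */
    bool die_pending;       /* stands in for ProcDiePending */
    int  polls_until_ack;   /* standby acknowledges after this many polls */
    int  warnings_sent;     /* warnings emitted so far */
} ToyWaiter;

static ToyWaitResult toy_wait_for_standby(ToyWaiter *w)
{
    assert(w != NULL);

    for (;;)
    {
        if (w->polls_until_ack <= 0)
            return TOY_WAIT_ACKED;

        /* Same response for cancel and die: warn, then stop waiting. */
        if (w->cancel_pending || w->die_pending)
        {
            fprintf(stderr,
                    "WARNING: abandoning wait for synchronous replication; "
                    "the transaction has already committed locally\n");
            w->warnings_sent++;
            /* The pending flags are deliberately left untouched. */
            return TOY_WAIT_ABANDONED;
        }

        w->polls_until_ack--;   /* stands in for sleeping until woken */
    }
}
```

The design point is that the function neither clears the flags nor escalates the response; it reports what it knows and returns, so there is one behavior to reason about instead of two.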

It would help BTW for the warning to have its own SQLSTATE, if we're
imagining that "some clients may be able to interpret" it.  Also, this
code is supposing that it must be called within a HOLD_INTERRUPTS
context, but it doesn't look to me like that is being done for the
various calls from twophase.c.
        regards, tom lane

