Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on
Дата
Msg-id 201209191356.12176.andres@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on  (Stuart Bishop <stuart@stuartbishop.net>)
Ответы Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on  (Stuart Bishop <stuart@stuartbishop.net>)
Список pgsql-bugs
On Wednesday, September 19, 2012 11:33:29 AM Stuart Bishop wrote:
> On Wed, Sep 19, 2012 at 3:38 PM, Andres Freund <andres@2ndquadrant.com>
wrote:
> >> >> ERROR:  canceling statement due to conflict with recovery
> >> >> DETAIL:  User was holding shared buffer pin for too long.
> >> >
> >> > Maybe I'm missing something, but ISTM that hot_standby_feedback
> >> > doesn't eliminate the query cancels caused by buffer pin lock which
> >> > you encountered. It eliminates only the query cancels caused by
> >> > cleanup of rows. So you might need to set max_standby_streaming_delay
> >> > to -1, to avoid query cancels.
> >
> > I think the above is just a case of the backend waiting a long time to
> > send data out due to your rate limiting. Why it does that holding a
> > buffer pin may be worth investigating...
>
> I put the rate limiting in there to trigger the problem consistently
> with a small data set. On the main production server I see this
> problem, it normally fails dumping a table with nearly 2 billion rows
> to local disk (pg_dump -Fc). The trick seems to be that vacuum needs
> to start running on the primary after the dump starts on the hot
> standby, and the vacuum complete before the dump finishes.
You see buffer pin errors or you see the relation errors? Those arent the same.
So if you see no buffer pin errors but relation lock ones in production your
way to reproduce the issue isn't telling us very much ;)

> >> I have also seen the following (but only on production, not with my test
> >> case):
> >>
> >> ERROR:  canceling statement due to conflict with recovery
> >> DETAIL:  User was holding a relation lock for too long.
> >>
> >> If you are correct, it seems impossible with 9.1 to have reliable long
> >> transactions on a hot standby without allowing the hot standby to lag
> >> behind by the length of the longest transaction.
> >
> > Is it possible that you perform such commands on tables used on the
> > standby?
>
> Yes, it is quite possible CREATE INDEX CONCURRENTLY was run on the
> primary during the backup window. I can't discount other commands
> either, but if that is the case one of our ops needs a spanking. I've
> only been able to reproduce the buffer pin error in isolation.
Do you have DDL logging enabled so you could investigate that?

I really wonder why we don't bother logging the dbOid/relOid pair before we
cancel transactions due to lock conflicts. That should make investigating such
issues considerably easier...

Greetings,

Andres
--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Stuart Bishop
Дата:
Сообщение: Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on
Следующее
От: Stuart Bishop
Дата:
Сообщение: Re: BUG #7546: Backups on hot standby cancelled despite hot_standby_feedback=on