Re: [BUGS] PostgreSQL hot standby Hangs due to AccessExclusiveLock onpg_attribute or pg_type tables

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: [BUGS] PostgreSQL hot standby Hangs due to AccessExclusiveLock onpg_attribute or pg_type tables
Дата
Msg-id CAMkU=1zV8rvpUUXd7Bgt53HtOEtBnUXok76RwfWAJBprO-+5Bg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [BUGS] PostgreSQL hot standby Hangs due to AccessExclusiveLockon pg_attribute or pg_type tables  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [BUGS] PostgreSQL hot standby Hangs due to AccessExclusiveLock onpg_attribute or pg_type tables  (Bruno Richard <bruno.richard@broadsign.com>)
Re: [BUGS] PostgreSQL hot standby Hangs due to AccessExclusiveLockon pg_attribute or pg_type tables  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-bugs
On Tue, Jul 11, 2017 at 3:01 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

In conclusion, this doesn't seem to be a bug but just a standby
conflict.

At Mon, 19 Jun 2017 13:29:34 -0400, Bruno Richard <bruno.richard@broadsign.com> wrote in <CAB-EU3RawZx8-OzMfvswFf6z+Y7GOZf03TZ=bez+PbQX+A4M7Q@mail.gmail.com>
> *Summary:*
> PostgreSQL hot standby Hangs due to AccessExclusiveLock on pg_attribute or
> pg_type tables
>
> *Platform information (all instances)*
> PostgreSQL 9.6.2, ltree extension enabled.
> Ubuntu 12.04 64bits
>
> *Description:*
> We have an issue on our Database system since we migrated from PostgreSQL
> 9.3.2 to 9.6.2.
> The issue is that one (or many) of our hot standby instances does an
> AccessExclusiveLock on the pg_attribute or pg_type system tables that never
> gets unlocked, causing hangs on our systems. We have to restart the
> instance to get out of the lock. It happened ~30 times since end of April.
>
> *Steps to reproduce:*
> Looks like a race condition, happens randomly in production only. I am
> trying to replicate in house.
>
> *Description:*
> This is the output from the pg_locks table when the issue occurs:
>
> -[ RECORD 129 ]----+--------------------
> locktype           | relation
> database           | 16385
> relation           | 1249
> page               |
> tuple              |
> virtualxid         |
> transactionid      |
> classid            |
> objid              |
> objsubid           |
> virtualtransaction | 1/0
> pid                | 19018
> mode               | AccessExclusiveLock
> granted            | t
> fastpath           | f

I'm not sure why it didn't happen on 9.3,

In 9.3, the AccessExclusiveLock is obtained by polling.  If it can never get the lock, it won't stop other processes from getting an AccessShareLock, so it won't block their ability to log on.  (Once max_standby_streaming_delay expires, it will then slaughter everything and take its lock.)  In 9.6, it is obtained in the more conventional way, by waiting while blocking newcomers who want a conflicting version.  I don't understand exactly why this change leads to the reported behavior, but it is probably related.


but it is very likely
to be the consequence of a standby conflict. You are setting
max_standby_streaming_delay to 1200 seconds so the lock will be
released after the same duration at maximum.

The lock shown above almost certainly is a vacuum-truncation
lock. Usually the lock is released soon by a following
commit. However, if some other conflicting xlogs (for example
vaccum cleanup records from a concurrent vacuum) comes before the
commit the truncation lock won't be released until the conflict
is resolved. This would be what you are looking.

Since no new sessions can connect once the pg_attributes table is locked, this would either be a self-deadlock (which I think is a bug) or it would require there to be long-lived connections to the standy-by which are holding the locks which are blocking the vacuum cleanup .  

Bruno, is this correct that you have long-lived transactions to the standby while this is occurring with pg_attribute?  Does it correct after 20 minutes, if you are willing to wait that long?

Cheers,

Jeff

В списке pgsql-bugs по дате отправления:

Предыдущее
От: jrogers@whe.org
Дата:
Сообщение: [BUGS] BUG #14740: PgAdmin 4 Crashing when creating a PgAgent Schedule
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [BUGS] BUG #14738: ALTER SERVER for foregin servers not working