BUG #13366: pg_attribute always locked on streaming replica

From: diptatapa@gmail.com
Subject: BUG #13366: pg_attribute always locked on streaming replica
Date:
Msg-id: 20150528064854.4705.28041@wrigleys.postgresql.org
List: pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13366
Logged by:          soni
Email address:      diptatapa@gmail.com
PostgreSQL version: 9.1.13
Operating system:   Red Hat Enterprise Linux Server release 6.5
Description:

We have a master and one asynchronous streaming replica.

At one point, an autovacuum of pg_attribute ran on the master. The
autovacuum ran while the master was in the middle of a checkpoint.

Here's the log:
2015-05-26 14:10:30.636 CDT,,,26619,,5564c4d1.67fb,3,,2015-05-26 14:09:05
CDT,369/25999,2302120090,LOG,00000,"automatic vacuum of table
""consprod.pg_catalog.pg_attribute"": index scans: 1
pages: 2122 removed, 1552 remain
tuples: 107186 removed, 5586 remain
system usage: CPU 0.32s/0.39u sec elapsed 54.48 sec",,,,,,,,,""
2015-05-26 14:10:38.856 CDT,,,19569,,5562b841.4c71,1044,,2015-05-25 00:50:57
CDT,,0,LOG,00000,"checkpoint complete: wrote 31891 buffers (3.0%); 0
transaction log file(s) added, 0 removed, 30 recycled; write=43.470 s,
sync=1.076 s, total=44.574 s; sync files=974, longest=0.027 s, average=0.001
s",,,,,,,,,""
2015-05-26 14:10:45.677 CDT,,,27144,,5564c50d.6a08,3,,2015-05-26 14:10:05
CDT,363/24582,0,LOG,00000,"automatic vacuum of table
""consprod.pg_catalog.pg_attribute"": index scans: 1
pages: 0 removed, 1552 remain
tuples: 5321 removed, 40625 remain
system usage: CPU 0.05s/0.07u sec elapsed 10.71 sec",,,,,,,,,""
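
Note that the first autovacuum reports "pages: 2122 removed": VACUUM
truncated empty pages from the end of pg_attribute, and truncation takes a
(normally brief) AccessExclusiveLock. That lock is WAL-logged, so the
startup process on the standby must acquire it as well when it replays the
record. A quick way to confirm vacuum activity on the master is a query
along these lines (a sketch, not taken from the original report):

-- On the master: recent (auto)vacuum activity on pg_attribute.
-- autovacuum_count is available from 9.1 on.
SELECT relname, last_autovacuum, autovacuum_count
FROM pg_stat_all_tables
WHERE relname = 'pg_attribute';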

Just after that, on the streaming replica, pg_attribute was locked the
whole time, making all queries wait.
Here's the log:
2015-05-26 14:10:30.768 CDT,,,18868,,55574fd6.49b4,3469,,2015-05-16 09:10:30
CDT,,0,LOG,00000,"recovery restart point at 50FB/1000C418","last completed
transaction was at log time 2015-05-26 08:08:51.593013-05",,,,,,,,""
2015-05-26 14:10:30.771 CDT,,,18868,,55574fd6.49b4,3470,,2015-05-16 09:10:30
CDT,,0,LOG,00000,"restartpoint starting: xlog",,,,,,,,,""
2015-05-26 14:11:08.867 CDT,,,24950,"",5564c54c.6176,1,"",2015-05-26
14:11:08 CDT,,0,LOG,00000,"connection received: host=10.2.5.12
port=52568",,,,,,,,,""
2015-05-26 14:11:08.895
CDT,"monitoring","consprod",24950,"10.2.5.12:52568",5564c54c.6176,2,"authentication",2015-05-26
14:11:08 CDT,2/9381,0,LOG,00000,"connection authorized: user=monitoring
database=consprod",,,,,,,,,""
2015-05-26 14:11:10.091
CDT,"monitoring","consprod",24950,"10.2.5.12:52568",5564c54c.6176,3,"SELECT
waiting",2015-05-26 14:11:08 CDT,2/9382,0,LOG,00000,"process 24950 still
waiting for AccessShareLock on relation 1249 of database 111285 after
1000.069 ms",,,,,,"select current_timestamp - value::timestamp as ""slsql003
(streaming)"" from system_config where name = 'last.update.date'",,,"psql"

and many more waiting queries were recorded in the log.
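
Relation 1249 is pg_catalog.pg_attribute. To see which process holds the
lock the queries are waiting for, something like this can be run on the
replica (a sketch using the 9.1 column names procpid and current_query; the
startup process holds replayed locks but has no pg_stat_activity row, hence
the LEFT JOIN):

-- On the replica: who holds or waits on pg_attribute (oid 1249)?
SELECT l.pid, l.mode, l.granted, a.current_query
FROM pg_locks l
LEFT JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE l.relation = 'pg_catalog.pg_attribute'::regclass
ORDER BY l.granted DESC;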

When this issue was detected, max_connections had been reached. I saw many
processes in 'startup waiting' state.
Here's what ps shows:
postgres 18855  2.3 23.3 2773300 1882460 ?     Ss   May16 398:14 postgres:
startup process   recovering 00000003000050FC0000003B waiting
postgres 18868  0.2 23.1 2772240 1864836 ?     Ss   May16  38:59 postgres:
writer process
postgres 23445  0.0  0.0 180152   572 ?        Ss   May16   2:29 postgres:
stats collector process
postgres 24950  0.0  0.0 2772896 6376 ?        Ss   May26   0:00 postgres:
monitoring consprod 10.2.5.12(52568) SELECT waiting
postgres 24954  0.0  0.0 2772704 2768 ?        Ss   May26   0:00 postgres:
monitoring consprod 10.2.5.12(52597) startup waiting
postgres 24957  0.0  0.0 2772704 2780 ?        Ss   May26   0:00 postgres:
monitoring consprod 10.2.5.12(52613) startup waiting
postgres 24960  0.0  0.0 2772704 2784 ?        Ss   May26   0:00 postgres:
monitoring consprod 10.2.5.12(52642) startup waiting
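
To gauge how many backends are stuck and whether the standby has been
resolving recovery conflicts at all, something like the following could be
run on the replica (a sketch; pg_stat_database_conflicts exists from 9.1
on):

-- On the replica: how many backends are waiting right now?
SELECT waiting, count(*) FROM pg_stat_activity GROUP BY waiting;

-- Recovery-conflict counters for the affected database.
SELECT datname, confl_lock, confl_snapshot, confl_bufferpin
FROM pg_stat_database_conflicts
WHERE datname = 'consprod';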

I restarted the replica, and then it continued to replay WAL successfully.

I think this is some kind of bug.

PS: We had a dump running on the replica, which kept the replica from
replaying WAL files while the dump ran. The dump usually finishes around
13:00, so when this happened the replica seems to have been busy catching
up on WAL replay.
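
How long the startup process waits behind conflicting standby queries is
controlled by the recovery-conflict settings; with
max_standby_streaming_delay = -1 it waits forever, which would match the
behaviour above. The actual values on this replica are not shown in the
report, so this is an assumption worth checking:

-- On the replica: settings that govern recovery-conflict handling.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_standby_streaming_delay',
               'max_standby_archive_delay',
               'hot_standby_feedback');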

Thanks.
