Re: Slow WAL recovery for DROP TABLE

Поиск
Список
Период
Сортировка
От Sherrylyn Branchaw
Тема Re: Slow WAL recovery for DROP TABLE
Дата
Msg-id CAB_myF7G3V8CoJGUzGxun4XjWeSTUxONi2iy4e0H9K7qvfrzZw@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Slow WAL recovery for DROP TABLE  ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>)
Список pgsql-general
Hi, I have also reported a similar problem in the hackers mailing list, but particularly on TRUNCATE TABLE.
https://www.postgresql.org/message-id/flat/D09B13F772D2274BB348A310EE3027C62FD6E6%40g01jpexmbkw24

Ooh, interesting. I admit I did not include TRUNCATE in my testing.

> The problem lies with the standby server’s replay as it does separate scanning of the whole shared buffer for each DROP/TRUNCATE TABLE in order to check if the table-to-delete is cached in shared buffer. Therefore, it will take a long recovery time and sometimes fail for large tables depending on shared_buffer size.

Also very interesting. We only (?) have 8 GB of shared buffers, and I see from your message that you had 300 GB. All of our tables, both in prod and in my reproduction, were empty, but there were hundreds of thousands of them.

> The main problem here is the scanning of shared_buffers, which not only affects drop/truncate table, but also drop database and vacuum as well.

I wondered about that. I didn't have any problem with a single drop database, but the database dropped was a small one (albeit one with a few hundred thousand empty tables), and I neither tested dropping a large database nor dropping 100,000 databases. I didn't test vacuuming, but we do heavy vacuuming on all our primaries frequently, and...hmm. Regular vacuuming doesn't cause any problems that have made it onto my radar, but VACUUM FULLs can cause WAL files to pile up on the primary's pg_xlog before getting archived. I never investigated that, just throttled my VACUUM FULLs, because they're only ever run manually. I will keep an eye on the recovery time of individual files the next time I have to do this, which will probably be soon.

> But I think any working minor solutions/fixes from developers are also welcome, such as the recent committed patch for the multiple dropped tables per transaction with large shared_buffers.

Agreed. Should I have sent or should I still send this to pgsql-hackers? I wasn't sure, so I erred on the side of not bothering the developers until I'd gotten some feedback here.

Best,
Sherrylyn

В списке pgsql-general по дате отправления:

Предыдущее
От: Sherrylyn Branchaw
Дата:
Сообщение: Re: Slow WAL recovery for DROP TABLE
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: Shared buffers increased but cache hit ratio is still 85%