Slow WAL recovery for DROP TABLE

Поиск

Список

Период

Сортировка

От	Sherrylyn Branchaw
Тема	Slow WAL recovery for DROP TABLE
Дата	17 июля 2018 г. 20:40:41
Msg-id	CAB_myF5RybwaQKAjB=Zb_uBeOUfNO=Dn1X-N-hp0qkCjP=5gXA@mail.gmail.com обсуждение исходный текст
Ответы	Re: Slow WAL recovery for DROP TABLE RE: Slow WAL recovery for DROP TABLE
Список	pgsql-general

Дерево обсуждения

We are running Postgres 9.6.9 on RHEL 6.9. We're using built-in streaming replication, with a WAL archive for fallback purposes. No logical replication.

We recently had a bug in our code accidentally create several hundred thousand tables in a single database. A batch job started cleaning them up one at a time (separate transactions), over the course of a couple nights.

But what we noticed the next day was that the 2 standbys to this database were falling further and further behind. They were no longer streaming, instead they were recovering from the archive, and they were doing so very slowly.

Only a small amount of WAL was generated by the DROP TABLEs, relative to normal traffic, but the time to recover a single WAL file sometimes took over a minute, compared to 1 second normally. Meanwhile, new WALs were being generated by regular user traffic, and the database got 24 hours behind and getting worse, before we resynced the standbys, in order to not have to replay the WALs. That did the trick. But it seemed like this performance drop was drastic enough to be worth reporting.

I was able to reproduce the slow recovery time of WALs generated by DROP TABLE on a pair of test servers.

Normal writes (INSERT, UPDATE, DELETE, CREATE TABLE) generated WAL files that were recoverable on the order of 1 second each. The standby never fell behind, even when hundreds of WAL files were generated quickly. But when I generated WALs exclusively from DROP TABLE, it took 20-80 seconds to recover each one.

Some benchmarks:

For 10 minutes, I issued a bunch of rapid-fire INSERT statements to server1, the primary. These generated 144 WAL files. At the end of the 10 minutes, server2, the standby, had already recovered 143 of them and immediately completed the last one.

For 10 minutes, I issued a bunch of rapid-fire DROP TABLE statements to server1, the primary. These generated 28 WAL files. At the end of the 10 minutes, server2, the standby, had only recovered 16 of them.

My expectation is that replaying binary changes on a standby should be at least as fast and sometimes faster than generating them locally. This is because when a logical command like "DROP TABLE" or "INSERT" is issued on a primary, the database has to check for dependencies, satisfy constraints, etc. But once all those decisions are made on the primary, the binary changes should be replayable much faster on the standby.

For example, server1, the primary in my test pair, has significantly better hardware than server2, the standby. If server2 is keeping up with server1 in replication, I predict that server2 should be able to get through the same amount of writes faster as a replica than as a standalone.

This expectation was borne out in my testing for INSERTs but not DROP TABLEs, per the following benchmarks.

I issued 9,000,000 INSERTs to server1, the primary. It took 10 minutes to complete these. At the end of the 10 minutes, server2, the standby, was caught up. When I made server2 a standalone, it took 15 minutes to get through the same 9,000,000 INSERTs. In conclusion, being a standalone was 50% slower for server2 when it came to INSERT.

I issued 100,000 DROP TABLEs to server1, the primary. It took 10 minutes to complete these on server 1, and 20 minutes for server2, the standby, to catch up. When I made server2 a standalone, it took 18 minutes to get through the same 100,000 DROP TABLEs. In conclusion, being a standalone was 10% faster for server2 when it came to DROP TABLE.

It seems that there is something especially inefficient about WAL files generated by DROP TABLE, or the process of recovering them.

I also want to mention that recovering a WAL file generated by DROP TABLE commands consistently takes approximately 100% CPU on the standby, both in testing and in prod. In prod, we also saw high memory use for that process.

The standby with 128 GB of RAM showed this:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

23279 postgres 20 0 57.0g 50g 5.1g R 100.0 40.0 1723:15 postgres: startup process recovering 00000002000003B10000008D

The standby with 32 GB of RAM crashed overnight with an out-of-memory error.

When I restarted the servers and let them resume recovering WALs, I watched %MEM and RES grow slowly and steadily before I gave up and resynced. No memory appeared to be freed up after each WAL file was replayed.

I was not able to reproduce memory usage getting quite this high in the time I was allotted for testing, but I did witness steady climbing up to almost 20%, especially when I mixed DROP TABLEs with regular writes. I suspect there is a sweet (or bitter?) spot of write activity we hit in our prod workload that I was unable to reproduce in testing.

I'm not sure if this is the right mailing list to report this behavior to, but I thought it was the best place to start. Let me know if there's any other information I should provide, or if I should send it to another list.

Thanks,

Sherrylyn

В списке pgsql-general по дате отправления:

Предыдущее

От: Adrian Klaver
Дата: 17 июля 2018 г., 20:17:23
Сообщение: Re: watchdog issues

Следующее

От: Jeff Janes
Дата: 17 июля 2018 г., 23:03:47
Сообщение: Re: Slow WAL recovery for DROP TABLE

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Slow WAL recovery for DROP TABLE

Предыдущее

Следующее