Speed up the removal of WAL files

From: Tsunakawa, Takayuki
Subject: Speed up the removal of WAL files
Msg-id: 0A3221C70F24FB45833433255569204D1F81B0C8@G01JPEXMBYT05
List: pgsql-hackers

Hello,

The attached patch speeds up the removal of WAL files in the old timelines.  I'll add this to the next CF.


BACKGROUND
==================================================

We need to meet a strict availability requirement from a potential customer.  They will use synchronous streaming
replication.  The allowed failover duration, from the failure through failure detection to failover completion, is
10 seconds.  Even one second is precious.

During testing on a fast machine with an SSD, we observed a gap of about 2 seconds between the following two messages.
There were no other messages between them.

LOG:  archive recovery complete
LOG:  MultiXact member wraparound protections are now enabled


CAUSE
==================================================

From the source code, RemoveNonParentXlogFiles() appears to account for this time.  It syncs the pg_wal directory every
time it deletes a WAL file.  max_wal_size was set to 48GB, so about 1,000 WAL files were probably deleted, and the
pg_wal directory was synced that many times.


FIX
==================================================

unlink() the WAL files, then sync the pg_wal directory once at the end.
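
To illustrate the idea outside of PostgreSQL, here is a minimal Python sketch that contrasts the two removal patterns. It is not the patch itself (the real change is in C, in RemoveNonParentXlogFiles()); the function names, the dummy file names, and the sync counter are all hypothetical and exist only for this illustration. It assumes a POSIX system where a directory can be opened read-only and passed to fsync().

```python
import os


def fsync_dir(path):
    """fsync a directory so that removals of its entries reach disk (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def make_dummy_segments(directory, count):
    """Create small placeholder files named like WAL segments (hypothetical helper)."""
    names = []
    for i in range(count):
        name = f"{i:024X}"  # 24 hex digits, mimicking a WAL segment file name
        with open(os.path.join(directory, name), "wb") as f:
            f.write(b"\0")
        names.append(name)
    return names


def remove_with_sync_per_file(directory, names):
    """Pattern described under CAUSE: one directory sync per removed file."""
    syncs = 0
    for name in names:
        os.unlink(os.path.join(directory, name))
        fsync_dir(directory)
        syncs += 1
    return syncs


def remove_then_sync_once(directory, names):
    """Pattern described under FIX: unlink every file, then sync the directory once."""
    for name in names:
        os.unlink(os.path.join(directory, name))
    fsync_dir(directory)
    return 1
```

Both functions leave the directory in the same state. The only difference is the number of directory syncs: one per file in the first case, exactly one in the second, regardless of how many files are removed.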

Unfortunately, the original machine is no longer available, so I confirmed the speedup on a VM with an HDD.

[time to remove 1,000 WAL files including the directory sync]
nonpatched: 2.45 seconds
patched:    0.81 seconds


Regards
Takayuki Tsunakawa

