Re: archive modules

From David Steele
Subject Re: archive modules
Date
Msg-id 53a180a7-bf1b-38c5-7fb4-4088dc022b98@pgmasters.net
In response to Re: archive modules  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: archive modules  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
On 11/7/21 1:04 AM, Fujii Masao wrote:
> 
> On 2021/11/03 0:03, Bossart, Nathan wrote:
>> On 11/1/21, 9:44 PM, "Fujii Masao" <masao.fujii@oss.nttdata.com> wrote:
>>> What is the main motivation of this patch? I was thinking that
>>> it's for parallelizing WAL archiving. But as far as I read
>>> the patch very briefly, WAL file name is still passed to
>>> the archive callback function one by one.
>>
>> The main motivation is to provide a way to archive without shelling out.
>> This reduces the amount of overhead, which can improve archival rate
>> significantly.
> 
> It's helpful if you share how much this approach reduces
> the amount of overhead.

FWIW we have a test for this in pgBackRest. Running 
`archive_command=pgbackrest archive-push ...` 1000 times via system() 
yields an average of 3ms per execution. pgBackRest reports ~1ms of time 
here so the system() overhead is ~2ms. These times are on my very fast 
workstation and in my experience servers are quite a bit slower.
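
For anyone who wants to reproduce this kind of number on their own hardware, here is a rough sketch of a timing harness. It is not the pgBackRest test itself; the function name `mean_system_ms` is made up for illustration, and it only measures the mean wall-clock time of repeated system() calls for whatever command you pass in.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper (not part of pgBackRest or PostgreSQL):
 * return the mean wall-clock time, in milliseconds, of `runs`
 * calls to system(cmd), or -1.0 if a child could not be created.
 */
static double
mean_system_ms(const char *cmd, int runs)
{
    struct timespec start;
    struct timespec end;
    double          elapsed_ms;

    assert(cmd != NULL && runs > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < runs; i++)
    {
        /* system() forks a shell, which then runs the command */
        if (system(cmd) == -1)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    elapsed_ms = (end.tv_sec - start.tv_sec) * 1000.0
        + (end.tv_nsec - start.tv_nsec) / 1e6;

    return elapsed_ms / runs;
}
```

Calling `mean_system_ms("true", 1000)` approximates the bare fork/exec/shell cost; subtracting that from the figure for a real archive command gives a rough idea of the time spent in the command itself.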

This doesn't tell the entire story, though, because in this test 
pgBackRest is just checking notifications being returned by an async 
process that was spawned earlier. This complexity exists to save the 
startup costs of, e.g., establishing an SSH connection, which often takes 
more than 1 second.

This module would make it far easier to pay those startup costs a single 
time, or at least only occasionally, making it possible to write 
performant archivers with less complexity than is currently possible.
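
To illustrate the idea, here is a minimal sketch of paying the startup cost once and reusing the session for every segment. None of this is the patch's API: `ArchiverState`, `expensive_connect`, and `archive_one_file` are hypothetical names, and the connection is only simulated with a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical long-lived state kept by an in-process archiver. */
typedef struct ArchiverState
{
    bool connected;       /* is the (simulated) session open? */
    int  connect_count;   /* how many times the startup cost was paid */
    int  archived_count;  /* how many segments were handed off */
} ArchiverState;

/* Stand-in for something slow, e.g. establishing an SSH connection. */
static bool
expensive_connect(ArchiverState *state)
{
    state->connect_count++;
    state->connected = true;
    return true;
}

/* Called once per WAL segment; connects lazily on first use only. */
static bool
archive_one_file(ArchiverState *state, const char *wal_file_name)
{
    assert(state != NULL && wal_file_name != NULL);

    if (!state->connected && !expensive_connect(state))
        return false;

    /* A real archiver would transfer the segment over the session here. */
    state->archived_count++;
    return true;
}
```

With a shell-based archive_command, each invocation starts from scratch, so this kind of state has to live in a separate daemon. In a loaded module the state can simply persist between calls.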

>> It should also make it easier to archive more safely.
>> For example, many of the common shell commands used for archiving
>> won't fsync the data, but it isn't too hard to do so via C.
> 
> But probably we can do the same thing even by using the existing
> shell interface? For example, we can implement and provide
> the C program of the archive command that fsync's the file?
> Users can just use it in archive_command.

It is far more common to be writing WAL segments to another host or 
object storage. In either case, I believe a local command that fsyncs the 
file is not very useful.

>>> I think that it's worth adding this module into core
>>> rather than handling it as test module. It provides very basic
>>> WAL archiving feature, but (I guess) it's enough for some users.
>>
>> Do you think it should go into contrib?

I would prefer this module to be in core as our standard implementation 
and loaded by default in a vanilla install.

Regards,
-- 
-David
david@pgmasters.net


