Re: parallelizing the archiver

From Jacob Champion
Subject Re: parallelizing the archiver
Date
Msg-id 78728f8c5e413c05e00426369f79780a35caef5c.camel@vmware.com
In response to Re: parallelizing the archiver  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Fri, 2021-09-10 at 23:48 +0800, Julien Rouhaud wrote:
> I totally agree that batching as many files as possible in a single
> command is probably what's gonna achieve the best performance.  But if
> the archiver only gets an answer from the archive_command once it
> has tried to process all of the files, it also means that postgres won't
> be able to remove any WAL file until all of them could be processed.  It
> means that users will likely have to limit the batch size and
> therefore pay more startup overhead than they would like.  In case of
> archiving on a server with high latency / connection overhead it may be
> better to be able to run multiple commands in parallel.

Well, users would also have to limit the parallelism, right? If
connections are high-overhead, I wouldn't imagine that running hundreds
of them simultaneously would work very well in practice. (The proof
would be in an actual benchmark, obviously, but usually I would rather
have one process handling a hundred items than a hundred processes
handling one item each.)

For a batching scheme, would it be that big a deal to wait for all of
them to be archived before removal?

> > That is possibly true. I think it might work to just assume that you
> > have to retry everything if it exits non-zero, but that requires the
> > archive command to be smart enough to do something sensible if an
> > identical file is already present in the archive.
> 
> Yes, it could be.  I think that we need more feedback for that too.

Seems like this is the sticking point. What would be the smartest thing
for the command to do? If there's a destination file already, checksum
it and make sure it matches the source before continuing?
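
To make the question concrete, here is a minimal sketch of what such a
command might do, assuming a plain local-directory archive. The names
(sha256_of, archive_one, archive_batch) and the ".tmp" suffix are
hypothetical, not anything that exists in postgres, and fsync handling is
left out for brevity. An identical destination file counts as success, so
a batch that exited non-zero can simply be retried in full; a destination
file that differs is reported as a failure and never overwritten.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_one(src, dest_dir):
    """Archive src into dest_dir; return True on success.

    If a file with the same name is already archived, succeed only when
    its checksum matches the source. A mismatch is an error.
    """
    dest = os.path.join(dest_dir, os.path.basename(src))
    if os.path.exists(dest):
        return sha256_of(dest) == sha256_of(src)
    # Copy under a temporary name, then rename, so an interrupted copy
    # does not leave a partial file under the final name.
    tmp = dest + ".tmp"
    shutil.copyfile(src, tmp)
    os.replace(tmp, dest)
    return True


def archive_batch(paths, dest_dir):
    """Try every file; return 0 if all are archived, 1 otherwise."""
    ok = True
    for path in paths:
        ok = archive_one(path, dest_dir) and ok
    return 0 if ok else 1
```

With something like this, "retry everything on non-zero exit" costs one
extra checksum pass over the files that already made it, rather than a
second copy.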

--Jacob
