Re: design for parallel backup

From	Robert Haas
Subject	Re: design for parallel backup
Date	May 1, 2020, 01:06:23
Msg-id	CA+TgmoaFQ56UTUJWpk0NkOhpJGfq5qOSwVHfoRMwJO8Xy13rKQ@mail.gmail.com
In reply to	Re: design for parallel backup (Andres Freund <andres@anarazel.de>)
Responses	Re: design for parallel backup (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Thu, Apr 30, 2020 at 3:52 PM Andres Freund <andres@anarazel.de> wrote:
> Why 8kb? That's smaller than what we currently do in pg_basebackup,
> afaict, and you're actually going to be bottlenecked by syscall
> overhead at that point (unless you disable / don't have the whole Intel
> security mitigation stuff).

I just picked something. Could easily try other things.

> > , divided into 1, 2, 4, 8, or 16 equal size files, with each file
> > written by a separate process, and an fsync() at the end before
> > process exit. So in this test, there is no question of whether the
> > master can read the data fast enough, nor is there any issue of
> > network bandwidth. It's purely a test of whether it's faster to have
> > one process write a big file or whether it's faster to have multiple
> > processes each write a smaller file.
>
> That's not necessarily the only question though, right? There's also the
> approach of one process writing out multiple files (via buffered, not
> async IO)? E.g. one basebackup connecting to multiple backends, or just
> shuffling multiple files through one copy stream.

Sure, but that seems like it can't scale better than this. You have
the scaling limitations of the filesystem, plus the possibility that
the process is busy doing something else when it could be writing to
any particular file.

> If you can provide me with the test program, I'd happily run it on some
> decent, but not upper end, NVMe SSDs.

It was attached, but I forgot to mention that in the body of the email.
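
The attached program is not shown here. As a rough illustration of the
kind of test described above (8kB writes, the data divided into N
equal-size files, one writer process per file, and an fsync() before each
process exits), here is a minimal Python sketch. It is not the attached
program: the names write_file and run_test, the file naming, and the
zero-filled data are all assumptions made for illustration, and it relies
on os.fork(), so it is Unix-only.

```python
import os
import time

BLOCK_SIZE = 8192  # 8kB writes, as in the test described in the thread


def write_file(path, nbytes, block_size=BLOCK_SIZE):
    """Write nbytes of zeros to path in block_size chunks, then fsync."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        remaining = nbytes
        while remaining > 0:
            chunk = block[:min(block_size, remaining)]
            # os.write may write fewer bytes than requested
            remaining -= os.write(fd, chunk)
        os.fsync(fd)  # flush to stable storage before the process exits
    finally:
        os.close(fd)


def run_test(directory, total_bytes, nprocs):
    """Split total_bytes across nprocs files, one forked writer per file.

    Returns (elapsed_seconds, list_of_paths). Hypothetical helper, not
    taken from the original test program.
    """
    per_file = total_bytes // nprocs
    paths = [os.path.join(directory, f"part{i}.dat") for i in range(nprocs)]
    start = time.monotonic()
    pids = []
    for path in paths:
        pid = os.fork()
        if pid == 0:
            # Child: write one file, then exit without running parent cleanup.
            status = 1
            try:
                write_file(path, per_file)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError(f"writer process {pid} failed")
    return time.monotonic() - start, paths
```

Calling run_test() on the same directory with nprocs set to 1, 2, 4, 8,
and 16 and comparing the elapsed times would mirror the comparison
discussed here. Whether the page cache or a controller write cache absorbs
the writes depends on the total size chosen.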

> I think you might also be seeing some interaction with write caching on
> the raid controller here. The file sizes are small enough to fit in
> there to a significant degree for the single file tests.

Yeah, that's possible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
