Re: Reduce the time required for a database recovery from archive.

From: Stephen Frost
Subject: Re: Reduce the time required for a database recovery from archive.
Date:
Msg-id: 20201019162551.GA16415@tamriel.snowman.net
In reply to: Re: Reduce the time required for a database recovery from archive.  (Dmitry Shulga <d.shulga@postgrespro.ru>)
Responses: Re: Reduce the time required for a database recovery from archive.  (Dmitry Shulga <d.shulga@postgrespro.ru>)
List: pgsql-hackers
Greetings,

* Dmitry Shulga (d.shulga@postgrespro.ru) wrote:
> > On 9 Sep 2020, at 21:26, Stephen Frost <sfrost@snowman.net> wrote:
> > * Dmitry Shulga (d.shulga@postgrespro.ru) wrote:
> >> Overall, archive files are processed one by one, and this might
> >> create a performance bottleneck if archived WAL files are delivered slowly,
> >> because the database server has to wait for the arrival of the next
> >> WAL segment before applying its records.
> >>
> >> To address this issue, it is proposed to receive archived WAL files in parallel
> >> so that when the next WAL segment file is required for processing redo log
> >> records, it is already available.
> >
> > Yes, pgbackrest already does exactly this (if configured) - it uses parallel
> > processes to fetch the WAL and have it available ahead of time.
>
> pgbackrest is third-party software that has to be installed separately on the customer's premises.
>
> On the other hand, built-in support for this optimization in PostgreSQL is a good argument for adding
> this feature and providing it to customers out of the box.

Sure, having core do pre-fetching could be useful, though there's the
downside that it, unfortunately, can't know how much WAL is actually
going to be needed as we play forward, since we don't know where we will
end up finding the target we've been asked for.  That's unlikely to be
too much of an issue with the traditional 16 MB WAL files, but a more
integrated backup/restore solution would be able to address it by
tracking the restore targets that are in each WAL segment (which is
something we've had on our todo list for pgbackrest for a while, and
that would also let someone ask "am I able to reach this restore
target?").
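
To illustrate the point about not knowing how much WAL will be needed
(the values below are made up, and the restore_command is just the stock
cp example from the docs):

restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2020-10-19 12:00:00'

the server only finds out which WAL segment actually satisfies
recovery_target_time by replaying up to it, so any prefetcher is
necessarily guessing how far ahead it is worth fetching.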

> >> Implementation of this approach assumes running several background processes (bgworkers),
> >> each of which runs the shell command specified by the restore_command parameter
> >> to deliver an archived WAL file. The number of parallel processes is limited
> >> by the new parameter max_restore_command_workers. If this parameter is 0,
> >> then WAL file delivery is performed using the original algorithm, that is,
> >> one by one. If this parameter has a value greater than 0, then the database
> >> server starts bgworker processes, up to the limit specified by
> >> the parameter max_restore_command_workers, and passes each process a
> >> WAL file name to deliver. Active processes start prefetching the specified
> >> WAL files and store the received files in the directory pg_wal/pgsql_tmp. After
> >> a bgworker process finishes receiving a file, it marks itself as free
> >> and waits for a new request to receive the next WAL file. The main process
> >> performing database recovery still handles WAL files one by one,
> >> but instead of waiting for the next required WAL file to become available,
> >> it checks for that file in the prefetch directory. If the file is present there,
> >> the main process starts processing it.
> >
> > I'm a bit confused about this description- surely it makes sense for the
> OK. The description I originally provided was probably pretty misleading, so I will try to clarify it a bit.
>
> So, as soon as a bgworker process finishes delivering a WAL file, it marks itself as free.
>
> The WAL records applier works in parallel, processing the WAL files in sequential order.
> Once it finishes handling the current WAL file, it checks whether it is possible to run extra bgworker processes
> to deliver WAL files that will be required a bit later. If there are free bgworker processes, the applier requests
> the downloading of one or more extra WAL files. After that, the applier determines the name of the next WAL file to handle
> and checks whether it exists in the prefetch directory. If it does exist, the applier starts handling it and
> the processing loop is repeated.

Ok- so the idea is that each time the applying process finishes with a
WAL file, it'll see if there's an available worker and, if so, will
give it the next file to go get (which would presumably be some number
of segments in the future, since the actual next file the applying
process needs is already available).  That sounds better, at least,
though I'm not sure why we're making it the job of the applying process
to push the workers each time?  Also, I'm not sure about the interface-
wouldn't it make more sense to have a "pre-fetch this amount of WAL"
kind of parameter directly, instead of tying that to the number of
background workers?  You might only need one or two processes doing WAL
fetching to be able to fetch faster than the applying process is able
to apply it, but you probably want to pre-fetch more than just one or
two 16 MB WAL files.

In other words, I would have thought we'd have:

wal_prefetch_amount = 1GB
max_restore_command_workers = 2

and then you'd have up to 2 worker processes running and they'd be
keeping 1GB of WAL pre-fetched at all times.  If we have just
'max_restore_command_workers' and you want to pre-fetch 1GB of WAL, then
you'd have to set a pretty high value there and you'd end up with a
bunch of worker processes that all spike to go do work each time the
applying process finishes a WAL file but then just sit around doing
nothing while waiting for the applying process to finish another
segment.
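
To make that concrete, here is a minimal sketch, in plain C rather than
anything from the actual patch, of how the applying process might decide
how much work to hand out under such a two-parameter scheme; all of the
names and numbers below are assumptions for illustration only:

/*
 * Minimal sketch (not PostgreSQL code): keep roughly wal_prefetch_amount
 * bytes of WAL fetched ahead of the apply position while never using more
 * than max_restore_command_workers workers.
 */
#include <stdio.h>

#define WAL_SEGMENT_SIZE (16 * 1024 * 1024)    /* traditional 16 MB segments */

static long long wal_prefetch_amount = 1024LL * 1024 * 1024;   /* 1GB budget */
static int max_restore_command_workers = 2;                    /* worker cap */

/*
 * Given how many segments are already fetched (or being fetched) ahead of
 * the apply position, and how many workers are busy, decide how many more
 * segments to hand out to idle workers right now.
 */
static int
segments_to_request(int segments_ahead, int busy_workers)
{
    long long prefetched = (long long) segments_ahead * WAL_SEGMENT_SIZE;
    long long budget_left = wal_prefetch_amount - prefetched;
    int idle_workers = max_restore_command_workers - busy_workers;
    int wanted;

    if (budget_left <= 0 || idle_workers <= 0)
        return 0;

    wanted = (int) (budget_left / WAL_SEGMENT_SIZE);

    /* At most one request per idle worker; workers come back for more. */
    return (wanted < idle_workers) ? wanted : idle_workers;
}

int
main(void)
{
    /* Example: 10 segments (160MB) already fetched ahead, one worker busy. */
    printf("request %d more segment(s)\n", segments_to_request(10, 1));
    return 0;
}

With something along those lines, the worker count only caps concurrency,
while the byte budget is what actually controls how far ahead of the
applying process we ever fetch.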

Thanks,

Stephen

