Re: Pg 16: will pg_dump & pg_restore be faster?

From: Andres Freund
Subject: Re: Pg 16: will pg_dump & pg_restore be faster?
Msg-id: 20230531134552.yvouy5k573irlddt@alap3.anarazel.de
In response to: Re: Pg 16: will pg_dump & pg_restore be faster?  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-general

Hi,

On 2023-05-30 21:13:08 -0400, Bruce Momjian wrote:
> On Wed, May 31, 2023 at 09:14:20AM +1200, David Rowley wrote:
> > On Wed, 31 May 2023 at 08:54, Ron <ronljohnsonjr@gmail.com> wrote:
> > > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/
> > > says "PostgreSQL 16 can also improve the performance of concurrent bulk
> > > loading of data using COPY up to 300%."
> > >
> > > Since pg_dump & pg_restore use COPY (or something very similar), will the
> > > speed increase translate to higher speeds for those utilities?
> > 
> > I think the improvements to relation extension only help when multiple
> > backends need to extend the relation at the same time.  pg_restore can
> > have multiple workers, but the tasks that each worker performs are
> > only divided as far as an entire table, i.e. 2 workers will never be
> > working on the same table at the same time. So there is no concurrency
> > in terms of 2 or more workers working on loading data into the same
> > table at the same time.
> > 
> > It might be an interesting project now that we have TidRange scans, to
> > have pg_dump split larger tables into chunks so that they can be
> > restored in parallel.
> 
> Uh, the release notes say:
> 
>     <!--
>     Author: Andres Freund <andres@anarazel.de>
>     2023-04-06 [00d1e02be] hio: Use ExtendBufferedRelBy() to extend tables more eff
>     Author: Andres Freund <andres@anarazel.de>
>     2023-04-06 [26158b852] Use ExtendBufferedRelTo() in XLogReadBufferExtended()
>     -->
>     
>     <listitem>
>     <para>
>     Allow more efficient addition of heap and index pages (Andres Freund)
>     </para>
>     </listitem>
> 
> There is no mention of concurrency being a requirement.  Is it wrong?  I
> think there was a question of whether you had to add _multiple_ blocks
> to get a benefit, not if concurrency was needed.  This email about the
> release notes didn't mention the concurrent requirement:

>     https://www.postgresql.org/message-id/20230521171341.jjxykfsefsek4kzj%40awork3.anarazel.de

There are multiple improvements that work together to produce the overall
speedup. One part is more efficient filesystem interactions, another is
holding the relation extension lock for a *much* shorter time. The former
helps regardless of concurrency, the latter only with concurrency.
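
To illustrate why the shorter lock hold time only matters with concurrency,
here is a toy simulation. It is not PostgreSQL code: the function name
makespan, the time units, and the 9/1 versus 1/9 split of work are all
made-up assumptions. Each worker repeatedly takes a single shared lock, does
some work while holding it, then does the rest of the work outside it. The
total work per extension is kept constant, so the filesystem-side savings are
deliberately not modelled.

```python
import heapq

def makespan(workers, extensions, locked, unlocked):
    """Time until all workers finish, in arbitrary integer time units.

    Hypothetical model: every extension needs `locked` units under one
    shared lock plus `unlocked` units outside it. The lock is granted in
    order of request time (ties broken by worker number).
    """
    # Heap entries: (time the worker next asks for the lock, worker id, extensions left)
    ready = [(0, w, extensions) for w in range(workers)]
    heapq.heapify(ready)
    lock_free_at = 0
    finish = 0
    while ready:
        t, w, left = heapq.heappop(ready)
        start = max(t, lock_free_at)      # wait if another worker holds the lock
        lock_free_at = start + locked     # work done while holding the lock
        done = lock_free_at + unlocked    # remaining work, done after release
        finish = max(finish, done)
        if left > 1:
            heapq.heappush(ready, (done, w, left - 1))
    return finish

if __name__ == "__main__":
    for workers in (1, 4):
        long_hold = makespan(workers, 100, locked=9, unlocked=1)
        short_hold = makespan(workers, 100, locked=1, unlocked=9)
        print(f"{workers} worker(s): long hold={long_hold}, short hold={short_hold}")
```

With one worker the two variants take the same time in this model, because
nobody ever waits for the lock. With four workers the long-hold variant
serializes on the lock, while the short-hold variant lets the workers overlap
almost completely.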

Regards,

Andres


