Re: GenBKI emits useless open;close for catalogs without rows

From: Matthias van de Meent
Subject: Re: GenBKI emits useless open;close for catalogs without rows
Date:
Msg-id: CAEze2WidWxspc8LaQAQF3bURGjZm0yCN=eA_9cbq1=8zE9LvPQ@mail.gmail.com
In response to: Re: GenBKI emits useless open;close for catalogs without rows  (Andres Freund <andres@anarazel.de>)
Responses: Re: GenBKI emits useless open;close for catalogs without rows  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
List: pgsql-hackers
On Fri, 22 Sept 2023 at 00:25, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-09-19 21:05:41 +0300, Heikki Linnakangas wrote:
> > On 18/09/2023 17:50, Matthias van de Meent wrote:
> > > (initdb takes about 73ms locally with syncing disabled)
> >
> > That's impressive. It takes about 600 ms on my laptop. Of which about 140 ms
> > goes into processing the BKI file. And that's with the "initdb --no-sync" option.
>
> I think there must be a digit missing in Matthias' numbers.

Yes, kind of. The run was on 50 iterations, not the assumed 250.
Also note that the improved measurements were recorded inside the
bootstrap-mode PostgreSQL instance, not inside the initdb that was
processing the postgres.bki file. So it might well be that I didn't
improve the total timing by much.

> > > Various methods of reducing the size of postgres.bki were applied, as
> > > detailed in the patch's commit message. I believe the current output
> > > is still quite human readable.
> >
> > Overall this does not seem very worthwhile to me.
>
> Because the wins are too small?
>
> FWIW, Making postgres.bki smaller and improving bootstrapping time does seem
> worthwhile to me. But it doesn't seem quite right to handle the batching in
> the file format, it should be on the server side, no?

The main reason I did the batching in the file format is to reduce
the storage overhead of the current format's one "INSERT" token per
row. Batching improved on that by replacing the per-row token with a
different construct, but it's not necessarily the only solution. The
actual parser still inserts the tuples into the relation one by one,
as I didn't spend time on making a simple_heap_insert analog for bulk
insertions.
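
To sketch what such a server-side bulk path could look like, here's a
rough outline built on the existing heap_multi_insert() API. The name
simple_heap_multi_insert and the slot handling are just illustration;
none of this is in the patch:

/*
 * Hypothetical bulk analog of simple_heap_insert(), wrapping the
 * existing heap_multi_insert() API.  Sketch only, not part of the patch.
 */
#include "postgres.h"

#include "access/heapam.h"
#include "access/xact.h"
#include "executor/tuptable.h"
#include "utils/rel.h"

static void
simple_heap_multi_insert(Relation rel, HeapTuple *tuples, int ntuples)
{
    TupleTableSlot **slots = palloc(ntuples * sizeof(TupleTableSlot *));

    /* heap_multi_insert() works on slots, so wrap the formed tuples. */
    for (int i = 0; i < ntuples; i++)
    {
        slots[i] = MakeSingleTupleTableSlot(RelationGetDescr(rel),
                                            &TTSOpsHeapTuple);
        ExecStoreHeapTuple(tuples[i], slots[i], false);
    }

    /* One call replaces ntuples separate simple_heap_insert() calls. */
    heap_multi_insert(rel, slots, ntuples,
                      GetCurrentCommandId(true),
                      0 /* options */, NULL /* bistate */);

    for (int i = 0; i < ntuples; i++)
        ExecDropSingleTupleTableSlot(slots[i]);
    pfree(slots);
}

The bootstrap parser would still have to collect formed tuples into
batches before calling something like this, and whether the slot churn
pays off at catalog sizes is an open question.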

> We really should stop emitting WAL during initdb...

I think it's quite elegant that we're able to bootstrap the relation
data of a new PostgreSQL cluster from the WAL generated in another
cluster, even if it is indeed a bit wasteful. I do see your point
though - the WAL shouldn't be needed if we're already fsyncing the
files to disk.
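
For what it's worth, the shape that could take is roughly the existing
wal_level=minimal pattern: skip the per-tuple WAL records and instead
flush and sync the relation files before bootstrap hands the cluster
over. A very rough sketch of the sync half, with a made-up helper name
(bootstrap_sync_relation) that isn't in any patch here:

/*
 * Hypothetical helper: if bootstrap skipped WAL for catalog inserts,
 * each relation file would instead have to be flushed and synced
 * before the cluster is usable.  Sketch only.
 */
#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/smgr.h"
#include "utils/rel.h"

static void
bootstrap_sync_relation(Relation rel)
{
    /* Write out any dirty shared buffers for this relation... */
    FlushRelationBuffers(rel);

    /* ...and make sure the main fork is durably on disk. */
    smgrimmedsync(RelationGetSmgr(rel), MAIN_FORKNUM);
}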

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)


