RE: BUG #18016: REINDEX TABLE failure

Поиск

Список

Период

Сортировка

От	Richard Veselý
Тема	RE: BUG #18016: REINDEX TABLE failure
Дата	11 июля 2023 г. 09:31:06
Msg-id	AM9PR02MB675496D71C5F610056BF7FBB9F31A@AM9PR02MB6754.eurprd02.prod.outlook.com обсуждение исходный текст
Ответ на	Re: BUG #18016: REINDEX TABLE failure (Gurjeet Singh <gurjeet@singh.im>)
Список	pgsql-hackers

Дерево обсуждения

Hi Gurjeet,

Thank you for the follow-up. I was worried my message got buried in the middle of the thread. I also appreciate your
workon the patch to fix/improve the REINDEX TABLE behavior even though most people would never encounter it in the
wild.

As a preface I would first like to say that I can appreciate the emphasis on general maintainability of the codebase,
tryingto avoid having some overly clever hacks that might impede understanding, having ideally one way of doing things
likehaving a common database page structure, etc. The more one keeps to this "happy" path the better the general state
ofthe project end up by keeping it accessible to the rest of the community and attracting more contributions in turn.

That being said, PostgreSQL can be extremely conservative in scenarios where it might not be warranted while giving a
limitedopportunity to influence said behavior. This often leads to a very low hardware resource utilization. You can
easilyfind many instances across StackOverflow, dba.stackexchange.com, /r/postgres and pgsql-performance where people
runinto ingress/egress bottlenecks even though their hardware can trivially support much larger workload.

In my experience, you can be very hard-pressed in many cases to saturate even a modest enterprise HDD while observing
theofficial guidelines (https://www.postgresql.org/docs/current/populate.html), e.g. minimal WAL and host of other
configurationoptimizations, having no indexes and constraints and creating table and filling it with binary COPY within
thesame transaction. And the situation with pg_dump/pg_restore is often much worse.

Is there an interest in improving the current state of affairs? I will be rewriting the indexing first to get the whole
picture,but I can already tell you that there is a -lot- of performance left on the table even considering the effort
toimprove COPY performance in PostgreSQL 16. Given sufficient hardware, you should always be heavily IO-bound without
exceptionand saturate any reasonable number of NVMe SSDs.

Best regards,
Richard

-----Original Message-----
From: Gurjeet Singh <gurjeet@singh.im> 
Sent: Monday, July 10, 2023 6:44 PM
To: Richard Veselý <richard.vesely@softea.sk>; Postgres Hackers <pgsql-hackers@postgresql.org>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; Michael Paquier <michael@paquier.xyz>
Subject: Re: BUG #18016: REINDEX TABLE failure

On Sun, Jul 9, 2023 at 7:21 AM Richard Veselý <richard.vesely@softea.sk> wrote:
>
> ... there's no shortage of people that suffer from sluggish pg_dump/pg_restore cycle and I imagine there are any
numberof people that would be interested in improving bulk ingestion which is often a bottleneck for analytical
workloadsas you are well aware. What's the best place to discuss this topic further - pgsql-performance or someplace
else?

(moved conversation to -hackers, and moved -bugs to BCC)

> I was dissatisfied with storage layer performance, especially during 
> the initial database population, so I rewrote it for my use case. I'm 
> done with the heap, but for the moment I still rely on PostgreSQL to 
> build indexes,

It sounds like you've developed a method to speed up loading of tables, and might have ideas/suggestions for speeding
upCREATE INDEX/REINDEX. The -hackers list feels like a place to discuss such changes.

Best regards,
Gurjeet
http://Gurje.et

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

RE: BUG #18016: REINDEX TABLE failure