Discussion: Parallelizing startup with many databases

Parallelizing startup with many databases

From

Babak Ghadiri

Date:

2 January, 10:55:07

Hello,
I hope you are doing well.

In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.

I would like to ask:

- Is the largely single-threaded startup behavior a fundamental architectural
constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
an unimplemented optimization?
- Are there any existing discussions, patches, versions (18+) to parallelize parts of startup or otherwise improve startup scalability with many databases?
- Are there any PostgreSQL configuration settings known to dramatically reduce startup time, or is startup performance mostly fixed by architecture in this scenario?

I understand that having many databases in a single cluster is not the most
common or recommended multi-tenant model, but this is an existing system and
I’m trying to better understand the current limits and future direction.

Thank you for your time and insights.

Best regards.

Re: Parallelizing startup with many databases

From

Andreas Karlsson

Date:

3 January, 02:38:10

On 1/2/26 8:55 AM, Babak Ghadiri wrote:
> In PostgreSQL 16, startup appears to initialize databases sequentially and
> primarily uses a single CPU core. In clusters with a very large number of
> databases (around 5,000 in our case), this results in noticeably long
> startup times after restarts or crash recovery.

Have you measured what is actually causing the slow startup? Without 
knowing what is actually slow it is hard to say if threading would even 
help.

How slow are we talking about and have you managed to create a minimal 
case for reproducing the issue?
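
One way to get a first number for "how slow" is to time a blocking start of the server. The sketch below is a generic wall-clock timer, not something taken from this thread. The function name `time_command` is made up for illustration, and the `pg_ctl` invocation in the comment assumes `pg_ctl` is on PATH and uses a placeholder data directory.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd to completion and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd, check=False)
    return result.returncode, time.monotonic() - start


# Hypothetical usage against a real cluster (path is a placeholder):
#   code, secs = time_command(["pg_ctl", "start", "-w", "-D", "/path/to/data"])
#   print(f"exit={code} elapsed={secs:.2f}s")
```

This only says how long startup takes, not where the time goes; a profile of the startup process is still needed for that.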

> - Is the largely single-threaded startup behavior a fundamental architectural
>   constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
>   an unimplemented optimization?

PostgreSQL does not support threading; it uses a multi-process model to
implement, for example, parallel queries. And there is no way threading
would be introduced just to improve startup performance.

> - Are there any existing discussions, patches, versions (18+) to 
> parallelize parts of startup or otherwise improve startup scalability 
> with many databases?

Not as far as I am aware but you can search our archives.

> - Are there any PostgreSQL configuration settings known to dramatically 
> reduce startup time, or is startup performance mostly fixed by 
> architecture in this scenario?

I would first start trying to figure out why startup is slow before 
doing anything else.

Andreas

Re: Parallelizing startup with many databases

From

Tom Lane

Date:

3 January, 03:02:27

Andreas Karlsson <andreas@proxel.se> writes:
> On 1/2/26 8:55 AM, Babak Ghadiri wrote:
>> In PostgreSQL 16, startup appears to initialize databases sequentially and
>> primarily uses a single CPU core. In clusters with a very large number of
>> databases (around 5,000 in our case), this results in noticeably long
>> startup times after restarts or crash recovery.

> Have you measured what is actually causing the slow startup? Without 
> knowing what is actually slow it is hard to say if threading would even 
> help.

"perf" results would likely be useful.

I tried creating 5000 databases here and didn't notice any particular
increase in server startup time (didn't try crash-recovery case).
So whatever this is is likely somewhat configuration- or
platform-dependent.
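
For anyone who wants to repeat a test like this, a minimal sketch for generating the databases follows. It is not necessarily how the test above was done; the function name and the `tenant_` prefix are invented for illustration.

```python
def create_database_statements(count, prefix="tenant_"):
    """Return one CREATE DATABASE statement per generated database name."""
    # Zero-pad the counter so names sort naturally (tenant_0001 ... tenant_5000).
    width = len(str(count))
    return [
        f'CREATE DATABASE "{prefix}{i:0{width}d}";'
        for i in range(1, count + 1)
    ]
```

Printing the result with `print("\n".join(create_database_statements(5000)))` and piping it into `psql -X -d postgres` would create the databases one statement at a time on a scratch cluster.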

Having said that, 5000 databases sounds like an anti-pattern to
begin with.  You're paying for an additional copy of the system
catalogs for each one.

            regards, tom lane

Re: Parallelizing startup with many databases

From

Jelte Fennema-Nio

Date:

3 January, 03:58:25

On Fri, Jan 2, 2026, 08:55 Babak Ghadiri <bbkghadiri6@gmail.com> wrote:

> Hello,
> I hope you are doing well.
>
> In PostgreSQL 16, startup appears to initialize databases sequentially and
> primarily uses a single CPU core. In clusters with a very large number of
> databases (around 5,000 in our case), this results in noticeably long
> startup times after restarts or crash recovery.

You probably want to consider setting:

recovery_init_sync_method=syncfs

I'm 99% certain that that will solve your problem.

https://www.postgresql.org/docs/current/runtime-config-error-handling.html

https://www.postgresql.org/message-id/flat/11bc2bb7-ecb5-3ad0-b39f-df632734cd81@discourse.org
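
For context: per the documentation linked above, the default value of recovery_init_sync_method is fsync, which recursively opens and synchronizes every file in the data directory before crash recovery begins, while syncfs (Linux only) asks the operating system to synchronize the whole file system instead. The cost of the default therefore grows with the number of files. A rough sketch for checking how many files that is on a given cluster follows; the function name is made up, and it deliberately does not follow symlinks, so tablespaces or a symlinked pg_wal would have to be counted separately.

```python
import os


def count_files(root):
    """Count non-directory entries under root, not descending into symlinked dirs."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        total += len(filenames)
    return total
```

Running `count_files("/path/to/data")` (placeholder path) on a cluster with thousands of databases gives an idea of how many files the fsync method would have to open one by one.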

PS It took me way too long to find that setting. I think we should move it from the error handling docs page to the page with all of the other recovery settings. https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY

Re: Parallelizing startup with many databases

From

Andreas Karlsson

Date:

3 January, 22:09:00

On 1/3/26 1:58 AM, Jelte Fennema-Nio wrote:
> PS It took me way too long to find that setting. I think we should move
> it from the error handling docs page to the page with all of the other
> recovery settings.
> https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY

I agree that it is currently not in a great location, but the
issue is that the "Recovery" section is a subsection of the "WAL"
section, and syncing the data directory is only loosely related to WAL.
One could argue it is related to WAL in that it is something done
before replaying WAL, but that is not an obvious location either. Or is it?

Andreas
