Discussion: Parallelizing startup with many databases
Hello,
I hope you are doing well.
In PostgreSQL 16, startup appears to initialize databases sequentially and
primarily uses a single CPU core. In clusters with a very large number of
databases (around 5,000 in our case), this results in noticeably long
startup times after restarts or crash recovery.
I would like to ask:
- Is the largely single-threaded startup behavior a fundamental architectural
constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
an unimplemented optimization?
- Are there any existing discussions, patches, versions (18+) to parallelize parts of startup or otherwise improve startup scalability with many databases?
- Are there any PostgreSQL configuration settings known to dramatically reduce startup time, or is startup performance mostly fixed by architecture in this scenario?
I understand that having many databases in a single cluster is not the most
common or recommended multi-tenant model, but this is an existing system and
I’m trying to better understand the current limits and future direction.
Thank you for your time and insights.
Best regards.
On 1/2/26 8:55 AM, Babak Ghadiri wrote:
> In PostgreSQL 16, startup appears to initialize databases sequentially and
> primarily uses a single CPU core. In clusters with a very large number of
> databases (around 5,000 in our case), this results in noticeably long
> startup times after restarts or crash recovery.
Have you measured what is actually causing the slow startup? Without
knowing what is actually slow it is hard to say if threading would even
help. How slow are we talking about, and have you managed to create a
minimal case for reproducing the issue?
> - Is the largely single-threaded startup behavior a fundamental architectural
> constraint (e.g. catalog dependencies, locking, recovery ordering), or mainly
> an unimplemented optimization?
PostgreSQL does not support threading; it uses a multi-process model to
implement, for example, parallel queries. And there is no way threading
would be introduced just to improve startup performance.
> - Are there any existing discussions, patches, versions (18+) to
> parallelize parts of startup or otherwise improve startup scalability
> with many databases?
Not as far as I am aware, but you can search our archives.
> - Are there any PostgreSQL configuration settings known to dramatically
> reduce startup time, or is startup performance mostly fixed by
> architecture in this scenario?
I would first try to figure out why startup is slow before doing
anything else.
Andreas
Andreas Karlsson <andreas@proxel.se> writes:
> On 1/2/26 8:55 AM, Babak Ghadiri wrote:
>> In PostgreSQL 16, startup appears to initialize databases sequentially and
>> primarily uses a single CPU core. In clusters with a very large number of
>> databases (around 5,000 in our case), this results in noticeably long
>> startup times after restarts or crash recovery.
> Have you measured what is actually causing the slow startup? Without
> knowing what is actually slow it is hard to say if threading would even
> help.
"perf" results would likely be useful.
I tried creating 5000 databases here and didn't notice any particular
increase in server startup time (didn't try crash-recovery case).
So whatever this is is likely somewhat configuration- or
platform-dependent.
Having said that, 5000 databases sounds like an anti-pattern to
begin with. You're paying for an additional copy of the system
catalogs for each one.
regards, tom lane
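For anyone who wants to repeat the 5000-database experiment mentioned above, here is a minimal sketch. It only generates the SQL statements. The `startup_test_` prefix and the function name are made up for illustration and do not come from the original messages; Python is used purely for convenience.

```python
"""Generate CREATE DATABASE statements for a startup-time reproduction (sketch)."""


def create_database_statements(count, prefix="startup_test_"):
    """Return `count` CREATE DATABASE statements with zero-padded names.

    The prefix and naming scheme are hypothetical; choose anything that
    cannot collide with real databases in the cluster under test.
    """
    width = len(str(count))  # pad so the names sort in creation order
    return [
        f'CREATE DATABASE "{prefix}{i:0{width}d}";'
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    # 5000 mirrors the number of databases reported in the thread.
    print("\n".join(create_database_statements(5000)))
```

The output can be piped into psql connected to a scratch cluster, for example `python3 gen_dbs.py | psql -X -d postgres` (the script name is hypothetical). Timing a normal restart and a restart after `pg_ctl stop -m immediate`, which forces crash recovery on the next start, would then give a baseline for comparison.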
On Fri, Jan 2, 2026, 08:55 Babak Ghadiri <bbkghadiri6@gmail.com> wrote:
> Hello,
> I hope you are doing well.
> In PostgreSQL 16, startup appears to initialize databases sequentially and
> primarily uses a single CPU core. In clusters with a very large number of
> databases (around 5,000 in our case), this results in noticeably long
> startup times after restarts or crash recovery.
You probably want to consider setting:
recovery_init_sync_method=syncfs
I'm 99% certain that that will solve your problem.
PS It took me way too long to find that setting. I think we should move it from the error handling docs page to the page with all of the other recovery settings: https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY
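As background to the suggestion above: with the default recovery_init_sync_method=fsync, the server recursively opens and synchronizes every file in the data directory before crash recovery begins, so the cost grows with the number of files. With syncfs (Linux only) it instead asks the operating system to synchronize the whole file systems involved. One rough way to see how many files are in play is simply to count them. The sketch below approximates that count and is not what the server does internally; the function name is hypothetical, and reading PGDATA from the environment is only a convenience.

```python
"""Count files under a PostgreSQL data directory (rough sketch)."""
import os


def count_entries(root):
    """Return (files, directories) found below `root`.

    Symlinked directories are followed so that tablespaces linked from
    pg_tblspc and a relocated pg_wal are included. Unreadable directories
    are silently skipped, which is os.walk's default behaviour.
    """
    files = dirs = 0
    for _path, dirnames, filenames in os.walk(root, followlinks=True):
        dirs += len(dirnames)
        files += len(filenames)
    return files, dirs


if __name__ == "__main__":
    # Assumption: PGDATA points at the cluster; do nothing if it is unset.
    data_dir = os.environ.get("PGDATA")
    if data_dir:
        n_files, n_dirs = count_entries(data_dir)
        print(f"{data_dir}: {n_files} files in {n_dirs} directories")
```

If the count runs into the millions, per-file fsync during crash recovery is a plausible suspect, though only a measurement such as the "perf" profile requested earlier in the thread would confirm it.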
On 1/3/26 1:58 AM, Jelte Fennema-Nio wrote:
> PS It took me way too long to find that setting. I think we should move
> it from the error handling docs page to the page with all of the other
> recovery settings.
> https://www.postgresql.org/docs/current/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY
I agree that it is currently not in exactly a great location, but the
issue is that the "Recovery" section is a subsection of the "WAL" section,
and syncing the data directory is only loosely related to WAL. One could
argue it is related to WAL in that it is something done before replaying
WAL, but it is not an obvious location either. Or is it?
Andreas