Re: optimizing pg_upgrade's once-in-each-database steps
От | Nathan Bossart |
---|---|
Тема | Re: optimizing pg_upgrade's once-in-each-database steps |
Дата | |
Msg-id | ZoMHsBasBy8LqkEU@nathan обсуждение исходный текст |
Ответ на | Re: optimizing pg_upgrade's once-in-each-database steps (Nathan Bossart <nathandbossart@gmail.com>) |
Ответы |
Re: optimizing pg_upgrade's once-in-each-database steps
|
Список | pgsql-hackers |
I figured I'd post what I have so far since this thread hasn't been updated in a while. The attached patches are still "proof-of-concept grade," but they are at least moving in the right direction (IMHO). The variable naming is still not great, and they are woefully undercommented, among other things. 0001 introduces a new API for registering callbacks and running them in parallel on all databases in the cluster. This new system manages a set of "slots" that follow a simple state machine to asynchronously establish a connection and run the queries. It uses system() to wait for these asynchronous tasks to complete. Users of this API only need to provide two callbacks: one to return the query that should be run on each database and another to process the results of that query. If multiple queries are required for each database, users can provide multiple sets of callbacks. The other patches change several of the existing tasks to use this new API. With these patches applied, I see the following differences in the output of 'pg_upgrade | ts -i' for a cluster with 1k empty databases: WITHOUT PATCH 00:00:19 Checking database user is the install user ok 00:00:02 Checking for subscription state ok 00:00:06 Adding ".old" suffix to old global/pg_control ok 00:00:04 Checking for extension updates ok WITH PATCHES (--jobs 1) 00:00:10 Checking database user is the install user ok 00:00:02 Checking for subscription state ok 00:00:07 Adding ".old" suffix to old global/pg_control ok 00:00:05 Checking for extension updates ok WITH PATCHES (--jobs 4) 00:00:06 Checking database user is the install user ok 00:00:00 Checking for subscription state ok 00:00:02 Adding ".old" suffix to old global/pg_control ok 00:00:01 Checking for extension updates ok Note that the "Checking database user is the install user" time also includes the call to get_db_rel_and_slot_infos() on the old cluster as well as the call to get_loadable_libraries() on the old cluster. I believe the improvement with the patches with just one job is due to the consolidation of the queries into one database connection (presently, get_db_rel_and_slot_infos() creates 3 connections per database for some upgrades). Similarly, the "Adding \".old\" suffix to old global/pg_control" time includes the call to get_db_rel_and_slot_infos() on the new cluster. There are several remaining places where we could use this new API to speed up upgrades. For example, I haven't attempted to use it for the data type checks yet, and that tends to eat up a sizable chunk of time when there are many databases. On Thu, May 16, 2024 at 08:24:08PM -0500, Nathan Bossart wrote: > On Thu, May 16, 2024 at 05:09:55PM -0700, Jeff Davis wrote: >> Also, did you consider connecting once to each database and running >> many queries? Most of those seem like just checks. > > This was the idea behind 347758b. It may be possible to do more along > these lines. IMO parallelizing will still be useful even if we do combine > more of the steps. My current thinking is that any possible further consolidation should happen as part of a follow-up effort to parallelization. I'm cautiously optimistic that the parallelization work will make the consolidation easier since it moves things to rigidly-defined callback functions. A separate piece of off-list feedback from Michael Paquier is that this new parallel system might be something we can teach the ParallelSlot code used by bin/scripts/ to do. I've yet to look too deeply into this, but I suspect that it will be difficult to combine the two. For example, the ParallelSlot system doesn't seem well-suited for the kind of run-once-in-each-database tasks required by pg_upgrade, and the error handling is probably little different, too. However, it's still worth a closer look, and I'm interested in folks' opinions on the subject. -- nathan
Вложения
- v2-0001-introduce-framework-for-parallelizing-pg_upgrade-.patch
- v2-0002-use-new-pg_upgrade-async-API-for-subscription-sta.patch
- v2-0003-move-live_check-variable-to-user_opts.patch
- v2-0004-use-new-pg_upgrade-async-API-for-retrieving-relin.patch
- v2-0005-use-new-pg_upgrade-async-API-to-parallelize-getti.patch
- v2-0006-use-new-pg_upgrade-async-API-to-parallelize-repor.patch
В списке pgsql-hackers по дате отправления: