optimizing pg_upgrade's once-in-each-database steps
От | Nathan Bossart |
---|---|
Тема | optimizing pg_upgrade's once-in-each-database steps |
Дата | |
Msg-id | 20240516211638.GA1688936@nathanxps13 обсуждение исходный текст |
Ответы |
Re: optimizing pg_upgrade's once-in-each-database steps
|
Список | pgsql-hackers |
A number of pg_upgrade steps require connecting to each database and running a query. When there are many databases, these steps are particularly time-consuming, especially since this is done sequentially in a single process. At a quick glance, I see the following such steps: * create_logical_replication_slots * check_for_data_types_usage * check_for_isn_and_int8_passing_mismatch * check_for_user_defined_postfix_ops * check_for_incompatible_polymorphics * check_for_tables_with_oids * check_for_user_defined_encoding_conversions * check_old_cluster_subscription_state * get_loadable_libraries * get_db_rel_and_slot_infos * old_9_6_invalidate_hash_indexes * report_extension_updates I set out to parallelize these kinds of steps via multiple threads or processes, but I ended up realizing that we could likely achieve much of the same gain with libpq's asynchronous APIs. Specifically, both establishing the connections and running the queries can be done without blocking, so we can just loop over a handful of slots and advance a simple state machine for each. The attached is a proof-of-concept grade patch for doing this for get_db_rel_and_slot_infos(), which yielded the following results on my laptop for "pg_upgrade --link --sync-method=syncfs --jobs 8" for a cluster with 10K empty databases. total pg_upgrade_time: * HEAD: 14m 8s * patch: 10m 58s get_db_rel_and_slot_infos() on old cluster: * HEAD: 2m 45s * patch: 36s get_db_rel_and_slot_infos() on new cluster: * HEAD: 1m 46s * patch: 29s I am posting this early to get thoughts on the general approach. If we proceeded with this strategy, I'd probably create some generic tooling that each relevant step would provide a set of callback functions. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Вложения
В списке pgsql-hackers по дате отправления: