improve performance of pg_dump --binary-upgrade
| From | Nathan Bossart |
|---|---|
| Subject | improve performance of pg_dump --binary-upgrade |
| Date | |
| Msg-id | 20240418041712.GA3441570@nathanxps13 |
| Replies | Re: improve performance of pg_dump --binary-upgrade<br>Re: improve performance of pg_dump --binary-upgrade |
| List | pgsql-hackers |
While examining pg_upgrade on a cluster with many tables (created with the command in [0]), I noticed that a huge amount of pg_dump time goes towards the binary_upgrade_set_pg_class_oids() function. This function executes a rather expensive query for a single row, and it appears to be called for most of the rows in pg_class.

The attached work-in-progress patch speeds up 'pg_dump --binary-upgrade' for this case. Instead of executing the query in every call to the function, we can execute it once during the first call and store all the required information in a sorted array that we can bsearch() in future calls. For the aforementioned test, pg_dump on my machine goes from ~2 minutes to ~18 seconds, which is much closer to the ~14 seconds it takes without --binary-upgrade.

One downside of this approach is the memory usage. This was more-or-less the first approach that crossed my mind, so I wouldn't be surprised if there's a better way. I tried to keep the pg_dump output the same, but if that isn't important, maybe we could dump all the pg_class OIDs at once instead of calling binary_upgrade_set_pg_class_oids() for each one.

[0] https://postgr.es/m/3612876.1689443232%40sss.pgh.pa.us

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
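For readers who want a picture of the "query once, then bsearch()" idea described above, here is a minimal standalone sketch. It is not the attached patch: the struct RelInfoEntry, its fields, and the functions rel_cache_build() and rel_cache_lookup() are hypothetical names made up for illustration, and the rows are passed in as a plain array where the real code would presumably fill them from a single catalog query on the first call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-relation record; the field names are illustrative only. */
typedef struct RelInfoEntry
{
    uint32_t oid;          /* pg_class OID, used as the lookup key */
    uint32_t relfilenode;  /* example payload */
    uint32_t toast_oid;    /* example payload, 0 if none */
} RelInfoEntry;

/* The cache: built once, then only read. */
static RelInfoEntry *rel_cache = NULL;
static size_t rel_cache_len = 0;

/* Comparator shared by qsort() and bsearch(); orders entries by OID. */
int
rel_entry_cmp(const void *a, const void *b)
{
    uint32_t x = ((const RelInfoEntry *) a)->oid;
    uint32_t y = ((const RelInfoEntry *) b)->oid;

    return (x > y) - (x < y);
}

/*
 * Stand-in for "run the expensive query once": copy all rows into the
 * cache and sort them by OID.  Assumes nrows > 0.
 */
void
rel_cache_build(const RelInfoEntry *rows, size_t nrows)
{
    rel_cache = malloc(nrows * sizeof(RelInfoEntry));
    assert(rel_cache != NULL);
    memcpy(rel_cache, rows, nrows * sizeof(RelInfoEntry));
    rel_cache_len = nrows;
    qsort(rel_cache, rel_cache_len, sizeof(RelInfoEntry), rel_entry_cmp);
}

/* Per-relation lookup: O(log n) binary search instead of a query per call. */
const RelInfoEntry *
rel_cache_lookup(uint32_t oid)
{
    RelInfoEntry key = {0};

    if (rel_cache == NULL)
        return NULL;
    key.oid = oid;
    return bsearch(&key, rel_cache, rel_cache_len,
                   sizeof(RelInfoEntry), rel_entry_cmp);
}
```

The memory downside mentioned above is visible here: every row stays resident in rel_cache for the life of the process, whether or not it is ever looked up.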