Re: Certain options in pg_upgrade don't seem to work.

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Certain options in pg_upgrade don't seem to work.
Дата
Msg-id 210ebcb9-4bf0-45c5-99bc-c0a2c8b77df4@vondra.me
обсуждение исходный текст
Список pgsql-bugs

On 8/14/24 03:41, DBA wrote:
> Hello, I'm inquiring because --j option don't seem to be working
> properly in pg_upgrade.
> 
> 
> *1. A description of what you are trying to achieve and what results you
> expect : *
> 
> 
> We took to much time to upgrade postgresql with pg_upgrade.
> 
> 
> So we have tested with --j option in pg_upgrade by below command 
> 
> 
> *time PGPASSWORD='1q2w3e4r' nohup /usr/lib/postgresql/16/bin/pg_upgrade
> -b /usr/lib/postgresql/15/bin/ -B /usr/lib/postgresql/16/bin/ -d
> /data/PG15 -D /data/PG16 -k -j 16 -U ecount_own > /data/tmp4/upgrade.log &*
> 
> 
> It took 520 minutes to upgrade and we used -j option to make pg_upgrade
> use multiple CPU processor but it didn't use them. (Only use single
> processor) 
> 
> (It took the same amount of time as when tested without --j option.)
> 
> 
> Is it normal not to use multiple cpu cores even with the --j option?
> 
> 
> The docs says that you use multi-processors for restoring.
> 
> 
> *note.)* We have about 80 partition tables and each partition table has
> 9000 children tables.
> 
> 
>        I think it could be affected by the number of partition tables.
> 

Very likely. The thing is, pg_upgrade does multiple tasks, and only some
of those are parallelized - the pg_upgrade [1] docs say:

   The --jobs option allows multiple CPU cores to be used for
   copying/linking of files and to dump and restore database schemas
   in parallel; a good place to start is the maximum of the number of
   CPU cores and tablespaces. This option can dramatically reduce the
   time to upgrade a multi-database server running on a multiprocessor
   machine.

So it's only the "linking" of files that's done using multiple jobs:

   ...
   Linking user relation files                                   ok
   ...

But with 80 partitioned tables, and 9000 partitions for each, the most
expensive step is likely dump/restore of the schema. And those are not
parallelized - it always happens in a single process.

You can confirm this by watching the pg_upgrade output, to see which
steps take most of the time.


regards

-- 
Tomas Vondra



В списке pgsql-bugs по дате отправления: