Re: pg_upgrade parallelism

From Bruce Momjian
Subject Re: pg_upgrade parallelism
Msg-id 20211118224324.GA8246@momjian.us
In response to pg_upgrade parallelism  (Jaime Casanova <jcasanov@systemguards.com.ec>)
List pgsql-hackers
On Wed, Nov 17, 2021 at 02:44:52PM -0500, Jaime Casanova wrote:
> Hi,
> 
> Currently the docs about pg_upgrade say:
> 
> """
>     <para>
>      The <option>--jobs</option> option allows multiple CPU cores to be used
>      for copying/linking of files and to dump and reload database schemas
>      in parallel;  a good place to start is the maximum of the number of
>      CPU cores and tablespaces.  This option can dramatically reduce the
>      time to upgrade a multi-database server running on a multiprocessor
>      machine.
>     </para>
> """
> 
> Which makes the user think that the --jobs option could use all CPU
> cores. Which is not true. Or that it has anything to do with multiple
> databases, which is true only to some extent.

Uh, the behavior is a little more complicated.  The --jobs option in
pg_upgrade is used to parallelize three operations:

*  copying relation files

*  dumping old cluster objects (via parallel_exec_prog())

*  creating objects in the new cluster (via parallel_exec_prog())

The last two basically operate on databases in parallel --- they can't
dump/load a single database in parallel, but they can dump/load several
databases in parallel.
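
To make the "one database at a time per worker" point concrete, here is a
minimal sketch in Python. It is not pg_upgrade's code (that is C and goes
through parallel_exec_prog()); the names effective_workers and
run_per_database are made up for illustration, and threads stand in for
whatever child tasks the real tool starts. It only shows that the unit of
work is a whole database, so a one-database cluster gets no parallelism no
matter how large --jobs is.

```python
from concurrent.futures import ThreadPoolExecutor


def effective_workers(jobs, n_databases):
    """Workers that can actually be busy: a single database is never split."""
    return max(1, min(jobs, n_databases))


def run_per_database(databases, jobs, work):
    """Apply work(db) to every database, at most `jobs` at a time.

    `work` is a hypothetical callable standing in for "dump this database"
    or "restore this database"; each call handles one whole database.
    """
    workers = effective_workers(jobs, len(databases))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with `databases`.
        return list(pool.map(work, databases))
```

For example, effective_workers(35, 1) is 1, while effective_workers(4, 10)
is 4.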

The documentation you quote above is saying that you set jobs based on
the number of CPUs (for dump/reload, which are assumed to be CPU bound)
and the number of tablespaces (for file copying, which is assumed to be
I/O bound).

I am not sure how we can improve that text.  We could just say the max
of the number of databases and tablespaces, but then the number of CPUs
needs to be involved since, if you only have one CPU core, you don't
want parallel dumps/loads happening since that will just cause CPU
contention with little benefit.  We mention tablespaces because even if
you only have one CPU core, since tablespace copying is I/O bound, you
can still benefit from --jobs.
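
The rule of thumb from the quoted documentation ("the maximum of the number
of CPU cores and tablespaces") can be written down as a small Python sketch.
The function name suggested_jobs is hypothetical, and this is only a
starting point for choosing a value, not something pg_upgrade computes for
you.

```python
import os


def suggested_jobs(n_tablespaces, n_cpu_cores=None):
    """Starting value for --jobs: max(CPU cores, tablespaces).

    CPU cores matter for dump/reload (assumed CPU bound); tablespaces
    matter for file copying (assumed I/O bound).
    """
    if n_cpu_cores is None:
        # os.cpu_count() may return None on unusual platforms.
        n_cpu_cores = os.cpu_count() or 1
    return max(n_cpu_cores, n_tablespaces)
```

With one CPU core and three tablespaces this gives 3, which matches the
reasoning above: the copying can still overlap even though the dumps
cannot. The result would then be passed as pg_upgrade --jobs=N.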

> What that option really improves is upgrading servers with multiple
> tablespaces. Of course, if --link or --clone is used, pg_upgrade is still
> very fast, but used with the --copy option it is not what one could expect.
> 
> As an example, a customer with a 25TB database, 40 cores and lots of RAM
> used --jobs=35 and got only 7 processes (they have 6 tablespaces) and
> the disks were not used at maximum speed either. They expected 35
> processes copying lots of files at the same time.
> 
> So, first I would like to improve documentation. What about something
> like the attached? 
> 
> Now, a couple of questions:
> 
> - in src/bin/pg_upgrade/file.c at copyFile() we define a buffer to
>   determine the amount of bytes that should be used in read()/write() to
>   copy the relfilenode segments. And we define it as (50 * BLCKSZ),
>   which is 400Kb. Isn't this too small?

Uh, if you find that increasing that helps, we can increase it --- I
don't know how that value was chosen.  However, we are really just
copying the data into the kernel, not forcing it to storage, so I don't
know if a larger value would help.
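
For anyone who wants to measure this, here is a rough Python sketch of a
read()/write() copy loop with the buffer size as a parameter. It is not the
copyFile() code from file.c; the name copy_file is made up, BLCKSZ = 8192 is
the default block size and is an assumption about the build, and the sketch
assumes a POSIX system. As in the case discussed above, nothing is synced to
storage, so the writes only land in the kernel's cache.

```python
import os

BLCKSZ = 8192                 # assumed default PostgreSQL block size
COPY_BUF_SIZE = 50 * BLCKSZ   # 409600 bytes, the "400Kb" mentioned above


def copy_file(src, dst, buf_size=COPY_BUF_SIZE):
    """Copy src to dst with plain read()/write() calls; return bytes copied."""
    total = 0
    src_fd = os.open(src, os.O_RDONLY)
    try:
        # O_EXCL: refuse to overwrite an existing destination file.
        dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        try:
            while True:
                chunk = os.read(src_fd, buf_size)
                if not chunk:          # empty read means end of file
                    break
                view = memoryview(chunk)
                while view:            # write() may write fewer bytes than asked
                    written = os.write(dst_fd, view)
                    view = view[written:]
                total += len(chunk)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return total
```

Timing copy_file() on a large file with a few different buf_size values
would show whether a bigger buffer makes any difference on a given system.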

> - why do we read()/write() at all? is there not a faster way of copying
>   the file? i'm asking that because i don't actually know.

Uh, we could use buffered I/O, I guess, but again, would there be a
benefit?

> I'm trying to add more parallelism by copying individual segments
> of a relfilenode in different processes. Does anyone see a big
> problem in trying to do that? I'm asking because no one did it before,
> that could not be a good sign.

I think we were assuming the copy would be I/O bound and that
parallelism wouldn't help in a single tablespace.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



