Re: pg_upgrade parallelism
From | Jaime Casanova |
---|---|
Subject | Re: pg_upgrade parallelism |
Date | |
Msg-id | Yd5eOwB6uBQdA11T@ahch-to |
In reply to | Re: pg_upgrade parallelism (Jacob Champion <pchampion@vmware.com>) |
List | pgsql-hackers |
On Wed, Nov 17, 2021 at 08:04:41PM +0000, Jacob Champion wrote:
> On Wed, 2021-11-17 at 14:44 -0500, Jaime Casanova wrote:
> > I'm trying to add more parallelism by copying individual segments
> > of a relfilenode in different processes. Does anyone see a big
> > problem in trying to do that? I'm asking because no one did it before,
> > that could not be a good sign.
>
> I looked into speeding this up a while back, too. For the use case I
> was looking at -- Greenplum, which has huge numbers of relfilenodes --
> spinning disk I/O was absolutely the bottleneck and that is typically
> not easily parallelizable. (In fact I felt at the time that Andres'
> work on async I/O might be a better way forward, at least for some
> filesystems.)
>
> But you mentioned that you were seeing disks that weren't saturated, so
> maybe some CPU optimization is still valuable? I am a little skeptical
> that more parallelism is the way to do that, but numbers trump my
> skepticism.
>

Sorry for being unresponsive for so long. I did add a new --jobs-per-disk
option. It is a simple patch I made for the customer, and I ignored all
the WIN32 parts because I don't know anything about that side. I wanted
to complete that part, but it has been in the same state for two months
now.

AFAIU, there is a different struct for the parameters of the function
that will be called on the thread. I also decided to create a new
reap_*_child() function to use with the new parameter.

Now, the customer went from copying 25 TB in 6 hours to 4h 45min, which
is an improvement of about 20%!

> > - why we read()/write() at all? is not a faster way of copying the file?
> > i'm asking that because i don't actually know.
>
> I have idly wondered if something based on splice() would be faster,
> but I haven't actually tried it.
>

I tried it and got no better results.

> But there is now support for copy-on-write with the clone mode, isn't
> there? Or are you not able to take advantage of it?
>

Sadly, that's not possible because those are different disks. And yes, I
know that's something pg_upgrade normally doesn't allow, but it is not
difficult to make it happen.

--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
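
For readers following the thread, the idea of copying the segments of one relfilenode in separate processes, with a cap on concurrent jobs, can be sketched roughly as below. This is not the patch described in this message and not pg_upgrade code: it is a minimal POSIX-only Python illustration, and the names `segment_paths`, `copy_file`, `reap`, `copy_segments_parallel` and the `jobs` parameter are hypothetical. It assumes segments are named `base`, `base.1`, `base.2`, and so on, and it uses a plain read()/write() loop like the one questioned above.

```python
import os

BUF_SIZE = 1 << 20  # 1 MiB copy buffer; an arbitrary choice for this sketch


def segment_paths(base):
    """Yield base, then base.1, base.2, ... for as long as the files exist."""
    if not os.path.exists(base):
        return
    yield base
    segno = 1
    while os.path.exists(f"{base}.{segno}"):
        yield f"{base}.{segno}"
        segno += 1


def copy_file(src, dst):
    """Copy one file with a plain read()/write() loop."""
    fd_in = os.open(src, os.O_RDONLY)
    try:
        fd_out = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            while True:
                buf = os.read(fd_in, BUF_SIZE)
                if not buf:
                    break  # end of file
                written = 0
                while written < len(buf):  # write() may be partial
                    written += os.write(fd_out, buf[written:])
        finally:
            os.close(fd_out)
    finally:
        os.close(fd_in)


def reap(pid):
    """Wait for one child process; return 1 if it failed, 0 otherwise."""
    _, status = os.waitpid(pid, 0)
    ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
    return 0 if ok else 1


def copy_segments_parallel(src_base, dst_dir, jobs):
    """Copy every segment of src_base into dst_dir using at most `jobs`
    (>= 1) child processes at a time. Returns the number of failed copies."""
    children = []  # PIDs of running children, oldest first
    failed = 0
    for src in segment_paths(src_base):
        if len(children) >= jobs:
            # All slots are busy: wait for the oldest child to finish.
            failed += reap(children.pop(0))
        dst = os.path.join(dst_dir, os.path.basename(src))
        pid = os.fork()
        if pid == 0:
            # Child: copy exactly one segment, then exit without
            # running the parent's cleanup handlers.
            status = 1
            try:
                copy_file(src, dst)
                status = 0
            finally:
                os._exit(status)
        children.append(pid)
    while children:
        failed += reap(children.pop(0))
    return failed
```

A call such as `copy_segments_parallel("/old/base/5/16384", "/new/base/5", jobs=4)` (hypothetical paths) would copy up to four segments at once. As the thread points out, whether this helps depends on the disks not already being saturated.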