Re: Parallel copy
From | Tomas Vondra |
---|---|
Subject | Re: Parallel copy |
Date | |
Msg-id | 20201003004959.73ot57oeikhtuq4u@development |
In reply to | Re: Parallel copy (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: Parallel copy
Re: Parallel copy |
List | pgsql-hackers |
Hello Vignesh,

I've done some basic benchmarking on the v4 version of the patches (but AFAICS the v5 should perform about the same), and some initial review.

For the benchmarking, I used the lineitem table from TPC-H - for the 75GB data set, this largest table is about 64GB once loaded, with another 54GB in 5 indexes. This is on a server with 32 cores, 64GB of RAM and NVME storage.

The COPY duration with varying number of workers (specified using the parallel COPY option) looks like this:

    workers    duration
    ---------------------
          0        1366
          1        1255
          2         704
          3         526
          4         434
          5         385
          6         347
          7         322
          8         327

So this seems to work pretty well - initially we get almost linear speedup, then it slows down (likely due to contention for locks, I/O etc.). Not bad.

I've only done a quick review, but overall the patch looks in fairly good shape.

1) I don't quite understand why we need INCREMENTPROCESSED and RETURNPROCESSED, considering they just do ++ or return. They just obfuscate the code, I think.

2) I find it somewhat strange that BeginParallelCopy can just decide not to do parallel copy after all. Why not make this decision in the caller? Or maybe it's fine this way, not sure.

3) AFAIK we don't modify typedefs.list in patches, so these changes should be removed.

4) IsTriggerFunctionParallelSafe actually checks all triggers, not just one, so the comment needs minor rewording.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
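For readers who want the scaling in the benchmark table as ratios, here is a minimal sketch that computes the speedup of each run relative to the non-parallel run (workers = 0). The durations are copied from the table in the message, in whatever unit it reports; the function names `speedups` and `best_worker_count` are hypothetical and are not part of the patch.

```python
# Durations copied from the benchmark table in the message
# (unit as reported there; the message does not state it).
durations = {0: 1366, 1: 1255, 2: 704, 3: 526, 4: 434,
             5: 385, 6: 347, 7: 322, 8: 327}


def speedups(durations, baseline_workers=0):
    """Return {workers: baseline_duration / duration} for every run."""
    base = durations[baseline_workers]
    return {w: base / d for w, d in durations.items()}


def best_worker_count(durations):
    """Return the worker count with the shortest measured duration."""
    return min(durations, key=durations.get)


if __name__ == "__main__":
    s = speedups(durations)
    print(f"{'workers':>7} {'duration':>9} {'speedup':>8}")
    for w in sorted(durations):
        print(f"{w:>7} {durations[w]:>9} {s[w]:>8.2f}")
    print("fastest run:", best_worker_count(durations), "workers")
```

On these numbers the speedup over the non-parallel run is roughly 1.9x at 2 workers and peaks at roughly 4.2x at 7 workers, with 8 workers slightly slower than 7.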