Re: parallel data loading for pgbench -i
| From | Mircea Cadariu |
|---|---|
| Subject | Re: parallel data loading for pgbench -i |
| Date | |
| Msg-id | 10868918-cdf9-49dc-99af-8e8ccd6e368c@gmail.com |
| In reply to | Re: parallel data loading for pgbench -i (lakshmi <lakshmigcdac@gmail.com>) |
| List | pgsql-hackers |
Hi Lakshmi, Hayato,

Thanks a lot for your input!

I'm not sure why the VACUUM phase takes longer compared to the serial run. We can potentially get a clue with a profiler. I know there is an ongoing effort to introduce parallel heap vacuum [1], which I expect will help with this.

The code comments you have provided have been applied to the attached v2 patch. Below I provide answers to the questions.

> Also, why is -j accepted in case of non-partitions?

For non-partitioned tables, each worker loads a separate range of rows via its own connection, in parallel.

> Copying seems to be divided into chunks per COPY_BATCH_SIZE. Is it really
> essential to parallelize the initialization? I feel it may optimize even
> serialized case thus can be discussed independently.

You're right that the COPY batching is an optimization that's independent. I wanted to see how fast I could get this patch, so I looked for bottlenecks in the new code with a profiler, and this was one of them. I agree it makes sense to apply it to the serialized case separately.

> Per my understanding, each thread creates its tables, and all of them are
> attached to the parent table. Is it right? I think it needs more code
> changes, and I am not sure it is critical to make initialization faster.

Yes, that's correct. Each worker creates its assigned partitions as standalone tables, loads data into them, and then the main thread attaches them all to the parent after loading completes. This avoids AccessExclusiveLock contention on the parent table during parallel loading and allows each worker to use COPY FREEZE on its standalone table.

> So I suggest using the incremental approach. The first patch only parallelizes
> the data load, and the second patch implements the CREATE TABLE and ALTER TABLE
> ATTACH PARTITION. You can benchmark three patterns, master, 0001, and
> 0001 + 0002, then compare the results. IIUC, this is the common approach to
> reduce the patch size and make them more reviewable.
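To make the row-range idea above concrete, here is a minimal sketch (in Python, for readability; the actual patch is C inside pgbench, and its exact split may differ) of dividing the total row count into contiguous, non-overlapping per-worker ranges. It assumes the standard pgbench figure of 100,000 pgbench_accounts rows per scale unit; the function name is hypothetical.

```python
# Hypothetical illustration of splitting pgbench_accounts rows among
# -j workers, so each worker COPYs a disjoint, contiguous row range.
# Not the patch's actual C code.

ROWS_PER_SCALE = 100_000  # pgbench_accounts rows per unit of scale factor


def worker_ranges(scale: int, n_workers: int):
    """Return a list of 1-based inclusive (start, end) ranges, one per worker.

    Rows are divided as evenly as possible: when the total does not divide
    evenly, the first `total % n_workers` workers each get one extra row.
    """
    total = scale * ROWS_PER_SCALE
    base, extra = divmod(total, n_workers)
    ranges = []
    start = 1
    for w in range(n_workers):
        count = base + (1 if w < extra else 0)
        ranges.append((start, start + count - 1))
        start += count
    return ranges


if __name__ == "__main__":
    # Mirrors the benchmark invocation: -s 100 -j 10
    for w, (lo, hi) in enumerate(worker_ranges(100, 10)):
        print(f"worker {w}: rows {lo}..{hi}")
```

With -s 100 and -j 10 each worker gets exactly 1,000,000 rows; for uneven splits the ranges still cover every row exactly once with no overlap.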
Thanks for the recommendation; I extracted 0001 and 0002 as per your suggestion. I will see if I can split it further, as that indeed helps with review. Results are similar to the previous runs.

master

pgbench -i -s 100 -j 10
done in 20.95 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 14.51 s, vacuum 0.27 s, primary keys 6.16 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 29.73 s (drop tables 0.00 s, create tables 0.02 s, client-side generate 16.33 s, vacuum 8.72 s, primary keys 4.67 s).

0001

pgbench -i -s 100 -j 10
done in 18.75 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 6.51 s, vacuum 5.73 s, primary keys 6.50 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 29.33 s (drop tables 0.00 s, create tables 0.02 s, client-side generate 16.48 s, vacuum 7.59 s, primary keys 5.24 s).

0002

pgbench -i -s 100 -j 10
done in 18.12 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 6.64 s, vacuum 5.81 s, primary keys 5.65 s).

pgbench -i -s 100 -j 10 --partitions=10
done in 14.38 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 7.97 s, vacuum 1.55 s, primary keys 4.85 s).

Looking forward to your feedback.

[1]: https://www.postgresql.org/message-id/CAD21AoAEfCNv-GgaDheDJ%2Bs-p_Lv1H24AiJeNoPGCmZNSwL1YA%40mail.gmail.com

--
Thanks,
Mircea Cadariu
Attachments