parallel data loading for pgbench -i
| From | Mircea Cadariu |
|---|---|
| Subject | parallel data loading for pgbench -i |
| Date | |
| Msg-id | cb014f00-66b2-4328-a65e-d11c681c9f45@gmail.com |
| List | pgsql-hackers |
Hi,

I propose a patch that speeds up `pgbench -i` through multithreading. To enable it, pass -j followed by the number of workers you want to use. Here are some results I got on my laptop:

master
---
-i -s 100
done in 20.95 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 14.51 s, vacuum 0.27 s, primary keys 6.16 s).

-i -s 100 --partitions=10
done in 29.73 s (drop tables 0.00 s, create tables 0.02 s, client-side generate 16.33 s, vacuum 8.72 s, primary keys 4.67 s).

patch (-j 10)
---
-i -s 100 -j 10
done in 18.64 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 5.82 s, vacuum 6.89 s, primary keys 5.93 s).

-i -s 100 -j 10 --partitions=10
done in 14.66 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 8.42 s, vacuum 1.55 s, primary keys 4.68 s).

The speedup is more significant for the partitioned case. Because each worker creates its own partitions, every worker can use COPY FREEZE and so incurs a much lower vacuum penalty. For the non-partitioned case the speedup is smaller, but I observe that it improves somewhat at larger scale factors. Once parallel vacuum support is merged, this should reduce the time further.

I still need to update the docs and tests, better integrate the code with its surroundings, and address other details. I would appreciate any feedback on what I have so far, though.

Thanks!

Kind regards,
Mircea Cadariu
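To make the parallel generation step concrete, here is a minimal sketch (my own illustration, not code from the patch) of how the pgbench_accounts rows could be divided into contiguous, disjoint chunks, one per worker, so that each worker's COPY covers a distinct row range. The function name and splitting policy are assumptions for illustration only:

```python
# Illustrative sketch: split the pgbench_accounts row space across workers.
# NACCOUNTS mirrors pgbench's 100,000 accounts per unit of scale factor.
NACCOUNTS = 100_000

def split_rows(scale: int, workers: int) -> list[tuple[int, int]]:
    """Return per-worker (first_row, last_row) ranges, 1-based inclusive.

    Remainder rows go to the earlier workers, so range sizes differ by
    at most one row and the ranges together cover every row exactly once.
    """
    total = scale * NACCOUNTS
    base, rem = divmod(total, workers)
    ranges = []
    start = 1
    for w in range(workers):
        count = base + (1 if w < rem else 0)
        ranges.append((start, start + count - 1))
        start += count
    return ranges

# Example: scale 100 split across 10 workers -> ten 1,000,000-row chunks.
print(split_rows(100, 10)[0])   # (1, 1000000)
print(split_rows(100, 10)[-1])  # (9000001, 10000000)
```

With a split like this, each worker can generate and COPY its own range independently; in the partitioned case, aligning the ranges with partition boundaries is what lets every worker load a freshly created partition and therefore qualify for COPY FREEZE.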
Attachments