Re: Use COPY for populating all pgbench tables

From David Rowley
Subject Re: Use COPY for populating all pgbench tables
Date
Msg-id CAApHDvqP4k0r+HPcbG_TNDUSa2be+813anX4v_KEVR6SyzZi3A@mail.gmail.com
In response to Re: Use COPY for populating all pgbench tables  ("Tristan Partin" <tristan@neon.tech>)
Responses Re: Use COPY for populating all pgbench tables  (Hannu Krosing <hannuk@google.com>)
Re: Use COPY for populating all pgbench tables  ("Tristan Partin" <tristan@neon.tech>)
Re: Use COPY for populating all pgbench tables  ("Tristan Partin" <tristan@neon.tech>)
List pgsql-hackers
On Thu, 8 Jun 2023 at 07:16, Tristan Partin <tristan@neon.tech> wrote:
>
> master:
>
> 50000000 of 50000000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s))
> vacuuming...
> creating primary keys...
> done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side generate 1280.43 s, vacuum 2.55 s, primary keys 130.25 s).
>
> patchset:
>
> 50000000 of 50000000 tuples (100%) of pgbench_accounts done (elapsed 243.82 s, remaining 0.00 s))
> vacuuming...
> creating primary keys...
> done in 375.66 s (drop tables 0.14 s, create tables 0.73 s, client-side generate 246.27 s, vacuum 2.77 s, primary keys 125.75 s).

I've also previously found pgbench -i to be slow.  It was a while ago,
and IIRC, it was due to the printfPQExpBuffer() being a bottleneck
inside pgbench.

On seeing your email, it makes me wonder if PG16's hex integer
literals might help here.  These should be much faster to generate in
pgbench and also parse on the postgres side.
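
To illustrate why hex should be cheaper to generate, here is a minimal sketch. It is not code from pgbench or from any patch in this thread, and format_hex_u64() is a hypothetical helper name. It assumes the server accepts "0x"-prefixed integer input, which needs PG16 or later. Each digit costs one mask and one shift, where decimal needs a division and a modulo per digit, and no format string has to be parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pgbench): write v into buf as a
 * "0x"-prefixed, lowercase, NUL-terminated hex literal and return its
 * length excluding the NUL.  buf must have room for at least 19 bytes
 * ("0x" + 16 digits + NUL).
 */
static size_t
format_hex_u64(char *buf, uint64_t v)
{
	static const char digits[] = "0123456789abcdef";
	char		tmp[16];
	size_t		n = 0;
	size_t		len = 0;

	assert(buf != NULL);

	/* Collect digits least-significant first: one mask and shift each. */
	do
	{
		tmp[n++] = digits[v & 0xF];
		v >>= 4;
	} while (v != 0);

	buf[len++] = '0';
	buf[len++] = 'x';

	/* Emit the digits in most-significant-first order. */
	while (n > 0)
		buf[len++] = tmp[--n];

	buf[len] = '\0';
	return len;
}
```

For example, format_hex_u64(buf, 100000000) writes "0x5f5e100" and returns 9. Whether this saves anything measurable end to end depends on where the time actually goes.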

I wrote a quick and dirty patch to try that, but I'm not getting the
performance increase I'd have expected. I also tested with your patch,
and it does not look that impressive either when running pgbench on the
same machine as postgres.

pgbench copy speedup

** master
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) done (elapsed 74.15 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 95.71 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 74.45 s, vacuum 0.12 s, primary keys 21.13 s).

** David's patch
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) done (elapsed 69.64 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 90.22 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 69.91 s, vacuum 0.12 s, primary keys 20.18 s).

** Tristan's patch
drowley@amd3990x:~$ pgbench -i -s 1000 postgres
100000000 of 100000000 tuples (100%) of pgbench_accounts done (elapsed
77.44 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 98.64 s (drop tables 0.00 s, create tables 0.01 s, client-side
generate 77.47 s, vacuum 0.12 s, primary keys 21.04 s).

I'm interested to see what numbers you get.  You'd need to test on
PG16 however. I left the old code in place to generate the decimal
numbers for versions < 16.

David
