Re: pgbench - implement strict TPC-B benchmark
| From | Fabien COELHO |
|---|---|
| Subject | Re: pgbench - implement strict TPC-B benchmark |
| Date | |
| Msg-id | alpine.DEB.2.21.1908052208280.26206@lancre |
| In reply to | Re: pgbench - implement strict TPC-B benchmark (Andres Freund <andres@anarazel.de>) |
| List | pgsql-hackers |
Hello Andres,
>> Which is a (somewhat disappointing) 3.3× speedup. The impact on the 3
>> complex-expression tests is not measurable, though.
>
> I don't know why that could be disappointing. We put in much more work
> for much smaller gains in other places.
Probably, but I expected a better payoff from eliminating most of the
string handling from variables.
>> Questions:
>> - how likely is such a patch to pass? (IMHO not likely)
>
> I don't see why? I didn't review the patch in any detail, but it didn't
> look crazy in a quick skim? Increasing how much load can be simulated
> using pgbench is something I personally find much more interesting than
> adding capabilities that very few people will ever use.
Yep, but my point is that the bottleneck is mostly libpq/system, as I
tried to demonstrate with the few experiments I reported.
> FWIW, the areas I find current pgbench "most lacking" during development
> work are:
>
> 1) Data load speed. The data creation is bottlenecked on fprintf in a
> single process.
It is snprintf, actually, and it could be replaced.
I submitted a patch to add more control over initialization, including a
server-side loading feature ('G'), where the client sends no data and the
server generates its own:
https://commitfest.postgresql.org/24/2086/
However, on my laptop it is slower than client-side loading over a local
socket. The client version does around 70 MB/s with the client at 20-30%
load and postgres at 85%, and I'm not sure I can hope for much more from my
SSD. On my laptop the bottleneck is postgres/disk, not fprintf.
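For the curious, the intended usage would look something like the sketch
below. The lowercase letters are the existing --init-steps (drop, tables,
generate, vacuum, primary keys); 'G' is the server-side generation step the
patch would add, so the exact letter is per the commitfest entry and may
change:

```shell
# Current client-side initialization (the default steps): drop, create
# tables, generate data in the client, vacuum, build primary keys.
pgbench -i -I dtgvp -s 100 bench

# With the patch, 'G' would ask the server to generate its own data,
# so no rows travel over the connection.
pgbench -i -I dtGvp -s 100 bench
```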
> The index builds are done serially. The vacuum could be replaced by COPY
> FREEZE.
Well, it could be added?
> For a lot of meaningful tests one needs 10-1000s of GB of testdata -
> creating that is pretty painful.
Yep.
> 2) Lack of proper initialization integration for custom
> scripts.
Hmmm…
You can always write a psql script for schema and possibly simplistic data
initialization?
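For instance, a minimal psql init script along those lines (the table name
and row count are made up for illustration; generate_series keeps the data
generation on the server side):

```sql
-- init.sql: run as "psql -f init.sql bench"
CREATE TABLE IF NOT EXISTS items(id int PRIMARY KEY, payload text);

-- Simplistic fill: the server generates the rows itself,
-- so nothing but the query crosses the connection.
INSERT INTO items
  SELECT i, 'row ' || i
  FROM generate_series(1, 100000) AS i;

VACUUM ANALYZE items;
```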
However, generating meaningful pseudo-random data for an arbitrary schema
is a pain. I wrote an external tool for that a few years ago:
http://www.coelho.net/datafiller.html
but it is still a pain.
> I.e. have steps that are in the custom script that allow -i, vacuum, etc
> to be part of the script, rather than separately executable steps.
> --init-steps doesn't do anything for that.
Sure. It just gives some control.
> 3) pgbench overhead, although that's to a significant degree libpq's fault
I'm afraid that is currently the case.
> 4) Ability to cancel pgbench and get approximate results. That currently
> works if the server kicks out the clients, but not when interrupting
> pgbench - which is just plain weird. Obviously that doesn't matter
> for "proper" benchmark runs, but often during development, it's
> enough to run pgbench past some events (say the next checkpoint).
Do you mean producing a report anyway on Ctrl-C?
I usually run with -P 1 to see the progress, but making Ctrl-C work should
be reasonably easy.
>> - what is its impact to overall performance when actual queries
>> are performed (IMHO very small).
>
> Obviously not huge - I'd also not expect it to be unobservably small
> either.
Hmmm… Indeed, the 20-\set script runs at 2.6 M scripts/s, i.e. 0.019 µs per
\set, whereas any round trip over the connection costs at least 15 µs (for
one client on a local socket).
--
Fabien.