Re: Gather performance analysis

From Robert Haas
Subject Re: Gather performance analysis
Date
Msg-id CA+Tgmob9KxnHnX8bciGpf2mMDsNtwqkwJ+UYtpO5Z_f=d_dfog@mail.gmail.com
In response to Re: Gather performance analysis  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Gather performance analysis  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: Gather performance analysis  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Sep 23, 2021 at 4:00 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> I did find some suspicious behavior on the bigger box I have available
> (with 2x xeon e5-2620v3), see the attached spreadsheet. But it seems
> pretty weird because the worst affected case is with no parallel workers
> (so the queue changes should affect it). Not sure how to explain it, but
> the behavior seems consistent.

That is pretty odd. I'm inclined to mostly discount the runs with
10000 tuples because sending such a tiny number of tuples doesn't
really take any significant amount of time, and it seems possible that
variations in the runtime of other code due to code movement effects
could end up mattering more than the changes to the performance of
shm_mq. However, the runs with a million tuples are probably
statistically significant ... and I guess maybe what's happening is
that the patch hurts when the tuples are too big relative to the
queue size.

I guess your columns are an md5 value each, which is 32 bytes +
overhead, so a 20-column tuple is ~1kB. Since Dilip's patch flushes
the value to shared memory when more than a quarter of the queue has
been filled, that probably means we flush every 4-5 tuples. I wonder
if that means we need a smaller threshold, like 1/8 of the queue size?
Or maybe the behavior should be adaptive somehow, depending on whether
the receiver ends up waiting for data? Or ... perhaps only small
tuples are worth batching, so that the threshold for posting to shared
memory should be a constant rather than a fraction of the queue size?
I guess we need to know why we see the time spike up in those cases,
if we want to improve them.
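
To make the arithmetic behind those options concrete, here is a rough
sketch, not code from the patch. The function name, the 16kB queue size,
and the 4kB constant are all hypothetical values picked for illustration;
the only thing taken from the discussion above is the rule "flush once
more than the threshold is pending".

```python
def tuples_per_flush(tuple_bytes: int, threshold_bytes: int) -> int:
    """Number of tuples buffered before pending bytes first exceed the threshold.

    Hypothetical helper: models "flush when more than threshold_bytes
    are pending", with every tuple the same size.
    """
    if tuple_bytes <= 0:
        raise ValueError("tuple_bytes must be positive")
    # After n tuples, n * tuple_bytes bytes are pending; the first n with
    # n * tuple_bytes > threshold_bytes is threshold_bytes // tuple_bytes + 1.
    return threshold_bytes // tuple_bytes + 1


QUEUE_BYTES = 16 * 1024      # assumed queue size, for illustration only
CONSTANT_THRESHOLD = 4096    # assumed fixed threshold, for illustration only

for tuple_bytes in (64, 1024):
    quarter = tuples_per_flush(tuple_bytes, QUEUE_BYTES // 4)
    eighth = tuples_per_flush(tuple_bytes, QUEUE_BYTES // 8)
    constant = tuples_per_flush(tuple_bytes, CONSTANT_THRESHOLD)
    print(f"{tuple_bytes:5d}-byte tuples: 1/4 queue -> {quarter}, "
          f"1/8 queue -> {eighth}, constant -> {constant}")
```

Under these assumed numbers a ~1kB tuple gets very little batching from
a fractional threshold, while small tuples get a lot. A constant
threshold gives the same batch size regardless of how the queue is
sized, which is the difference between the first and third options.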

--
Robert Haas
EDB: http://www.enterprisedb.com


