Re: Gather performance analysis

From	Robert Haas
Subject	Re: Gather performance analysis
Date	September 23, 2021, 23:31:09
Msg-id	CA+Tgmob9KxnHnX8bciGpf2mMDsNtwqkwJ+UYtpO5Z_f=d_dfog@mail.gmail.com
In response to	Re: Gather performance analysis (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: Gather performance analysis Re: Gather performance analysis
List	pgsql-hackers

On Thu, Sep 23, 2021 at 4:00 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> I did find some suspicious behavior on the bigger box I have available
> (with 2x xeon e5-2620v3), see the attached spreadsheet. But it seems
> pretty weird because the worst affected case is with no parallel workers
> (so the queue changes should affect it). Not sure how to explain it, but
> the behavior seems consistent.

That is pretty odd. I'm inclined to mostly discount the runs with
10000 tuples because sending such a tiny number of tuples doesn't
really take any significant amount of time, and it seems possible that
variations in the runtime of other code due to code movement effects
could end up mattering more than the changes to the performance of
shm_mq. However, the results with a million tuples seem like they're
probably delivering statistically significant results ... and I guess
maybe what's happening is that the patch hurts when the tuples are too
big relative to the queue size.

I guess your columns are an md5 value each, which is 32 bytes +
overhead, so a 20-column tuple is ~1kB. Since Dilip's patch flushes
the value to shared memory when more than a quarter of the queue has
been filled, that probably means we flush every 4-5 tuples. I wonder
if that means we need a smaller threshold, like 1/8 of the queue size?
Or maybe the behavior should be adaptive somehow, depending on whether
the receiver ends up waiting for data? Or ... perhaps only small
tuples are worth batching, so that the threshold for posting to shared
memory should be a constant rather than a fraction of the queue size?
I guess we need to know why we see the time spike up in those cases,
if we want to improve them.
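
To make the arithmetic above concrete, here is a rough sketch of how many
tuples would accumulate before a flush under the policies mentioned: a
fraction of the queue size (1/4 or 1/8) versus a constant byte threshold.
The function names, the 16 kB queue size, and the 2 kB constant are all
assumptions made for illustration; they are not taken from the patch or
from the benchmark setup. The 16 kB figure was picked only because, with
~1kB tuples, it gives a per-flush count in the 4-5 range.

```python
def fraction_threshold(queue_bytes: int, denominator: int) -> int:
    """Flush threshold defined as 1/denominator of the queue size (hypothetical helper)."""
    return queue_bytes // denominator


def tuples_per_flush(threshold_bytes: int, tuple_bytes: int) -> int:
    """Smallest number of equal-sized tuples whose total size exceeds the threshold."""
    return threshold_bytes // tuple_bytes + 1


QUEUE_BYTES = 16 * 1024          # assumed queue size, for illustration only
CONSTANT_THRESHOLD = 2 * 1024    # assumed fixed threshold, for illustration only

for tuple_bytes in (64, 256, 1024):
    quarter = tuples_per_flush(fraction_threshold(QUEUE_BYTES, 4), tuple_bytes)
    eighth = tuples_per_flush(fraction_threshold(QUEUE_BYTES, 8), tuple_bytes)
    constant = tuples_per_flush(CONSTANT_THRESHOLD, tuple_bytes)
    print(f"tuple={tuple_bytes:5d}B  1/4 queue: {quarter:3d}  "
          f"1/8 queue: {eighth:3d}  constant 2kB: {constant:3d}")
```

Under these assumed numbers, small tuples are batched by the dozen under
any of the policies, while ~1kB tuples get flushed after only a handful.
That is the regime where batching may not buy much, but the sketch says
nothing about why the runtime actually spikes there.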

--
Robert Haas
EDB: http://www.enterprisedb.com
