Re: Hybrid Hash/Nested Loop joins and caching results from subplans


From	David Rowley
Subject	Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date	19 August 2020, 23:56:20
Msg-id	CAApHDvpd9bdsiH5CZSiEANUoHshOEkLJ92npbWKG7sT0CLSKCw@mail.gmail.com
In reply to	Re: Hybrid Hash/Nested Loop joins and caching results from subplans (Alvaro Herrera <alvherre@2ndquadrant.com>)
List	pgsql-hackers


On Thu, 20 Aug 2020 at 10:58, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On the performance aspect, I wonder what the overhead is, particularly
> considering Tom's point of making these nodes more expensive for cases
> with no caching.

It's likely small. I've not written any code, only thought about it,
and I think the check would be something like
if (node->tuplecache != NULL).  I imagine that in simple cases the
branch predictor would learn the common outcome fairly quickly and,
once learned, predict with 100% accuracy. But it's possible that some
other branch shares the same slot in the branch predictor and causes
conflicting predictions. The size of the branch predictor cache is
limited, of course.  Certainly introducing new branches that
mispredict and cause a pipeline stall during execution would be a very
bad thing for performance.  I'm unsure what would happen if there are,
say, 2 Nested Loops, one with caching = on and one with caching = off,
where the number of tuples between the two is highly variable.  I'm
not sure a branch predictor would handle that well, given that both
nodes execute the same branch at the same address but need different
predictions. However, if the predictor were to hash in the stack
pointer too, then that might not be a problem. Perhaps someone with a
better understanding of modern branch predictors can share their
insight there.
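
To make the shape of that check concrete, here is a rough sketch. It
is not patch code: TupleCache, NestLoopSketch, scan_inner() and
fetch_inner() are hypothetical stand-ins, not the real executor
structs, and the "cache" holds a single entry purely to keep the
example short. The point is only that a node with caching off pays for
one pointer test per call and nothing else.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's executor structs. */
typedef struct TupleCache
{
    int         key;            /* parameter value the entry was built for */
    int         value;          /* cached inner-side result */
    int         valid;          /* nonzero once the entry has been filled */
} TupleCache;

typedef struct NestLoopSketch
{
    TupleCache *tuplecache;     /* NULL when caching is off for this node */
    long        inner_scans;    /* times the inner side was really executed */
} NestLoopSketch;

/* Stand-in for rescanning the inner side of the join. */
static int
scan_inner(NestLoopSketch *node, int key)
{
    node->inner_scans++;
    return key * 2;
}

/* Fetch the inner result for 'key', using the cache if the node has one. */
static int
fetch_inner(NestLoopSketch *node, int key)
{
    assert(node != NULL);

    /* The only extra work for a node with caching off is this test. */
    if (node->tuplecache != NULL)
    {
        TupleCache *cache = node->tuplecache;

        if (cache->valid && cache->key == key)
            return cache->value;        /* hit: inner side is skipped */

        cache->key = key;
        cache->value = scan_inner(node, key);
        cache->valid = 1;
        return cache->value;
    }

    return scan_inner(node, key);       /* caching off: always rescan */
}
```

Both kinds of node go through the same if statement in fetch_inner(),
so the branch sits at a single address even though one node always
takes it and the other never does. That is the case I'm unsure about
above.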

> And also, as the JIT saga continues, aren't we going
> to get plan trees recompiled too, at which point it won't matter much?

I was thinking batch execution would be our solution to the node
overhead problem.  We'll get there one day... we just need to finish
with the infinite other optimisations there are to do first.

David
