Re: Hybrid Hash/Nested Loop joins and caching results from subplans

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Дата
Msg-id CAApHDvorfu97V1G=+WvwzdWwO7XjFERqHg55umqF6nvh2kZJQQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Andy Fan <zhihui.fan1213@gmail.com>)
Ответы Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (Andy Fan <zhihui.fan1213@gmail.com>)
Список pgsql-hackers
On Tue, 2 Jun 2020 at 21:05, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> Today I tested the correctness & performance of this patch based on TPC-H
> workload, the environment is setup based on [1]. Correctness is tested by
> storing the result into another table when this feature is not introduced and
> then enable this feature and comparing the result with the original ones. No
> issue is found at this stage.

Thank you for testing it out.

> I also checked the performance gain for TPC-H workload, totally 4 out of the 22
> queries uses this new path, 3 of them are subplan, 1 of them is nestloop. All of
> changes gets a better result. You can check the attachments for reference.
> normal.log is the data without this feature, patched.log is the data with the
> feature. The data doesn't show the 10x performance gain, I think that's mainly
> data size related.

Thanks for running those tests.  I had a quick look at the results and
I think to say that all 4 are better is not quite right. One is
actually a tiny bit slower and one is only faster due to a plan
change.  Here's my full analysis.

Q2 uses a result cache for the subplan and has about a 37.5% hit ratio
which reduces the execution time of the query down to 67% of the
original.
Q17 uses a result cache for the subplan and has about a 96.5% hit
ratio which reduces the execution time of the query down to 24% of the
original time.
Q18 uses a result cache for 2 x nested loop joins and has a 0% hit
ratio.  The execution time is reduced to 91% of the original time only
because the planner uses a different plan, which just happens to be
faster by chance.
Q20 uses a result cache for the subplan and has a 0% hit ratio.  The
execution time is 100.27% of the original time. There are 8620 cache
misses.
All other queries use the same plan with and without the patch.

> At the code level,  I mainly checked nestloop path and cost_resultcache_rescan,
> everything looks good to me. I'd like to check the other parts in the following days.

Great.

> [1] https://ankane.org/tpc-h



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Default gucs for EXPLAIN
Следующее
От: Vik Fearing
Дата:
Сообщение: Re: Default gucs for EXPLAIN