Re: Eager aggregation, take 3
От | Matheus Alcantara |
---|---|
Тема | Re: Eager aggregation, take 3 |
Дата | |
Msg-id | DD8YF2IZ2M1C.1C0WAMOSIW59K@gmail.com обсуждение исходный текст |
Ответ на | Re: Eager aggregation, take 3 (Richard Guo <guofenglinux@gmail.com>) |
Список | pgsql-hackers |
On Fri Oct 3, 2025 at 12:14 AM -03, Richard Guo wrote: >> The only query that I see a considerable regression is query 23 which I >> get a 23% worst execution time. I'm attaching the EXPLAIN(ANALYZE) >> output from master and from the patched version if it's interesting. > > I tested query 23 in my local environment but didn't observe the > regression. > > -- on master > Planning Time: 1.950 ms > Execution Time: 3260.924 ms > > -- on patched > Planning Time: 2.197 ms > Execution Time: 3237.287 ms > > I ran the benchmark at scale factor 1 and executed ANALYZE beforehand. > For the build configuration, I disabled cassert. > I've disabled the cassert and executed the ANALYZE again before benchmarking and now I have similar results with a improvement on eager aggregate version: -- master Planning Time: 2.734 ms Execution Time: 5238.128 ms -- patched Planning Time: 2.578 ms Execution Time: 4732.584 ms > Comparing the plans, I noticed one key difference: in the plan you > provided (query-23.patch.explain), the frequent_ss_items CTE uses > parallel aggregation, whereas in my local environment it does not. > This leads to a different final join order between the two plans. > > However, given the highly inaccurate size and cost estimates for the > CTE Scan nodes, I'm not sure it's worth investigating further. I'm > starting to feel that trying to tune performance here, with such > inaccurate underlying estimates for CTEs, is like building on sand. > > [ ...] > >> I'm just wondering if there is anything that can be done on the planner >> to prevent this type of situation? > > I think the ideal solution is to improve our estimates for CTE > relations to make the plans for TPC-DS queries more reasonable. Of > course, for queries from other benchmarks, the issues may stem from > other plan nodes. IMHO, we really need some improvements in our cost > estimation. > Fair points, agree. The performance results look good to me. I don't have to much comments about the code although I'm still learning about the planner internals this patch seems in good shape to me. I'm just attaching a new csv with the last results after running with cassert disabled and after executing ANALYZE. It looks good to me. Thanks for working on this! -- Matheus Alcantara
Вложения
В списке pgsql-hackers по дате отправления: