Re: Eager aggregation, take 3
От | Robert Haas |
---|---|
Тема | Re: Eager aggregation, take 3 |
Дата | |
Msg-id | CA+TgmobT156YuP5HUscbR_17nQs=KC703T=SB3VkQMcFbK4Gqw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Eager aggregation, take 3 (Richard Guo <guofenglinux@gmail.com>) |
Список | pgsql-hackers |
On Tue, Sep 9, 2025 at 6:30 AM Richard Guo <guofenglinux@gmail.com> wrote: > > I think it would be worth considering generating the partially grouped > > relations in a second pass. Right now, as you progress from the bottom > > of the join tree towards the top, you created grouped rels as you go. > > But you could equally well finish planning everything up to the > > scan/join target first and then go back and add grouped_rels to > > relations where it seems worthwhile. > > Hmm, I don't think so. I think the presence of eager aggregation > could change the best join order. For example, without eager > aggregation, the optimizer might find that (A JOIN B) JOIN C the best > join order. But with eager aggregation on B, the optimizer could > prefer A JOIN (AGG(B) JOIN C). I'm not sure how we could find the > best join order with eager aggregation applied without building the > join tree from the bottom up. Oh, that is a problem, yes. :-( > > I haven't done a detailed comparison of generate_grouped_paths() to > > other parts of the code, but I have an uncomfortable feeling that it > > might be rather similar to some existing code that probably already > > exists in multiple, slightly-different versions. Is there any > > refactoring we could do here? > > Yeah, we currently have several functions that do similar, but not > exactly the same, things. Maybe some refactoring is possible -- maybe > not -- I haven't looked into it closely yet. However, I'd prefer to > address that in a separate patch if possible, since this issue also > exists on master, and I want to avoid introducing such changes in this > already large patch. Well, it's not just a matter of "this already exists" -- it gets harder and harder to unify things the more near-copies you add. > Good point. I do have manually tested GEQO by setting geqo_threshold > to 2 and running the regression tests to check for any planning > errors, crashes, or incorrect results. However, I'm not sure where > test cases for GEQO should be added. I searched the regression tests > and found only one explicit GEQO test, added back in 2009 (commit > a43b190e3). It's not quite clear to me what the current policy is for > adding GEQO test cases. > > Anyway, I will add some test cases in eager_aggregate.sql with > geqo_threshold set to 2. Sounds good. I think GEQO is mostly-unmaintained these days, but if we're updating the code, I think it is good to add tests. Being that the code is so old, it probably lacks adequate test coverage. -- Robert Haas EDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: