Обсуждение: Relids instead of Bitmapset * in plannode.h
Hi, For some reason plannode.h has declared variable to hold RTIs as Bitmapset * instead of Relids like other places. Here's patch to fix it. This is superficial change as Relids is typedefed to Bitmapset *. Build succeeds for me and also make check passes. -- Best Wishes, Ashutosh Bapat
Вложения
Hello, On 2023-Oct-31, Ashutosh Bapat wrote: > For some reason plannode.h has declared variable to hold RTIs as > Bitmapset * instead of Relids like other places. Here's patch to fix > it. This is superficial change as Relids is typedefed to Bitmapset *. > Build succeeds for me and also make check passes. I think the reason for having done it this way, was mostly to avoid including pathnodes.h in plannodes.h. Did you explore what the consequences are? Starting here: https://doxygen.postgresql.org/plannodes_8h.html While looking at it, I noticed that tcopprot.h includes both plannodes.h and parsenodes.h, but there's no reason to include the latter (or at least headerscheck doesn't complain after removing it), so I propose to remove it, per 0001 here. There's a couple of files that need to be repaired for this change. windowfuncs.c is a no-brainer. However, having to edit bootstrap.h is a bit surprising -- I think before dac048f71ebb ("Build all Flex files standalone") this inclusion wasn't necessary, because the .y file already includes parsenodes.h; but after that commit, bootparse.h needs parsenodes.h to declare YYSTYPE, per comments in bootscanner.l. Anyway, this seems a good change. I also noticed while looking that I messed up in commit 7103ebb7aae8 ("Add support for MERGE SQL command") on this point, because I added #include parsenodes.h to plannodes.h. This is because MergeAction, which is in parsenodes.h, is also needed by some executor code. But the real way to fix that is to define that struct in primnodes.h. 0002 does that. (I'm forced to also move enum OverridingKind there, which is a bit annoying.) 0003 here is your patch, which I include because of conflicts with my 0002. After my 0002, plannodes.h is pretty slim, so I'd be hesitant to include pathnodes.h just to be able to change the Bitmapset * to Relids. But on the other hand, it doesn't seem to have too bad an effect overall (if only because plannodes.h is included by rather few files), so +0.1 on doing this. I would be more at ease if we didn't have to include parsenodes.h in pathnodes.h, though, but that looks more difficult to achieve. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Вложения
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2023-Oct-31, Ashutosh Bapat wrote: >> For some reason plannode.h has declared variable to hold RTIs as >> Bitmapset * instead of Relids like other places. Here's patch to fix >> it. This is superficial change as Relids is typedefed to Bitmapset *. >> Build succeeds for me and also make check passes. > I think the reason for having done it this way, was mostly to avoid > including pathnodes.h in plannodes.h. Yes, I'm pretty sure that's exactly the reason, and I'm strongly against the initially-proposed patch. The include footprint of pathnodes.h would be greatly expanded, for no real benefit. Among other things, that fuzzes the distinction between planner modules and non-planner modules. > While looking at it, I noticed that tcopprot.h includes both plannodes.h > and parsenodes.h, but there's no reason to include the latter (or at > least headerscheck doesn't complain after removing it), so I propose to > remove it, per 0001 here. 0001 is ok, except check #include alphabetization. > I also noticed while looking that I messed up in commit 7103ebb7aae8 > ("Add support for MERGE SQL command") on this point, because I added > #include parsenodes.h to plannodes.h. This is because MergeAction, > which is in parsenodes.h, is also needed by some executor code. But the > real way to fix that is to define that struct in primnodes.h. 0002 does > that. (I'm forced to also move enum OverridingKind there, which is a > bit annoying.) This seems OK. It seems to me that parsenodes.h has been required by plannodes.h for a long time, but if we can decouple them, all the better. > 0003 here is your patch, which I include because of conflicts with my > 0002. Still don't like it. > ... I would be more at ease if we didn't have to include > parsenodes.h in pathnodes.h, though, but that looks more difficult to > achieve. Yeah, that dependency has been there a long time too. I'm not too fussed by dependencies on parsenodes.h, because anything involved with either planning or execution will certainly be looking at query trees too. But I don't want to add dependencies that tie planning and execution together. regards, tom lane
On Tue, Nov 7, 2023 at 8:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > On 2023-Oct-31, Ashutosh Bapat wrote: > >> For some reason plannode.h has declared variable to hold RTIs as > >> Bitmapset * instead of Relids like other places. Here's patch to fix > >> it. This is superficial change as Relids is typedefed to Bitmapset *. > >> Build succeeds for me and also make check passes. > > > I think the reason for having done it this way, was mostly to avoid > > including pathnodes.h in plannodes.h. > > Yes, I'm pretty sure that's exactly the reason, and I'm strongly > against the initially-proposed patch. The include footprint of > pathnodes.h would be greatly expanded, for no real benefit. > Among other things, that fuzzes the distinction between planner > modules and non-planner modules. As I mentioned in [1] the Bitmapset implementation is not space efficient to be used as Relids when there are thousands of partitions. I was assessing all usages of Bitmapset to find if there are other places where this is an issue. That's when I noticed this. At some point in future (possibly quite near) when queries will involved thousands of relations (partitions or otherwise) we will need to implement Relids in more space efficient way. Having all Relids usages of Bitmapset labelled as Relids will help us then. If we don't want to add pathnodes.h to plannodes.h there will be more work to identify Relids usage. That shouldn't be a couple of days work, so it's ok. Other possibilities are 1. Define Relids in bitmapset.h itself and use Relids everywhere Bitmapset * is really Relids. Wherever Relids is used bitmapset.h must have been included one or other other way. That's a bigger churn. 2. Replace RTIs with Relids in the comments and add the following comment somewhere near the #include section. "The Relids members in various structures in this file have been declared as Bitmapset * to avoid including pathnodes.h in this file. This include has greatly expanded footprint for no real benefit.". 3. Do nothing right now. If and when we implement Relids as a separate datastructure, it will get its own module. We will be able to place it somewhere properly. I have no additional comments on other patches. [1] https://www.postgresql.org/message-id/CAExHW5s4EqY43oB%3Dne6B2%3D-xLgrs9ZGeTr1NXwkGFt2j-OmaQQ%40mail.gmail.com -- Best Wishes, Ashutosh Bapat
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes: > On Tue, Nov 7, 2023 at 8:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yes, I'm pretty sure that's exactly the reason, and I'm strongly >> against the initially-proposed patch. The include footprint of >> pathnodes.h would be greatly expanded, for no real benefit. > As I mentioned in [1] the Bitmapset implementation is not space > efficient to be used as Relids when there are thousands of partitions. TBH, I'd be very strongly against "optimizing" that case by adopting a data structure that is less efficient for typical rangetable sizes. I continue to think that anybody who is using that many partitions is Doing It Wrong and has no excuse for thinking it'll be free. Moreover, the size of their relid sets is pretty unlikely to be their worst pain point. In any case, that is a poor argument for weakening the separation between planner and executor. When and if somebody comes up with a credible replacement for bitmapsets here, we can consider what we want to do in terms of header-file organization --- but I do not think including pathnodes.h into executor files will be it. regards, tom lane