Обсуждение: Relids instead of Bitmapset * in plannode.h

Поиск
Список
Период
Сортировка

Relids instead of Bitmapset * in plannode.h

От
Ashutosh Bapat
Дата:
Hi,
For some reason plannode.h has declared variable to hold RTIs as
Bitmapset * instead of Relids like other places. Here's patch to fix
it. This is superficial change as Relids is typedefed to Bitmapset *.
Build succeeds for me and also make check passes.

-- 
Best Wishes,
Ashutosh Bapat

Вложения

Re: Relids instead of Bitmapset * in plannode.h

От
Alvaro Herrera
Дата:
Hello,

On 2023-Oct-31, Ashutosh Bapat wrote:

> For some reason plannode.h has declared variable to hold RTIs as
> Bitmapset * instead of Relids like other places. Here's patch to fix
> it. This is superficial change as Relids is typedefed to Bitmapset *.
> Build succeeds for me and also make check passes.

I think the reason for having done it this way, was mostly to avoid
including pathnodes.h in plannodes.h.  Did you explore what the
consequences are?  Starting here:
https://doxygen.postgresql.org/plannodes_8h.html

While looking at it, I noticed that tcopprot.h includes both plannodes.h
and parsenodes.h, but there's no reason to include the latter (or at
least headerscheck doesn't complain after removing it), so I propose to
remove it, per 0001 here.  There's a couple of files that need to be
repaired for this change.  windowfuncs.c is a no-brainer.  However,
having to edit bootstrap.h is a bit surprising -- I think before
dac048f71ebb ("Build all Flex files standalone") this inclusion wasn't
necessary, because the .y file already includes parsenodes.h; but after
that commit, bootparse.h needs parsenodes.h to declare YYSTYPE, per
comments in bootscanner.l.  Anyway, this seems a good change.

I also noticed while looking that I messed up in commit 7103ebb7aae8
("Add support for MERGE SQL command") on this point, because I added
#include parsenodes.h to plannodes.h.  This is because MergeAction,
which is in parsenodes.h, is also needed by some executor code.  But the
real way to fix that is to define that struct in primnodes.h.  0002 does
that.  (I'm forced to also move enum OverridingKind there, which is a
bit annoying.)

0003 here is your patch, which I include because of conflicts with my
0002.  After my 0002, plannodes.h is pretty slim, so I'd be hesitant to
include pathnodes.h just to be able to change the Bitmapset * to Relids.
But on the other hand, it doesn't seem to have too bad an effect overall
(if only because plannodes.h is included by rather few files), so +0.1
on doing this.  I would be more at ease if we didn't have to include
parsenodes.h in pathnodes.h, though, but that looks more difficult to
achieve.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

Вложения

Re: Relids instead of Bitmapset * in plannode.h

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2023-Oct-31, Ashutosh Bapat wrote:
>> For some reason plannode.h has declared variable to hold RTIs as
>> Bitmapset * instead of Relids like other places. Here's patch to fix
>> it. This is superficial change as Relids is typedefed to Bitmapset *.
>> Build succeeds for me and also make check passes.

> I think the reason for having done it this way, was mostly to avoid
> including pathnodes.h in plannodes.h.

Yes, I'm pretty sure that's exactly the reason, and I'm strongly
against the initially-proposed patch.  The include footprint of
pathnodes.h would be greatly expanded, for no real benefit.
Among other things, that fuzzes the distinction between planner
modules and non-planner modules.

> While looking at it, I noticed that tcopprot.h includes both plannodes.h
> and parsenodes.h, but there's no reason to include the latter (or at
> least headerscheck doesn't complain after removing it), so I propose to
> remove it, per 0001 here.

0001 is ok, except check #include alphabetization.

> I also noticed while looking that I messed up in commit 7103ebb7aae8
> ("Add support for MERGE SQL command") on this point, because I added
> #include parsenodes.h to plannodes.h.  This is because MergeAction,
> which is in parsenodes.h, is also needed by some executor code.  But the
> real way to fix that is to define that struct in primnodes.h.  0002 does
> that.  (I'm forced to also move enum OverridingKind there, which is a
> bit annoying.)

This seems OK.  It seems to me that parsenodes.h has been required
by plannodes.h for a long time, but if we can decouple them, all
the better.

> 0003 here is your patch, which I include because of conflicts with my
> 0002.

Still don't like it.

> ... I would be more at ease if we didn't have to include
> parsenodes.h in pathnodes.h, though, but that looks more difficult to
> achieve.

Yeah, that dependency has been there a long time too.  I'm not too
fussed by dependencies on parsenodes.h, because anything involved
with either planning or execution will certainly be looking at
query trees too.  But I don't want to add dependencies that tie
planning and execution together.

            regards, tom lane



Re: Relids instead of Bitmapset * in plannode.h

От
Ashutosh Bapat
Дата:
On Tue, Nov 7, 2023 at 8:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2023-Oct-31, Ashutosh Bapat wrote:
> >> For some reason plannode.h has declared variable to hold RTIs as
> >> Bitmapset * instead of Relids like other places. Here's patch to fix
> >> it. This is superficial change as Relids is typedefed to Bitmapset *.
> >> Build succeeds for me and also make check passes.
>
> > I think the reason for having done it this way, was mostly to avoid
> > including pathnodes.h in plannodes.h.
>
> Yes, I'm pretty sure that's exactly the reason, and I'm strongly
> against the initially-proposed patch.  The include footprint of
> pathnodes.h would be greatly expanded, for no real benefit.
> Among other things, that fuzzes the distinction between planner
> modules and non-planner modules.

As I mentioned in [1] the Bitmapset implementation is not space
efficient to be used as Relids when there are thousands of partitions.
I was assessing all usages of Bitmapset to find if there are other
places where this is an issue. That's when I noticed this. At some
point in future (possibly quite near) when queries will involved
thousands of relations (partitions or otherwise) we will need to
implement Relids in more space efficient way. Having all Relids usages
of Bitmapset labelled as Relids will help us then. If we don't want to
add pathnodes.h to plannodes.h there will be more work to identify
Relids usage. That shouldn't be a couple of days work, so it's ok.

Other possibilities are
1. Define Relids in bitmapset.h itself and use Relids everywhere
Bitmapset * is really Relids. Wherever Relids is used bitmapset.h must
have been included one or other other way. That's a bigger churn.

2. Replace RTIs with Relids in the comments and add the following
comment somewhere near the #include section. "The Relids members in
various structures in this file have been declared as Bitmapset * to
avoid including pathnodes.h in this file. This include has greatly
expanded footprint for no real benefit.".

3. Do nothing right now. If and when we implement Relids as a separate
datastructure, it will get its own module. We will be able to place it
somewhere properly.

I have no additional comments on other patches.

[1] https://www.postgresql.org/message-id/CAExHW5s4EqY43oB%3Dne6B2%3D-xLgrs9ZGeTr1NXwkGFt2j-OmaQQ%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: Relids instead of Bitmapset * in plannode.h

От
Tom Lane
Дата:
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
> On Tue, Nov 7, 2023 at 8:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yes, I'm pretty sure that's exactly the reason, and I'm strongly
>> against the initially-proposed patch.  The include footprint of
>> pathnodes.h would be greatly expanded, for no real benefit.

> As I mentioned in [1] the Bitmapset implementation is not space
> efficient to be used as Relids when there are thousands of partitions.

TBH, I'd be very strongly against "optimizing" that case by adopting a
data structure that is less efficient for typical rangetable sizes.
I continue to think that anybody who is using that many partitions
is Doing It Wrong and has no excuse for thinking it'll be free.
Moreover, the size of their relid sets is pretty unlikely to be
their worst pain point.

In any case, that is a poor argument for weakening the separation
between planner and executor.  When and if somebody comes up with
a credible replacement for bitmapsets here, we can consider what
we want to do in terms of header-file organization --- but I do
not think including pathnodes.h into executor files will be it.

            regards, tom lane