Discussion: pg_plan_advice


pg_plan_advice

From:
Robert Haas
Date:
As I have mentioned on previous threads, for the past while I have
been working on planner extensibility. I've posted some extensibility
patches previously, and got a few of them committed in
September/October with Tom's help, but I think the time has come to
post a patch which actually makes use of that infrastructure, as well
as some further infrastructure that I'm also including in this
posting.[1] The final patch in this series adds a new contrib module
called pg_plan_advice. Very briefly, what pg_plan_advice knows how to
do is process a plan and emit a (potentially long) text string in a
special-purpose mini-language that describes a bunch of key planning
decisions, such as the join order, selected join methods, types of
scans used to access individual tables, and where and how
partitionwise join and parallelism were used. You can then set
pg_plan_advice.advice to that string to get a future attempt to plan
the same query to reproduce those decisions, or (maybe a better idea)
you can trim that string down to constrain some decisions (e.g. the
join order) but not others (e.g. the join methods), or (if you want to
make your life more exciting) you can edit that advice string and
thereby attempt to coerce the planner into planning the query the way
you think best. There is a README that explains the design philosophy
and thinking in a lot more detail, which is a good place to start if
you're curious, and I implore you to read it if you're interested, and
*especially* if you're thinking of flaming me.

But that doesn't mean that you *shouldn't* flame me. There are a
remarkable number of things that someone could legitimately be unhappy
about in this patch set. First, any form of user control over the
planner tends to be a lightning rod for criticism around here. I've
come to believe that's the wrong way of thinking about it: we can want
to improve the planner over the long term and *also* want to have
tools available to work around problems with it in the short term.
Further, we should not imagine that we're going to solve problems that
have stumped other successful database projects any time in the
foreseeable future; no product will ever get 100% of cases right, and
you don't need to get to very obscure cases before other products
throw up their hands just as we do. But, second, even if you're OK
with the idea of some kind of user control over the planner, you could
very well be of the opinion that what I've implemented here is utter
crap. I've certainly had to make a ton of very opinionated decisions
to get to this point, and you are entitled to hate them. Of course, I
did have *reasons* for making the decisions, so if your operating
theory as to why I did something is that I'm a stupid moron, perhaps
consider an alternative explanation or two as well. Finally, even if
you're OK with the concept and feel that I've made some basically
reasonable design decisions, you might notice that the code is full of
bugs, needs a lot of cleanup, is missing features, lacks
documentation, and a bunch of other stuff. In that judgement, you
would be absolutely correct. I'm not posting it here because I'm
hoping to get it committed in November -- or at least, not THIS
November. What I would like to do is get some design feedback on
the preliminary patches, which I think will be more possible if
reviewers also have the main pg_plan_advice patch to look at as a way
of understanding why they exist, and also some feedback on the
pg_plan_advice patch itself.

Now I do want to caveat the statement that I am looking for feedback
just a little bit. I imagine that there will be some people reading
this who are already imagining how great life will be when they put
this into production, and begin complaining about either (1) features
that it's missing or (2) things that they don't like about the design
of the advice mini-language. What I'd ask you to keep in mind is that
you will not be able to put this into production unless and until
something gets committed, and getting this committed is probably going
to be super-hard even if you don't creep the scope, so maybe don't do
that, especially if you haven't read the README yet to understand what
the scope is actually intended to be. The details of the advice
mini-language are certainly open to negotiation; of everything, that
would be one of the easier things to change. However, keep in mind
that there are probably LOTS AND LOTS of people who all have their own
opinions about what decisions I should have made when designing that
mini-language, and an outcome where you personally get everything you
want and everyone who disagrees is out of luck is unlikely. In other
words, constructive suggestions for improvement are welcome, but
please think twice before turning this into a bikeshedding nightmare.
Now is the time to talk about whether I've got the overall design
somewhat correct, more so than whether I've spelled everything the way
you happen to prefer.[2]

I want to mention that, beyond the fact that I'm sure some people will
want to use something like this (with more features and a lot fewer
bugs) in production, it seems to be super-useful for testing. We have
a lot of regression test cases that try to coerce the planner to do a
particular thing by manipulating enable_* GUCs, and I've spent a lot
of time trying to do similar things by hand, either for regression
test coverage or just private testing. This facility, even with all of
the bugs and limitations that it currently has, is exponentially more
powerful than frobbing enable_* GUCs. Once you get the hang of the
advice mini-language, you can very quickly experiment with all sorts
of plan shapes in ways that are currently very hard to do, and thereby
find out how expensive the planner thinks those things are and which
ones it thinks are even legal. So I see this as not only something
that people might find useful in production deployments, but also
something that can potentially be really useful to advance PostgreSQL
development.
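To make that concrete, here is a sketch of the workflow (the advice
keywords follow the examples that appear downthread; the table names
are hypothetical):

```sql
-- Sketch: force a plan shape, then read off what the planner thinks
-- it costs. Advice keywords as used elsewhere in this thread; the
-- tables t1/t3 are illustrative, not from the patch's test suite.
SET pg_plan_advice.advice = 'HASH_JOIN(c) SEQ_SCAN(c)';
EXPLAIN (PLAN_ADVICE) SELECT * FROM t1 a JOIN t3 c ON a.id = c.id;
RESET pg_plan_advice.advice;
```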

Which brings me to the question of where this code ought to go if it
goes anywhere at all. I decided to propose pg_plan_advice as a contrib
module rather than a part of core because I had to make a WHOLE lot of
opinionated design decisions just to get to the point of having
something that I could post and hopefully get feedback on. I figured
that all of those opinionated decisions would be a bit less
unpalatable if they were mostly encapsulated in a contrib module, with
the potential for some future patch author to write a different
contrib module that adopted different solutions to all of those
problems. But what I've also come to realize is that there's so much
infrastructure here that leaving the next person to reinvent it may
not be all that appealing. Query jumbling is a previous case where we
initially thought that different people might want to do different
things, but eventually realized that most people really just wanted
some solution that they didn't have to think too hard about. Likewise,
in this patch, the relation identifier system described in the README
is the only thing of its kind, to my knowledge, and any system that
wants to accomplish something similar to what pg_plan_advice does
would need a system like that. pg_hint_plan doesn't have something
like that, because pg_hint_plan is just trying to do hints. This is
trying to do round-trip-safe plan stability, where the system will
tell you how to refer unambiguously to a certain part of the query in
a way that will work correctly on every single query regardless of how
it's structured or how many times it refers to the same tables or to
different tables using the same aliases. If we say that we're never
going to put any of that infrastructure in core, then anyone who wants
to write a module to control the planner is going to need to start by
either (a) reinventing something similar, (b) cloning all the relevant
code, or (c) just giving up on the idea of unambiguous references to
parts of a query. None of those seem like great options, so now I'm
less sure whether contrib is actually the right place for this code,
but that's where I have put it for now. Feedback welcome, on this and
everything else.

Perhaps more than any other patch I've ever written, I know I'm
playing with fire here just by putting this out on the list, but I'm
nevertheless hopeful that something good can come of it, and I hope we
can have a constructive discussion about what that thing should be. I
think there unquestionably is a lot of demand for the ability to
influence the planner in some form, but there is a lot of room for
debate about what exactly that should mean in practice. While I
personally am pretty happy with the direction of the code I've
written, modulo the large amount of not-yet-completed bug fixing and
cleanup, there's certainly plenty of room for other people to feel
differently, and finding out what other people think is, of course,
the whole point of posting things publicly before committing them --
or in this case, before even finishing them.[3] If you're interested
in contributing to the conversation, I urge you to start with the
following things: (1) the README in the final patch; (2) the
regression test examples in the final patch, which give a good sense
of what it actually looks like to use this; and (3) the earlier
patches, which show the minimum amount of core infrastructure that I
think we need in order to make something like this workable (ideas on
how to further reduce that footprint are very welcome).

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

[1] All of the earlier patches have been posted previously in some
form, but the commit messages have been rewritten for clarity, and the
"Allow for plugin control over path generation strategies" patch has
been heavily rewritten since it was last posted; the earlier versions
turned out to have substantial inadequacies.

[2] This is not to say that proposals to modify or improve the syntax
are unwelcome, but the bigger obstacle to getting something committed
here is probably reaching some agreement on the internal details. Any
changes to src/include/optimizer or src/backend/optimizer need careful
scrutiny from a design perspective. Also, keep in mind that the syntax
needs to fit what we can actually do: a proposal to change the syntax
to something that implies semantics we can't implement is a dead
letter.

[3] Note, however, that a proposal to achieve the same or similar
goals by different means is more welcome than a proposal that I should
have done some other project entirely. I've already put a lot of work
into these goals and hope to achieve them, at least to some degree,
before I start working toward something else.

Attachments

Re: pg_plan_advice

From:
Jakub Wartak
Date:
On Thu, Oct 30, 2025 at 3:00 PM Robert Haas <robertmhaas@gmail.com> wrote:

[..over 400kB of attachments, yay]

Thank you for working on this!

My gcc-13 was nitpicking a little bit (see
compilation_warnings_v1.txt), so attached is just a tiny diff to fix
some of those issues. After that, clang-20 run was clean too.

> First, any form of user control over the planner tends to be a lightning rod for criticism around here.

I do not know where this is coming from, but everybody I've talked to
was saying this is needed to handle real enterprise databases and
applications. I just really love it, how one could precisely adjust
the plan with this even with the presence of heavy aliasing:

postgres=# explain (plan_advice, costs off) SELECT * FROM (select *
from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and
b.id = c.id;
                     QUERY PLAN
-----------------------------------------------------
 Merge Join
   Merge Cond: (a.id = c.id)
   ->  Merge Join
         Merge Cond: (a.id = b.id)
         ->  Index Scan using t1_pkey on t1 a
         ->  Index Scan using t2_pkey on t2 b
   ->  Sort
         Sort Key: c.id
         ->  Seq Scan on t3 c
 Supplied Plan Advice:
   SEQ_SCAN(ble5) /* not matched */
 Generated Plan Advice:
   JOIN_ORDER(a#2 b#2 c)
   MERGE_JOIN_PLAIN(b#2 c)
   SEQ_SCAN(c)
   INDEX_SCAN(a#2 public.t1_pkey )
   NO_GATHER(c a#2 b#2)
(17 rows)

postgres=# set pg_plan_advice.advice = 'SEQ_SCAN(b#2)';
SET
postgres=# explain (plan_advice, costs off) SELECT * FROM (select *
from t1 a join t2 b using (id)) a, t2 b, t3 c WHERE a.id = b.id and
b.id = c.id;
                     QUERY PLAN
----------------------------------------------------
 Hash Join
   Hash Cond: (b.id = a.id)
   ->  Seq Scan on t2 b
   ->  Hash
         ->  Merge Join
               Merge Cond: (a.id = c.id)
               ->  Index Scan using t1_pkey on t1 a
               ->  Sort
                     Sort Key: c.id
                     ->  Seq Scan on t3 c
 Supplied Plan Advice:
   SEQ_SCAN(b#2) /* matched */
 Generated Plan Advice:
   JOIN_ORDER(b#2 (a#2 c))
   MERGE_JOIN_PLAIN(c)
   HASH_JOIN(c)
   SEQ_SCAN(b#2 c)
   INDEX_SCAN(a#2 public.t1_pkey)
   NO_GATHER(c a#2 b#2)
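
For anyone who wants to try this locally, a schema along these lines
should be enough to reproduce the session (reconstructed from the plan
output above, not taken from the original test setup):

```sql
-- Reconstructed, not the original test data: the plans above imply
-- primary keys on t1 and t2, and no usable index on t3.
CREATE TABLE t1 (id int PRIMARY KEY);
CREATE TABLE t2 (id int PRIMARY KEY);
CREATE TABLE t3 (id int);
INSERT INTO t1 SELECT generate_series(1, 100000);
INSERT INTO t2 SELECT generate_series(1, 100000);
INSERT INTO t3 SELECT generate_series(1, 100000);
ANALYZE;
```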

To attract a little attention to the $thread, the only bigger design
(usability) question that keeps ringing in my head is how we are going
to bind it to specific queries without even issuing any SETs (or ALTER
USER) in the far future, in the grand scheme of things. The discussed
query id (hash), full query text comparison, maybe even strstr(query,
"partial hit") or regex all seem kind of too limited in terms of what
crazy ORMs can come up with (each query will be potentially slightly
different, but if optimizer reference points are stable that should
nail it well enough; just enabling it for a very specific set of
queries and not the others [with the same aliases] is some major
challenge).

Due to this, at some point I was even thinking about some hashes for
every plan node (including hashes of subplans), e.g.:
 Merge Join // hash(MERGE_JOIN_PLAIN(b#2) + ';' somehashval1 + ';'+
somehahsval2 ) => somehashval3
   Merge Cond: (a.id = c.id)
   ->  Merge Join
         Merge Cond: (a.id = b.id)
         ->  Index Scan using t1_pkey on t1 a // hash(INDEX_SCAN(a#2
public.t1_pkey)) => somehashval1
         ->  Index Scan using t2_pkey on t2 b // hash(INDEX_SCAN(b#2
public.t2_pkey)) => somehashval2

and then having a way to use `somehashval3` (let's say it's SHA1) as a
way to activate the necessary advice. Something like having a way to
express it using plan_advice.on_subplanhashes_plan_advice =
'somehashval3: SEQ_SCAN(b#2)'. This would have the benefit of being
able to override multiple similar SQL queries in one go rather than
collecting all possible query_ids; it's probably stupid and
heavyweight, but that would be my dream ;)
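
A rough sketch of that bottom-up hashing idea (the function name and
the combining rule are made up for illustration; nothing like this
exists in the patch):

```python
import hashlib

def subplan_hash(advice_atom, child_hashes):
    """Hash a plan node's advice atom together with its children's
    hashes, bottom-up, so that structurally identical subplans
    produce the same hash."""
    payload = ";".join([advice_atom, *child_hashes])
    return hashlib.sha1(payload.encode()).hexdigest()

# Mirroring the annotated plan above:
h1 = subplan_hash("INDEX_SCAN(a#2 public.t1_pkey)", [])
h2 = subplan_hash("INDEX_SCAN(b#2 public.t2_pkey)", [])
h3 = subplan_hash("MERGE_JOIN_PLAIN(b#2)", [h1, h2])
```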

-J.

Attachments

Re: pg_plan_advice

From:
Robert Haas
Date:
On Fri, Oct 31, 2025 at 5:59 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> > First, any form of user control over the planner tends to be a lightning rod for criticism around here.
>
> I do not know where this is coming from, but everybody I've talked to
> was saying this is needed to handle real enterprise databases and
> applications. I just really love it, how one could precisely adjust
> the plan with this even with the presence of heavy aliasing:

Thanks for the kind words.

I'll respond to the points about compiler warnings later.

> To attract a little attention to the $thread, the only bigger design
> (usability) question that keeps ringing in my head is how we are going
> to bind it to specific queries without even issuing any SETs(or ALTER
> USER) in the far future in the grand scheme of things. The discussed
> query id (hash), full query text comparison, maybe even strstr(query ,
> "partial hit") or regex all seem to be kind too limited in terms of
> what crazy ORMs can come up with (each query will be potentially
> slightly different, but if optimizer reference points are stable that
> should nail it good enough, but just enabling it for the very specific
> set of queries and not the others [with same aliases] is some major
> challenge).

Yeah, I haven't really dealt with this problem yet.

> Due to this, at some point I was even thinking about some hashes for
> every plan node (including hashes of subplans),
[...]
>
> and then having a way to use `somehashval3` (let's say it's SHA1) as a
> way to activate the necessary advice. Something like having a way to

This doesn't make sense to me, because it seems circular. We can't use
anything in the plan to choose which advice string to use, because the
purpose of the advice string is to influence the choice of plan. In
other words, our choice of what advice string to use has to be based
on the properties of the query, not the plan. We can implement
anything we want to do in terms of exactly how that works: we can use
the query ID, or the query text, or the query node tree.
Hypothetically, we could call out to a user-defined function and pass
the query text or the query node tree as an argument and let it do
whatever it wants to decide on an advice string. The practical problem
here is computational cost -- any computation that gets performed for
every single query is going to have to be pretty cheap to avoid
creating a performance problem. That's why I thought matching on query
ID or exact matching on query text would likely be the most practical
approaches, aside from the obvious alternative of setting and
resetting pg_plan_advice.advice manually. But I haven't really
explored this area too much yet, because I need to get all the basics
working first.
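
Schematically, the selection step would look something like this
(illustrative names only; nothing like this exists in the patch yet):

```python
# Illustrative only: advice keyed by query ID, chosen before planning.
# Nothing from the resulting plan participates in the lookup, which is
# what avoids the circularity described above.
advice_by_query_id = {
    -1234567890123456789: "SEQ_SCAN(b#2)",
}

def advice_for_query(query_id):
    # An O(1) lookup keeps the per-query overhead negligible, which
    # is the main practical constraint mentioned above.
    return advice_by_query_id.get(query_id)
```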

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From:
Alastair Turner
Date:

On Fri, 31 Oct 2025, 12:51 Robert Haas, <robertmhaas@gmail.com> wrote:
On Fri, Oct 31, 2025 at 5:59 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> > First, any form of user control over the planner tends to be a lightning rod for criticism around here.
>
> I do not know where this is coming from, but everybody I've talked to
> was saying this is needed to handle real enterprise databases and
> applications. I just really love it, how one could precisely adjust
> the plan with this even with the presence of heavy aliasing:

I really like the functionality of the current patch as well, even though I am suspicious of user control over the planner. By giving concise, precise control over a plan, this allows people who believe they can out-plan the planner to test their alternative, and possibly fail. 

Whatever other UIs and integrations you build as you develop this toward your goal, please keep what's currently there user accessible. Not only for testing code, but also for testing users' belief that they know better.

Alastair 

Re: pg_plan_advice

From:
Hannu Krosing
Date:
This was recently shared on LinkedIn:
https://www.vldb.org/pvldb/vol18/p5126-bress.pdf

For example, it says that 31% of all queries are metadata queries,
78% have LIMIT, and 20% of queries have 10+ joins, with 0.52%
exceeding 100 joins; 12% of expressions have depths between 11 and 100
levels, some exceeding 100. These deeply nested conditions create
optimization challenges that benchmarks don't capture.

This reinforces my belief that we should either have some kind of
two-level optimization, where most queries are handled quickly but
with something to trigger a more elaborate optimisation and
investigation workflow.

Or alternatively we could just have an extra layer before the query is
sent to the database which deals with unwinding the product of
excessively stupid query generators (usually, but not always, some BI
tools :) )


On Fri, Oct 31, 2025 at 10:18 PM Alastair Turner <minion@decodable.me> wrote:
>
>
> On Fri, 31 Oct 2025, 12:51 Robert Haas, <robertmhaas@gmail.com> wrote:
>>
>> On Fri, Oct 31, 2025 at 5:59 AM Jakub Wartak
>> <jakub.wartak@enterprisedb.com> wrote:
>> > > First, any form of user control over the planner tends to be a lightning rod for criticism around here.
>> >
>> > I do not know where this is coming from, but everybody I've talked to
>> > was saying this is needed to handle real enterprise databases and
>> > applications. I just really love it, how one could precisely adjust
>> > the plan with this even with the presence of heavy aliasing:
>
>
> I really like the functionality of the current patch as well, even
> though I am suspicious of user control over the planner. By giving
> concise, precise control over a plan, this allows people who believe
> they can out-plan the planner to test their alternative, and possibly
> fail.
>
> Whatever other UIs and integrations you build as you develop this
> toward your goal, please keep what's currently there user accessible.
> Not only for testing code, but also for testing users' belief that
> they know better.
>
> Alastair



Re: pg_plan_advice

From:
Robert Haas
Date:
On Fri, Oct 31, 2025 at 5:17 PM Alastair Turner <minion@decodable.me> wrote:
> I really like the functionality of the current patch as well, even
> though I am suspicious of user control over the planner. By giving
> concise, precise control over a plan, this allows people who believe
> they can out-plan the planner to test their alternative, and possibly
> fail.

Indeed. The downside of letting people control anything is that they
may leverage that control to do something bad. However, I think it is
unlikely that very many people would prefer to write an entire query
plan by hand. If you wanted to do that, why would you be using
PostgreSQL in the first place? Furthermore, if somebody does try to do
that, I expect that they will find it frustrating and difficult: the
planner considers a large number of options even for simple queries
and an absolutely vast number of options for more difficult queries,
and a human being trying possibilities one by one is only ever going
to consider a tiny fraction of those possibilities. The ideal
possibility often won't be in that small subset of the search space,
and the user will be wasting their time. If that were the design goal
of this feature, I don't think it would be worth having.

But it isn't. As I say in the README, what I consider the principal
use case is reproducing plans that you know to have worked well in the
past. Sometimes, the planner is correct for a while and then it's
wrong later. We don't need to accept the proposition that users can
out-plan the planner. We only need to accept that they can tell good
plans from bad plans better than the planner. That is a low bar to
clear. The planner never finds out what happens when the plans that it
generates are actually executed, but users do. If they are
sufficiently experienced, they can make reasonable judgements about
whether the plan they're currently getting is one they'd like to
continue getting. Of course, they may make wrong judgements even then,
because they lack knowledge or experience or just make a mistake, but
it's not a farcically unreasonable thing to do. I've basically never
wanted to write my own query plan from scratch, but I've certainly
looked at many plans over the years and judged them to be great, or
terrible, or good for now but risky in the long-term; and I'm probably
not the only human being on the planet capable of making such
judgements with some degree of competence.

> Whatever other UIs and integrations you build as you develop this
> toward your goal, please keep what's currently there user accessible.
> Not only for testing code, but also for testing users' belief that
> they know better.

And this is also a good point. Knowledgeable and experienced users can
look at a plan that the planner generated, feel like it's bad, and
wonder why the planner picked it. You can try to figure that out by,
for example, setting enable_SOMETHING = false and re-running EXPLAIN,
but since there aren't that many such knobs relevant to any given
query, and since changing any of those knobs can affect large swathes
of the query and not just the part you're trying to understand better,
it can actually be really difficult to understand why the planner
thought that something was the best option. Sometimes you can't even
tell whether the planner thinks that the plan you expected to be
chosen is *impossible* or just *more expensive*, which is always one
of the things that I'm keen to find out when something weird is
happening. This can make answering that question a great deal easier.
If some important index is not getting used, you can say "no, really,
I want to see what happens with this query when you plan it with that
index" -- and then it either gives you a plan that does use that
index, and you can see how much more expensive it is and why, or it
still doesn't give you a plan using that index, and you know that the
index is inapplicable to the query or unusable in general for some
reason. You don't necessarily have it as a goal to coerce the planner
in production; your goal may very well be to find out why your belief
that you know better is incorrect.
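
For example (syntax as in the examples upthread; the specific
identifiers here are made up):

```sql
-- "No, really, plan it with that index" -- then either compare the
-- resulting cost against the default plan, or observe that the index
-- still isn't used and conclude it's inapplicable or unusable.
SET pg_plan_advice.advice = 'INDEX_SCAN(a#2 public.t1_pkey)';
EXPLAIN (PLAN_ADVICE, COSTS ON) SELECT * FROM t1 a WHERE a.id < 100;
RESET pg_plan_advice.advice;
```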

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From:
Robert Haas
Date:
On Sat, Nov 1, 2025 at 12:10 PM Hannu Krosing <hannuk@google.com> wrote:
> This reinforces my belief thet we either should have some kind of
> two-level optimization, where most queries are handled quickly but
> with something to trigger a more elaborate optimisation and
> investigation workflow.
>
> Or alternatively we could just have an extra layer before the query is
> sent to the database which deals with unwinding the product of
> excessively stupid query generators (usually, but not always, some BI
> tools :) )

I'd like to keep the focus of this thread on the patches that I'm
proposing, rather than other ideas for improving the planner. I
actually agree with you that at least the first of these things might
be a very good idea, but that would be an entirely separate project
from these patches, and I feel a lot more qualified to do this project
than that one.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From:
John Naylor
Date:
On Thu, Oct 30, 2025 at 9:00 PM Robert Haas <robertmhaas@gmail.com> wrote:
> First, any form of user control over the
> planner tends to be a lightning rod for criticism around here. I've
> come to believe that's the wrong way of thinking about it: we can want
> to improve the planner over the long term and *also* want to have
> tools available to work around problems with it in the short term.

The most frustrating real-world incidents I've had were in the course
of customers planning a major version upgrade, or worse, after
upgrading and finding that a 5 minute query now takes 5 hours. I
mention this to emphasize that workarounds will be needed also to deal
with rare unintended effects that arise from our very attempts to
improve the planner.

> Further, we should not imagine that we're going to solve problems that
> have stumped other successful database projects any time in the
> foreseeable future; no product will ever get 100% of cases right, and
> you don't need to get to very obscure cases before other products
> throw up their hands just as we do.

Right.

> it seems to be super-useful for testing. We have
> a lot of regression test cases that try to coerce the planner to do a
> particular thing by manipulating enable_* GUCs, and I've spent a lot
> of time trying to do similar things by hand, either for regression
> test coverage or just private testing. This facility, even with all of
> the bugs and limitations that it currently has, is exponentially more
> powerful than frobbing enable_* GUCs. Once you get the hang of the
> advice mini-language, you can very quickly experiment with all sorts
> of plan shapes in ways that are currently very hard to do, and thereby
> find out how expensive the planner thinks those things are and which
> ones it thinks are even legal. So I see this as not only something
> that people might find useful for in production deployments, but also
> something that can potentially be really useful to advance PostgreSQL
> development.

That sounds very useful as well.

--
John Naylor
Amazon Web Services



Re: pg_plan_advice

From:
Robert Haas
Date:
On Fri, Oct 31, 2025 at 5:59 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> My gcc-13 was nitpicking a little bit (see
> compilation_warnings_v1.txt), so attached is just a tiny diff to fix
> some of those issues. After that, clang-20 run was clean too.

Here's v2. Change log:

- Attempted to fix the compiler warnings. I didn't add elog() before
pg_unreachable() as you suggested; instead, I added a dummy return
afterwards. Let's see if that works. Also, I decided after reading the
comment for list_truncate() that what I'd done there was not going to
be acceptable, so I rewrote the code slightly. It now copies the list
when adding to it, instead of relying on the ability to use
list_truncate() to recreate the prior tstate.

- Deleted the SQL-callable pg_parse_advice function and related code.
That was useful to me early in development but I don't think anyone
will need it at this point; if you want to test whether an advice
string can be parsed, just try setting pg_plan_advice.advice.

- Fixed a couple of dumb bugs in pgpa_trove.c.

- Added a few more regression test scenarios.

- Fixed a couple of typos/thinkos.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From:
Robert Haas
Date:

Re: pg_plan_advice

From:
"Matheus Alcantara"
Date:
Hi

On Thu Nov 6, 2025 at 1:45 PM -03, Robert Haas wrote:
> Here's v3. I've attempted to fix some more things that cfbot didn't
> like, one of which was an actual bug in 0005, and I also fixed a
> few stupid bugs in pgpa_collector.c and added a few more tests.
>
I've spent some time playing with these patches. I don't have too
many comments on the syntax yet, but I've noticed a small bug -- or
perhaps I'm missing something?

When I run CREATE EXTENSION pg_plan_advice I'm able to use
EXPLAIN (plan_advice), but if I open another connection with the
extension already created, I'm unable to use it until I drop and
re-create the extension.

tpch=# create extension pg_plan_advice;
ERROR:  extension "pg_plan_advice" already exists
tpch=# explain(plan_advice) select 1;
ERROR:  unrecognized EXPLAIN option "plan_advice"
LINE 1: explain(plan_advice) select 1;
                ^
tpch=# drop extension pg_plan_advice ;
DROP EXTENSION
tpch=# create extension pg_plan_advice;
CREATE EXTENSION
tpch=# explain(plan_advice) select 1;
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
 Generated Plan Advice:
   NO_GATHER("*RESULT*")


And thanks for working on this. I think that this can be a very useful
feature for both users and for postgres hackers, +1 for the idea.

--
Matheus Alcantara
EDB: http://www.enterprisedb.com




Re: pg_plan_advice

From:
Robert Haas
Date:
On Mon, Nov 17, 2025 at 9:42 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:
> I've spent some time playing with these patches. I still don't have to
> much comments on the syntax yet but I've noticed a small bug or perhaps
> I'm missing something?

Cool, thanks for looking. I am guessing that the paucity of feedback
thus far is partly because there's a lot of stuff to absorb -- though
the main point at this stage is really to get some opinions on the
planner infrastructure/hooks, which don't necessarily require full
understanding of (never mind agreement with) the design of
pg_plan_advice itself.

> When I run CREATE EXTENSION pg_plan_advice I'm able to use the
> EXPLAIN(plan_advice) but if try to open another connection, with the
> extension already previously created, I'm unable to use once I drop and
> re-create the extension.

This is just an idiosyncrasy of PostgreSQL's extension framework.
Whether or not EXPLAIN (PLAN_ADVICE) works depends on whether the
shared module has been loaded, not whether the extension has been
created. The purpose of CREATE EXTENSION is to put SQL objects, such
as function definitions, into the database, but there's no SQL
required to enable EXPLAIN (PLAN_ADVICE) -- or for setting the
pg_plan_advice.advice GUC. However, running CREATE EXTENSION to
establish the function definitions will incidentally load the shared
module into that particular session.

Therefore, the best way to use this module is to add pg_plan_advice to
shared_preload_libraries. Alternatively, you can use
session_preload_libraries or run LOAD in an individual session. If you
don't care about the collector interface, that's really all you need.
If you do care about the collector interface, then in addition you
will need to run CREATE EXTENSION, so that the SQL functions needed to
access it are available.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
"Dian Fay"
Date:
On Tue Nov 18, 2025 at 11:19 AM EST, Robert Haas wrote:
> Here's v4. This version has some bug fixes and test case changes to
> 0005 and 0006, with the goal of getting CI to pass cleanly (which it
> now does for me, but let's see if cfbot agrees).

Thanks for working on this, Robert! I think the design seems solid (and
very powerful) from a user perspective. I was curious what would happen
with row-level security interactions so I tried it out on a toy example
I put together a while back. I found one case where scan advice fails on
an intentionally naive/bad policy implementation, but I'm not sure why
and it seems like the kind of weird corner case that might be useful to
reason about. See attached for the setup script, then:

set pg_plan_advice.advice = 'BITMAP_HEAP_SCAN(item public.item_tags_idx)';
set item_reader.allowed_tags = '{alpha,beta}';
set role item_reader;

explain (plan_advice, analyze, verbose, costs, timing)
select * from item
where value ilike 'a%' and tags && array[1];

Seq Scan on public.item  (cost=0.00..41777312.00 rows=54961 width=67) (actual time=2.947..8603.333 rows=6762.00
loops=1)
 Disabled: true
 Output: item.id, item.value, item.tags
 Filter: (EXISTS(SubPlan exists_1) AND (item.value ~~* 'a%'::text) AND (item.tags && '{1}'::integer[]))
 Rows Removed by Filter: 993238
 Buffers: shared hit=1012312
 SubPlan exists_1
   ->  Seq Scan on public.tag  (cost=0.00..41.75 rows=1 width=0) (actual time=0.008..0.008 rows=0.21 loops=1000000)
         Filter: ((current_setting('item_reader.allowed_tags'::text) IS NOT NULL) AND
((current_setting('item_reader.allowed_tags'::text))::text[] @> ARRAY[tag.name]) AND (item.tags @> ARRAY[tag.id]))
         Rows Removed by Filter: 18
         Buffers: shared hit=1000000
Planning Time: 1.168 ms
Supplied Plan Advice:
 BITMAP_HEAP_SCAN(item public.item_tags_idx) /* matched, failed */
Generated Plan Advice:
 SEQ_SCAN(item tag@exists_1)
 NO_GATHER(item tag@exists_1)
Execution Time: 8603.615 ms

Since the policies don't contain any execution boundaries, all the quals
should be going into a single bucket for planning if I understand the
process correctly. The bitmap heap scan should be a candidate given the
`tags &&` predicate (and indeed if I switch to a privileged role, the
advice matches successfully without any policies in the mix), but gdb
shows the walker bouncing out of pgpa_walker_contains_scan without any
candidate scans for the BITMAP_HEAP_SCAN strategy.

I do want to avoid getting bikesheddy about the advice language so I'll
forbear from syntax discussion, but one design thought with lower-level
implications did occur to me as I was playing with this: it might be
useful in some situations to influence the planner _away_ from known
worse paths while leaving it room to decide on the best other option. I
think the work you did in path management should make this pretty
straightforward for join and scan strategies, since it looks like you've
basically made the enable_* gucs a runtime-configurable bitmask (which
seems like a perfectly reasonable approach to my "have done some source
diving but not an internals hacker" eyes), and could disable one as
easily as forcing one.

"Don't use this one index" sounds more fiddly to implement, but also
less valuable since in that case you probably already know which other
index it should be using.

Attachments

Re: pg_plan_advice

From
Robert Haas
Date:
On Sat, Nov 22, 2025 at 7:43 PM Dian Fay <di@nmfay.com> wrote:
> Thanks for working on this, Robert!

Thanks for looking at it! I was hoping for a bit more in the way of
responses by now, honestly.

> Since the policies don't contain any execution boundaries, all the quals
> should be going into a single bucket for planning if I understand the
> process correctly. The bitmap heap scan should be a candidate given the
> `tags &&` predicate (and indeed if I switch to a privileged role, the
> advice matches successfully without any policies in the mix), but gdb
> shows the walker bouncing out of pgpa_walker_contains_scan without any
> candidate scans for the BITMAP_HEAP_SCAN strategy.

I can understand why it seems that way, but when I try setting
enable_seqscan=false instead of using pg_plan_advice, I get exactly
the same result. I think this is actually a great example both of why
this is actually a very powerful tool and also why it has the
potential to be really confusing. The power comes from the fact that
you can find out whether the planner thinks that the thing you want to
do is even possible. In this case, that's easy anyway because the
example is simple enough, but sometimes you can't set
enable_seqscan=false or similar because it would change too many other
things in the plan at the same time and you wouldn't be able to
compare. In those situations, this figures to be useful. However, all
this can do is tell you that the answer to the question "is this a
possible plan shape?" is "no". It cannot tell you why, and you may
easily find the result counterintuitive.

And honestly, this is one of the things I'm worried about if we go
forward with this, that we'll get a ton of people who think it doesn't
work because it doesn't force the planner to do things which the
planner rejects on non-cost considerations. We're going to need really
good documentation to explain to people that if you use this to try to
force a plan and you can't, that's not a bug, that's the planner
telling you that that plan shape is not able to be considered for some
reason. That won't keep people from complaining about things that
aren't really bugs, but at least it will mean that there's a link we
can give them to explain why the way they're thinking about it is
incorrect. However, that will just raise the next question of WHY the
planner doesn't think a certain plan can be considered, and honestly,
I've found over the years that I often need to resort to the source
code to answer those kinds of questions. People who are not good at
reading C source code are not going to like that answer very much, but
I still think it's better if they know THAT the planner thinks the
plan shape is impossible even if we can't tell them WHY the planner
thinks that the plan shape is impossible. We probably will want to
document at least some of the common reasons why this happens, to cut
down on getting the same questions over and over again.

In this particular case, I think the problem is that the user-supplied
qual item.tags @> ARRAY[id] is not leakproof and therefore must be
tested after the security qual. There's no way to use a Bitmap Heap
Scan without reversing the order of those tests.
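For anyone wanting to verify this from SQL rather than the source, the leakproof flag is visible in pg_proc; a sketch (arrayoverlap and arraycontains are the functions behind && and @> on arrays):

```sql
-- proleakproof = false means the planner must evaluate these
-- operators only after any security quals, which can rule out
-- plan shapes that would test them earlier, such as a bitmap
-- index scan driven by the user-supplied qual.
SELECT proname, proleakproof
FROM pg_proc
WHERE proname IN ('arrayoverlap', 'arraycontains');
```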

> I do want to avoid getting bikesheddy about the advice language so I'll
> forbear from syntax discussion, but one design thought with lower-level
> implications did occur to me as I was playing with this: it might be
> useful in some situations to influence the planner _away_ from known
> worse paths while leaving it room to decide on the best other option. I
> think the work you did in path management should make this pretty
> straightforward for join and scan strategies, since it looks like you've
> basically made the enable_* gucs a runtime-configurable bitmask (which
> seems like a perfectly reasonable approach to my "have done some source
> diving but not an internals hacker" eyes), and could disable one as
> easily as forcing one.

I mostly agree. Saying not to use a sequential scan on a certain
table, or not to use a particular index, or not to use a particular
join method seem like things that would be potentially useful, and
they would be straightforward generalizations of what the code already
does. For me, that would principally be a way to understand better why
the planner chose what it did. I often wonder what the planner's
second choice would have been, but I don't just want the plan with the
second-cheapest overall cost, because that will be something just
trivially different. I want the cheapest plan that excludes some key
element of the current plan, so I can see a meaningfully different
alternative.
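Until something like a negative tag exists, the closest approximation is a per-query enable_* toggle, which is exactly the blunt instrument described above; a sketch:

```sql
-- Roughly "show me the best plan that avoids hash joins" -- but
-- note this disables the method for the whole query, not for one
-- particular join, which is why a per-node negative tag would be
-- a meaningful improvement.
BEGIN;
SET LOCAL enable_hashjoin = off;
EXPLAIN SELECT ...;  -- the query under investigation
ROLLBACK;
```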

That said, I don't see this being a general thing that would make
sense across all of the tags that pg_plan_advice supports. For
example, NO_JOIN_ORDER() sounds hard to implement and largely useless.

The main reason I haven't done this is that I want to keep the focus
on plan stability, or said differently, on things that can properly
round-trip. You should be able to run a query with EXPLAIN
(PLAN_ADVICE), then set pg_plan_advice.advice to the resulting string,
rerun the query, and get the same plan with all of the advice
successfully matching. Since EXPLAIN (PLAN_ADVICE) would never emit
these proposed negative tags, we'd need to think a little bit harder
about how that stuff should be tested. That's not necessarily a big
deal or anything, but I didn't think it was an essential element of
the initial scope, so I left it out. I'm happy to add it in at some
point, or for someone else to do so, but not until this much is
working well.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
"Dian Fay"
Date:
On Mon Nov 24, 2025 at 11:14 AM EST, Robert Haas wrote:
> On Sat, Nov 22, 2025 at 7:43 PM Dian Fay <di@nmfay.com> wrote:
>> Since the policies don't contain any execution boundaries, all the quals
>> should be going into a single bucket for planning if I understand the
>> process correctly. The bitmap heap scan should be a candidate given the
>> `tags &&` predicate (and indeed if I switch to a privileged role, the
>> advice matches successfully without any policies in the mix), but gdb
>> shows the walker bouncing out of pgpa_walker_contains_scan without any
>> candidate scans for the BITMAP_HEAP_SCAN strategy.
>
> In this particular case, I think the problem is that the user-supplied
> qual item.tags @> ARRAY[id] is not leakproof and therefore must be
> tested after the security qual. There's no way to use a Bitmap Heap
> Scan without reversing the order of those tests.

Right, I keep forgetting the functions underneath those array operators
aren't leakproof. Thanks for digging.

> And honestly, this is one of the things I'm worried about if we go
> forward with this, that we'll get a ton of people who think it doesn't
> work because it doesn't force the planner to do things which the
> planner rejects on non-cost considerations. We're going to need really
> good documentation to explain to people that if you use this to try to
> force a plan and you can't, that's not a bug, that's the planner
> telling you that that plan shape is not able to be considered for some
> reason.

Once we're closer to consensus on pg_plan_advice or something like it
landing, I'm interested in helping out on this end of things!



Re: pg_plan_advice

From
Robert Haas
Date:
On Sat, Nov 29, 2025 at 10:17 PM Dian Fay <di@nmfay.com> wrote:
> Once we're closer to consensus on pg_plan_advice or something like it
> landing, I'm interested in helping out on this end of things!

Thanks!

014f9a831a320666bf2195949f41710f970c54ad removes the need for what was
previously 0004, so here is a new patch series with that dropped, to
avoid confusing cfbot or human reviewers.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From
"Greg Burd"
Date:
On Fri, Dec 5, 2025, at 2:57 PM, Robert Haas wrote:
> 014f9a831a320666bf2195949f41710f970c54ad removes the need for what was
> previously 0004, so here is a new patch series with that dropped, to
> avoid confusing cfbot or human reviewers.
>
> -- 
> Robert Haas
> EDB: http://www.enterprisedb.com

Hey Robert,

Thanks for working on this!  I think the idea has merit, I hope it lands sometime soon.

I've worked on extending PostgreSQL's planner for a JSONB-like document system. The new query shapes frequently caused
mysterious planner issues. Having the ability to:

1. Dump detailed planner decisions for comparison during development
2. Record planner choices from customer databases to reproduce their issues
3. Apply fixed plans to specific queries as a quick customer workaround while addressing root causes in later releases

...would have been invaluable.

Yes, there's danger here ("with great power comes great responsibility"), but I see this as providing more information
to make better decisions when working with the black art of planner logic.

ORM Query Variation Challenge

Jakob's point about "crazy ORMs" is important. ORMs generate queries with minor variations that should ideally match
the same plan advice. I need to study the plan advice matching logic more deeply to understand how it handles query
variations.

This reminded me of Erlang/Elixir's "parse transforms" - compiled code forms an AST that registered transforms can
modify via pattern matching. The concept might be relevant here: pattern-matching portions of query ASTs to apply advice
despite syntactic variations. I'd need to think more about whether this intersects well with the current design or if
it's impractical, but it's worth exploring.

> Attachments:
> * v5-0001-Store-information-about-range-table-flattening-in.patch

contrib/pg_overexplain/pg_overexplain.c

+               /* Advance to next SubRTInfo, if it's time. */
+               if (lc_subrtinfo != NULL)
+               {
+                       next_rtinfo = lfirst(lc_subrtinfo);
+                       if (rti > next_rtinfo->rtoffset)

Should the test be >= not >? Unless I am reading this wrong, when rti == rtoffset, that's the first entry of the new
subplan's range table. That would mean that the current logic skips displaying the subplan name for the first RTE of
each subplan.

in src/include/nodes/plannodes.h there is:

+typedef struct SubPlanRTInfo
+{
+       NodeTag         type;
+       const char *plan_name;
+       Index           rtoffset;
+       bool            dummy;
+} SubPlanRTInfo;

This is where I get confused: if rtoffset is an Index, then the comparison (above) in pg_overexplain uses rti >
next_rtinfo->rtoffset, where rti starts at 1. If rtoffset is 0 for the first subplan, the logic might be off-by-one, no?

> This commit teaches pg_overexplain'e RANGE_TABLE option to make use
Minor nit in the commit message, "pg_overexplain'e" should be "pg_overexplain's"

> * v5-0002-Store-information-about-elided-nodes-in-the-final.patch

+/*
+ * Record some details about a node removed from the plan during setrefs
+ * procesing, for the benefit of code trying to reconstruct planner decisions
+ * from examination of the final plan tree.
+ */

Nit, "procesing" should be "processing"

> * v5-0003-Store-information-about-Append-node-consolidation.patch

src/backend/optimizer/path/allpaths.c

/* Now consider each interesting sort ordering */
foreach(lcp, all_child_pathkeys)
{
        List       *subpaths = NIL;
        bool            subpaths_valid = true;
+       List       *subpath_cars = NIL;
        List       *startup_subpaths = NIL;
        bool            startup_subpaths_valid = true;
+       List       *startup_subpath_cars = NIL;
        List       *partial_subpaths = NIL;
+       List       *partial_subpath_cars = NIL;
        List       *pa_partial_subpaths = NIL;
        List       *pa_nonpartial_subpaths = NIL;
+       List       *pa_subpath_cars = NIL;

I find "cars" a bit cryptic (albeit clever), I think I've decoded it properly and it stands for
"child_append_relid_sets", correct?  Could you add a comment or use a clearer name like subpath_child_relids or
consolidated_relid_sets?


+accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths,
+                                                 List **child_append_relid_sets)
 {
        if (IsA(path, AppendPath))
        {
@@ -2219,6 +2256,8 @@ accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths)
                if (!apath->path.parallel_aware || apath->first_partial_path == 0)
                {
                        *subpaths = list_concat(*subpaths, apath->subpaths);
+                       *child_append_relid_sets =
+                               lappend(*child_append_relid_sets, path->parent->relids);

Is it possible that when pulling up multiple subpaths from an AppendPath, only ONE relid set is added to
child_append_relid_sets, but MULTIPLE paths are added to subpaths?  If so, that would break the correspondence between
the lists, which would be bad, right?

src/include/nodes/pathnodes.h
+ * Whenever accumulate_append_subpath() allows us to consolidate multiple
+ * levels of Append paths are consolidated down to one, we store the RTI
+ * sets for the omitted paths in child_append_relid_sets. This is not necessary
+ * for planning or execution; we do it for the benefit of code that wants
+ * to inspect the final plan and understand how it came to be.

Minor: "paths are consolidated" is redundant, should be "paths consolidated" or "allows us to consolidate".

> * v5-0004-Allow-for-plugin-control-over-path-generation-str.patch

src/backend/optimizer/path/costsize.c
+       else
+               enable_mask |= PGS_CONSIDER_NONPARTIAL;

-       path->disabled_nodes = enable_seqscan ? 0 : 1;
+       path->disabled_nodes =
+               (baserel->pgs_mask & enable_mask) == enable_mask ? 0 : 1;

When parallel_workers > 0 the path is partial and doesn't need PGS_CONSIDER_NONPARTIAL. But if parallel_workers == 0,
it's non-partial and DOES need it, right?  Would this mean that non-partial paths can be disabled even when the scan
type itself (e.g., PGS_SEQSCAN) is enabled?  Intentional?

> * v5-0005-WIP-Add-pg_plan_advice-contrib-module.patch

It seems this is still WIP with a solid start, I'm not going to dig too much into it. :)

Keep it up, best.

-greg



Re: pg_plan_advice

From
Jacob Champion
Date:
Hello,

On Fri, Dec 5, 2025 at 11:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 014f9a831a320666bf2195949f41710f970c54ad removes the need for what was
> previously 0004, so here is a new patch series with that dropped, to
> avoid confusing cfbot or human reviewers.

I really like this idea! Telling the planner, "if you need to make a
decision for [this thing], choose [this way]," seems to be a really
nice way of sidestepping many of the concerns with "user control".

I've started an attempt to throw a fuzzer at this, because I'm pretty
useless when it comes to planner/optimizer review. I don't really know
what the overall fuzzing strategy is going to be, given the multiple
complicated inputs that have to be constructed and somehow correlated
with each other, but I'll try to start small and expand:

a) fuzz the parser first, because it's easy and we can get interesting inputs
b) fuzz the AST utilities, seeded with "successful" corpus members from a)
c) stare really hard at the corpus of b) and figure out how to
usefully mutate a PlannedStmt with it
d) use c) to fuzz pgpa_plan_walker, then pgpa_output_advice, then...?

I'm in the middle of an implementation of b) now, and it noticed the
following code (which probably bodes well for the fuzzer itself!):

>        if (rid->partnsp == NULL)
>            result = psprintf("%s/%s", result,
>                              quote_identifier(rid->partnsp));

I assume that should be quote_identifier(rid->partrel)?

= Other Notes =

GCC 11 complains about the following code in pgpa_collect_advice():

>        dsa_area   *area = pg_plan_advice_dsa_area();
>        dsa_pointer ca_pointer;
>
>        pgpa_make_collected_advice(userid, dbid, queryId, now,
>                                   query_string, advice_string, area,
>                                   &ca_pointer);
>        pgpa_store_shared_advice(ca_pointer);

It doesn't know that area is guaranteed to be non-NULL, so it can't
prove that ca_pointer is initialized.

(GCC also complains about unique_nonjoin_rtekind() not initializing
the rtekind, but I think that's because of a bug [1].)

--Jacob

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107838



Re: pg_plan_advice

From
Robert Haas
Date:
On Mon, Dec 8, 2025 at 3:39 PM Greg Burd <greg@burd.me> wrote:
> Thanks for working on this!  I think the idea has merit, I hope it lands sometime soon.

Thanks. I think it needs a good deal more review first, but I
appreciate the support.

> > Attachments:
> > * v5-0001-Store-information-about-range-table-flattening-in.patch
>
> contrib/pg_overexplain/pg_overexplain.c
>
> +               /* Advance to next SubRTInfo, if it's time. */
> +               if (lc_subrtinfo != NULL)
> +               {
> +                       next_rtinfo = lfirst(lc_subrtinfo);
> +                       if (rti > next_rtinfo->rtoffset)
>
> Should the test be >= not >? Unless I am reading this wrong, when rti == rtoffset, that's the first entry of the
> new subplan's range table. That would mean that the current logic skips displaying the subplan name for the first
> RTE of each subplan.

I don't think so. I think I actually had it that way at one point, and
I believe I found that it was wrong. RTIs are 1-based, so the smallest
per-subquery RTI is 1. rtoffset is the amount that must be added to
the per-subquery RTI to get a "flat" RTI that can be used to index
into the final range table. But if you find that theoretical argument
unconvincing, by all means please test it and see what happens!
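To make the arithmetic concrete, a hypothetical example (the numbers are made up for illustration):

```sql
-- Suppose the outer query's flattened range table already has 4
-- entries when a subquery contributing 2 more is flattened in.
-- The subquery's SubPlanRTInfo then has rtoffset = 4, and its
-- per-subquery RTIs 1 and 2 map to flat RTIs 5 and 6.  So the
-- first flat RTI belonging to the subplan satisfies
-- rti > rtoffset (5 > 4), not rti >= rtoffset.
```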

> > This commit teaches pg_overexplain'e RANGE_TABLE option to make use
> Minor nit in the commit message, "pg_overexplain'e" should be "pg_overexplain's"

Thanks, fixed in my local branch.

> > * v5-0002-Store-information-about-elided-nodes-in-the-final.patch
>
> +/*
> + * Record some details about a node removed from the plan during setrefs
> + * procesing, for the benefit of code trying to reconstruct planner decisions
> + * from examination of the final plan tree.
> + */
>
> Nit, "procesing" should be "processing"

Thanks, fixed in my local branch.

> > * v5-0003-Store-information-about-Append-node-consolidation.patch
>
> src/backend/optimizer/path/allpaths.c
>
> /* Now consider each interesting sort ordering */
> foreach(lcp, all_child_pathkeys)
> {
>         List       *subpaths = NIL;
>         bool            subpaths_valid = true;
> +       List       *subpath_cars = NIL;
>         List       *startup_subpaths = NIL;
>         bool            startup_subpaths_valid = true;
> +       List       *startup_subpath_cars = NIL;
>         List       *partial_subpaths = NIL;
> +       List       *partial_subpath_cars = NIL;
>         List       *pa_partial_subpaths = NIL;
>         List       *pa_nonpartial_subpaths = NIL;
> +       List       *pa_subpath_cars = NIL;
>
> I find "cars" a bit cryptic (albeit clever), I think I've decoded it properly and it stands for
> "child_append_relid_sets", correct?  Could you add a comment or use a clearer name like subpath_child_relids or
> consolidated_relid_sets?

I certainly admit that this is a bit too clever. I am not entirely
sure how to make it less clever. There needs to be a
child-append-relid-sets list corresponding to every current and future
subpath list, and the names of some of those subpath lists are already
quite long, so whatever naming convention we choose for the "cars"
lists had better not add too much more length to the variable name. I
felt like someone looking at this might initially be confused by what
"cars" meant, but then I thought that they would probably look at how
the variable was used and see that it was for example being passed as
the second argument to get_singleton_append_subpath(), which is named
child_append_relid_sets, or being passed to create_append_path or
create_merge_append_path, which also use that naming. I figured that
this would clear up the confusion pretty quickly. I could certainly
add a comment above this block of variable assignments saying
something like "for each list of paths, we must also maintain a list
of child append relid sets, etc. etc." but I worried that this would
create as much confusion as it solved, i.e. somebody reading the code
would be going: why is this comment here? Is it trying to tell me that
there's something weirder going on than what is anyway obvious?

If I get more opinions that some clarification is needed here, I'm
happy to change it, especially if those opinions agree with each other
on exactly what to change, but I think for now I'll leave it as it is.

> +accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths,
> +                                                 List **child_append_relid_sets)
>  {
>         if (IsA(path, AppendPath))
>         {
> @@ -2219,6 +2256,8 @@ accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths)
>                 if (!apath->path.parallel_aware || apath->first_partial_path == 0)
>                 {
>                         *subpaths = list_concat(*subpaths, apath->subpaths);
> +                       *child_append_relid_sets =
> +                               lappend(*child_append_relid_sets, path->parent->relids);
>
> Is it possible that when pulling up multiple subpaths from an AppendPath, only ONE relid set is added to
> child_append_relid_sets, but MULTIPLE paths are added to subpaths?  If so, that would break the correspondence
> between the lists, which would be bad, right?

That would indeed be bad, but I'm not clear on how you think it could
happen. Can you clarify?

> src/include/nodes/pathnodes.h
> + * Whenever accumulate_append_subpath() allows us to consolidate multiple
> + * levels of Append paths are consolidated down to one, we store the RTI
> + * sets for the omitted paths in child_append_relid_sets. This is not necessary
> + * for planning or execution; we do it for the benefit of code that wants
> + * to inspect the final plan and understand how it came to be.
>
> Minor: "paths are consolidated" is redundant, should be "paths consolidated" or "allows us to consolidate".

Thanks, fixed in my local branch.

> > * v5-0004-Allow-for-plugin-control-over-path-generation-str.patch
>
> src/backend/optimizer/path/costsize.c
> +       else
> +               enable_mask |= PGS_CONSIDER_NONPARTIAL;
>
> -       path->disabled_nodes = enable_seqscan ? 0 : 1;
> +       path->disabled_nodes =
> +               (baserel->pgs_mask & enable_mask) == enable_mask ? 0 : 1;
>
> When parallel_workers > 0 the path is partial and doesn't need PGS_CONSIDER_NONPARTIAL. But if parallel_workers == 0,
> it's non-partial and DOES need it, right?  Would this mean that non-partial paths can be disabled even when the
> scan type itself (e.g., PGS_SEQSCAN) is enabled?  Intentional?

See this comment:

 * Finally, unsetting PGS_CONSIDER_NONPARTIAL disables all non-partial paths
 * except those that use Gather or Gather Merge. In most other cases, a
 * plugin can nudge the planner toward a particular strategy by disabling
 * all of the others, but that doesn't work here: unsetting PGS_SEQSCAN,
 * for instance, would disable both partial and non-partial sequential scans.

> It seems this is still WIP with a solid start, I'm not going to dig too much into it. :)
>
> Keep it up, best.

Thanks for the review so far!


--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Mon, Dec 8, 2025 at 8:19 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> I really like this idea! Telling the planner, "if you need to make a
> decision for [this thing], choose [this way]," seems to be a really
> nice way of sidestepping many of the concerns with "user control".
>
> I've started an attempt to throw a fuzzer at this, because I'm pretty
> useless when it comes to planner/optimizer review. I don't really know
> what the overall fuzzing strategy is going to be, given the multiple
> complicated inputs that have to be constructed and somehow correlated
> with each other, but I'll try to start small and expand:
>
> a) fuzz the parser first, because it's easy and we can get interesting inputs
> b) fuzz the AST utilities, seeded with "successful" corpus members from a)
> c) stare really hard at the corpus of b) and figure out how to
> usefully mutate a PlannedStmt with it
> d) use c) to fuzz pgpa_plan_walker, then pgpa_output_advice, then...?

Cool. I'm bad at fuzzing, but I think fuzzing by someone who is good
at it is very promising for this kind of patch.

> I'm in the middle of an implementation of b) now, and it noticed the
> following code (which probably bodes well for the fuzzer itself!):
>
> >        if (rid->partnsp == NULL)
> >            result = psprintf("%s/%s", result,
> >                              quote_identifier(rid->partnsp));
>
> I assume that should be quote_identifier(rid->partrel)?

Yes, thanks. Fixed locally. By the way, if your fuzzer can also
produce some test cases to add to contrib/pg_plan_advice/sql for cases like
this, that would be quite helpful. Ideally I would have caught this
with a manually-written test case, but obviously that didn't happen.

> = Other Notes =
>
> GCC 11 complains about the following code in pgpa_collect_advice():
>
> >        dsa_area   *area = pg_plan_advice_dsa_area();
> >        dsa_pointer ca_pointer;
> >
> >        pgpa_make_collected_advice(userid, dbid, queryId, now,
> >                                   query_string, advice_string, area,
> >                                   &ca_pointer);
> >        pgpa_store_shared_advice(ca_pointer);
>
> It doesn't know that area is guaranteed to be non-NULL, so it can't
> prove that ca_pointer is initialized.

I don't know what to do about that. I can understand why it might be
unable to prove that, but I don't see an obvious way to change the
code that would make life easier. I could add Assert(area != NULL)
before the call to pgpa_make_collected_advice() if that helps.

> (GCC also complains about unique_nonjoin_rtekind() not initializing
> the rtekind, but I think that's because of a bug [1].)

This one could be fixed with a dummy initialization, if needed.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Amit Langote
Date:
Hi Robert,

On Thu, Oct 30, 2025 at 11:00 PM Robert Haas <robertmhaas@gmail.com> wrote:
> As I have mentioned on previous threads, for the past while I have
> been working on planner extensibility. I've posted some extensibility
> patches previously, and got a few of them committed in
> September/October with Tom's help, but I think the time has come for a
> patch which actually makes use of that infrastructure as well as some
> further infrastructure that I'm also including in this posting.[1] The
> final patch in this series adds a new contrib module called
> pg_plan_advice. Very briefly, what pg_plan_advice knows how to do is
> process a plan and emit a (potentially long) text string in a
> special-purpose mini-language that describes a bunch of key planning
> decisions, such as the join order, selected join methods, types of
> scans used to access individual tables, and where and how
> partitionwise join and parallelism were used. You can then set
> pg_plan_advice.advice to that string to get a future attempt to plan
> the same query to reproduce those decisions, or (maybe a better idea)
> you can trim that string down to constrain some decisions (e.g. the
> join order) but not others (e.g. the join methods), or (if you want to
> make your life more exciting) you can edit that advice string and
> thereby attempt to coerce the planner into planning the query the way
> you think best. There is a README that explains the design philosophy
> and thinking in a lot more detail, which is a good place to start if
> you're curious, and I implore you to read it if you're interested, and
> *especially* if you're thinking of flaming me.

Thanks for posting this.  Looks very interesting to me.

These are just high-level comments after browsing the patches and
reading some bits like pgpa_identifier to get myself familiarized with
the project.  I like that the key concept here is plan stability
rather than plan control, because that framing makes it easier to
treat this as infrastructure instead of policy.

> I want to mention that, beyond the fact that I'm sure some people will
> want to use something like this (with more features and a lot fewer
> bugs) in production, it seems to be super-useful for testing. We have
> a lot of regression test cases that try to coerce the planner to do a
> particular thing by manipulating enable_* GUCs, and I've spent a lot
> of time trying to do similar things by hand, either for regression
> test coverage or just private testing. This facility, even with all of
> the bugs and limitations that it currently has, is exponentially more
> powerful than frobbing enable_* GUCs. Once you get the hang of the
> advice mini-language, you can very quickly experiment with all sorts
> of plan shapes in ways that are currently very hard to do, and thereby
> find out how expensive the planner thinks those things are and which
> ones it thinks are even legal. So I see this as not only something
> that people might find useful in production deployments, but also
> something that can potentially be really useful to advance PostgreSQL
> development.

+1, the testing benefits make this worthwhile.

> Which brings me to the question of where this code ought to go if it
> goes anywhere at all. I decided to propose pg_plan_advice as a contrib
> module rather than a part of core because I had to make a WHOLE lot of
> opinionated design decisions just to get to the point of having
> something that I could post and hopefully get feedback on. I figured
> that all of those opinionated decisions would be a bit less
> unpalatable if they were mostly encapsulated in a contrib module, with
> the potential for some future patch author to write a different
> contrib module that adopted different solutions to all of those
> problems. But what I've also come to realize is that there's so much
> infrastructure here that leaving the next person to reinvent it may
> not be all that appealing. Query jumbling is a previous case where we
> initially thought that different people might want to do different
> things, but eventually realized that most people really just wanted
> some solution that they didn't have to think too hard about. Likewise,
> in this patch, the relation identifier system described in the README
> is the only thing of its kind, to my knowledge, and any system that
> wants to accomplish something similar to what pg_plan_advice does
> would need a system like that. pg_hint_plan doesn't have something
> like that, because pg_hint_plan is just trying to do hints. This is
> trying to do round-trip-safe plan stability, where the system will
> tell you how to refer unambiguously to a certain part of the query in
> a way that will work correctly on every single query regardless of how
> it's structured or how many times it refers to the same tables or to
> different tables using the same aliases. If we say that we're never
> going to put any of that infrastructure in core, then anyone who wants
> to write a module to control the planner is going to need to start by
> either (a) reinventing something similar, (b) cloning all the relevant
> code, or (c) just giving up on the idea of unambiguous references to
> parts of a query. None of those seem like great options, so now I'm
> less sure whether contrib is actually the right place for this code,
> but that's where I have put it for now. Feedback welcome, on this and
> everything else.

On the relation identifier system: IMHO this part doesn't seem as
opinionated as the advice mini-language. The requirements pretty much
dictate the design -- you need alias names and occurrence counters to
handle self-joins, partition fields for partitioned tables, and a
string representation to survive dump/restore. There doesn't seem to
be much flexibility in that.

Given that, it seems more practical to put this in core from the
start. Extensions that might want to build plan-advice-like
functionality shouldn’t have to clone this logic and wait another
release for something that’s already well-defined and deterministic.
The mini-language is opinionated and belongs in contrib, but the
identifier infrastructure just solves a fundamental problem cleanly.

On the infrastructure patches (0001-0005): these look sensible. The
range table flattening info, elided node tracking, and append node
consolidation preserve information that's currently lost -- there's
some additional overhead to track this, but it's fixed per-relation
per-subquery, which seems reasonable.  The path generation hooks
(0005) are a clear improvement: moving from global enable_* GUCs to
per-RelOptInfo pgs_mask gives extensions the granularity they need for
relation-specific and join-specific decisions. Yes, you need C code to
use them, but you'd need to write C code to do something of value in
this area anyway, and the hooks give you control that GUCs can't
provide.

Overall, I'm supportive of getting these committed once they're ready.
contrib/pg_plan_advice is a compelling proof-of-concept for why these
hooks are needed.

I'll try to post more specific comments once I've read this some more.

--
Thanks, Amit Langote



Re: pg_plan_advice

From
Jakub Wartak
Date:
On Fri, Dec 5, 2025 at 8:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
[..]
> 014f9a831a320666bf2195949f41710f970c54ad removes the need for what was
> previously 0004, so here is a new patch series with that dropped, to
> avoid confusing cfbot or human reviewers.

Quick question regarding cross-interactions between extensions: would
it be possible for auto_explain to have something like
auto_explain.log_custom_options='PLAN_ADVICES' so that it could dump
the advice for the queries involved? I can see there is
ApplyExtensionExplainOption(), which would probably have to be used
by auto_explain. Or is there some better way, or perhaps it goes
against the design, or it's just outside the initial scope?
This would solve two problems:
a) sometimes explaining manually (in psql) is simply not realistic,
as the queries are only ever run by the application
b) auto_explain could log nested queries and could print plan advice
along the way, which can otherwise be a very painful process
(reverse-engineering how the optimizer would name things in more
complex queries run from inside PL/pgSQL functions)

BTW, some feedback: plan advice (plan fixing) seems to work fine
for nested queries inside PL/pgSQL, and I've also discovered (?) that
one can already do the following with the patchset:
   alter function blah(bigint) set pg_plan_advice.advice =
'NESTED_LOOP_MATERIALIZE(b)';
which seems to be pretty cool, because it allows more targeted fixes
without even having capability of fixing plans for specific query_id
(as discussed earlier).

For the generation part, the only remaining thing is how it integrates
with partitions (especially the ones being dynamically created/dropped
over time). Right now one needs to keep the advice(s) in sync after
altering the partitions, but one could expect some form of
regexp/partition-templating to be built into pg_plan_advice
instead. Anyway, I think this one should just go into the
documentation as a known limitation for now.

While scratching my head over how to prove that this is not crashing,
I've also checked the cases below (TL;DR: all OK):
1. PG_TEST_INITDB_EXTRA_OPTS="-c
shared_preload_libraries='pg_plan_advice'"  meson test  # It was clean
2. PG_TEST_INITDB_EXTRA_OPTS="-c
shared_preload_libraries='pg_plan_advice'" PGOPTIONS="-c
pg_plan_advice.advice=NESTED_LOOP_MATERIALIZE(certainlynotused)" meson
test # This had several failures, but all is OK: it's just that some
of them had additional (expected) text in regression.diffs:
NESTED_LOOP_MATERIALIZE(certainlynotused) /* not matched */
3. PG_TEST_INITDB_EXTRA_OPTS="-c
shared_preload_libraries='pg_plan_advice' -c
pg_plan_advice.shared_collection_limit=42"  meson test # It was clean
too

-J.



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Dec 10, 2025 at 6:43 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> Quick-question regarding cross-interactions of the extensions: would
> it be possible for auto_explain to have something like
> auto_explain.log_custom_options='PLAN_ADVICES' so that it could be
> dumping the advice of the queries involved

Yes, I had the same idea. I think the tricky part here is that an
option can have an argument. Most options will probably have Boolean
arguments, but there are existing in-core counterexamples, such as
FORMAT. We should try to design the GUC in such a way that it can
be used either to set a Boolean option by just naming it, or to set
an option to a value by writing both. Maybe it's
fine if the GUC value is just a comma-separated list of entries, and
each entry can either be an option name or an option name followed by
a space followed by an option value, i.e. if FORMAT were custom, then
you could write auto_explain.log_custom_options='format xml,
plan_advice' or auto_explain.log_custom_options='plan_advice true,
range_table false' and have sensible things happen. In fact, very
possibly the GUC should just accept any options whether in-core or
out-of-core and not distinguish, so it would be more like
auto_explain.log_options.
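To make the proposed GUC format concrete, here is a rough sketch, not taken from any patch (the function name and fixed-size buffers are invented), of splitting such a comma-separated options string into name/value pairs, with a bare name implying a Boolean "true":

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: split a GUC value such as "format xml,
 * plan_advice" into entries.  Each entry is either an option name
 * alone (implying Boolean "true") or a name, a space, and a value.
 */
static int
parse_log_options(const char *guc, char names[][64], char values[][64],
				  int max_entries)
{
	int			n = 0;
	const char *p = guc;

	while (*p != '\0' && n < max_entries)
	{
		/* skip commas and whitespace between entries */
		while (*p == ' ' || *p == ',')
			p++;
		if (*p == '\0')
			break;

		const char *comma = strchr(p, ',');
		const char *end = comma ? comma : p + strlen(p);

		/* trim trailing whitespace inside the entry */
		while (end > p && end[-1] == ' ')
			end--;

		/* a space inside the entry separates the name from the value */
		const char *space = memchr(p, ' ', (size_t) (end - p));

		if (space != NULL)
		{
			size_t		nlen = (size_t) (space - p);

			memcpy(names[n], p, nlen);
			names[n][nlen] = '\0';
			while (space < end && *space == ' ')
				space++;
			memcpy(values[n], space, (size_t) (end - space));
			values[n][end - space] = '\0';
		}
		else
		{
			size_t		nlen = (size_t) (end - p);

			memcpy(names[n], p, nlen);
			names[n][nlen] = '\0';
			strcpy(values[n], "true");	/* bare name implies Boolean true */
		}
		n++;
		p = comma ? comma + 1 : end;
	}
	return n;
}
```

With this shape, `'format xml, plan_advice'` and `'plan_advice true, range_table false'` both parse into two name/value pairs, matching the examples above.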

> BTW, some feedback: the plan advices (plan fixing) seems to work fine
> for nested queries inside PLPGSQL, and also I've discovered (?) that
> one can do even today with patchset the following:
>    alter function blah(bigint) set pg_plan_advice.advice =
> 'NESTED_LOOP_MATERIALIZE(b)';
> which seems to be pretty cool, because it allows more targeted fixes
> without even having capability of fixing plans for specific query_id
> (as discussed earlier).

Yes, this is a big advantage of reusing the GUC machinery for this
purpose (but see the thread on "[PATCH] Allow complex data for GUC
extra").

> For the generation part, the only remaining thing is how it integrates
> with partitions (especially the ones being dynamically created/dropped
> over time). Right now one needs to keep the advice(s) in sync after
> altering the partitions, but it could be expected that some form of
> regexp/partition-templating would be built into pg_plan_advices
> instead. Anyway, I think this one should go into documentation just as
> known-limitations for now.

Right. I don't think trying to address this at this stage makes sense.
To maintain my sanity, I want to focus for now only on things that
round-trip: that is, we can generate it, and then we can accept that
same stuff. If we're using a parallel plan for every partition e.g.
they are all sequential scans or all index scans, we could generate
SEQ_SCAN(foo/*) or similar and then we could accept that. But figuring
that out would take a bunch of additional infrastructure that I don't
have the time or energy to create right this minute, and I don't see
it as anywhere close to essential for v1. Some other problems here:

1. What happens when a small number of partitions are different? The
code puts quite a bit of energy into detecting conflicting advice, and
honestly probably should put even more, and you might say, well, if
there's just one partition that used an index scan, then I still want
the advice to read SEQ_SCAN(foo/*) INDEX_SCAN(foo/foo23 foo23_a_idx)
and not signal a conflict, but that's slightly unprincipled.

2. INDEX_SCAN() specifications and similar will tend to be different
for every partition, because the index names will be different for
every partition. You might want something that says "for
each partition of foo, use the index on that partition that is a child
of this index on the parent".

Long run, there's a lot of things that can be added to this to make it
more concise (and more expressive, too). Another similar idea is to
have something like NO_GATHER_UNLESS_I_SAID_SO() so that a
non-parallel query doesn't have to do NO_GATHER(every single relation
including all the partitions). I'm pretty sure this is a valuable
idea, but, again, it's not essential for v1.

> While scratching my head on how to prove that this is not crashing
> I've also checked below ones (TLDR all ok):
> 1. PG_TEST_INITDB_EXTRA_OPTS="-c
> shared_preload_libraries='pg_plan_advice'"  meson test  # It was clean
> 2. PG_TEST_INITDB_EXTRA_OPTS="-c
> shared_preload_libraries='pg_plan_advice'" PGOPTIONS="-c
> pg_plan_advice.advice=NESTED_LOOP_MATERIALIZE(certainlynotused)" meson
> test # This had several failures, but all is OK: it's just some of
> them had to additional (expected) text inside regression.diffs:
> NESTED_LOOP_MATERIALIZE(certainlynotused) /* not matched */
> 3. PG_TEST_INITDB_EXTRA_OPTS="-c
> shared_preload_libraries='pg_plan_advice' -c
> pg_plan_advice.shared_collection_limit=42"  meson test # It was clean
> too

You can set pg_plan_advice.always_explain_supplied_advice=false to
clean up some of the noise here. This kind of testing is why I
invented that option. I think that in production, we REALLY REALLY
want any supplied advice to show up in the EXPLAIN plan even if the
user did not specify the PLAN_ADVICE option to EXPLAIN. Otherwise,
understanding what is going on with an EXPLAIN plan that a
hypothetical customer sends to a hypothetical PostgreSQL expert who
has to support said hypothetical customer will be a miserable
experience. But for testing purposes, it's nice to be able to shut it
off so you don't get random regression diffs.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Dec 10, 2025 at 6:20 AM Amit Langote <amitlangote09@gmail.com> wrote:
> These are just high-level comments after browsing the patches and
> reading some bits like pgpa_identifier to get myself familiarized with
> the project.  I like that the key concept here is plan stability
> rather than plan control, because that framing makes it easier to
> treat this as infrastructure instead of policy.

Thanks, I agree. I'm sure people will use this for plan control, but
if you start with that, then it's really unclear what things you
should allow to be controlled and what things not. Defining the focus
as plan stability makes round-trip safety a priority, and it limits
the scope of what you can request to what the planner could have
generated had the costing come out just so. There are still some
definitional questions at the margin, but IMHO it's much less fuzzy.

> On the relation identifier system: IMHO this part doesn't seem as
> opinionated as the advice mini-language. The requirements pretty much
> dictate the design -- you need alias names and occurrence counters to
> handle self-joins, partition fields for partitioned tables, and a
> string representation to survive dump/restore. There doesn't seem to
> be much flexibility in that.

Right. There's some flexibility. For instance, you could handle
partitions using occurrence numbers, which would actually save a bunch
of code, but that seems obviously worse in terms of user experience.
Also, you could if you wanted key it off of the name of the table
rather than the relation alias used for the table. I think that's also
worse but possibly it's debatable. You could change the order of the
pieces in the representation; e.g. maybe plan_name should come first
rather than last; or you could change the separator characters. But,
honestly, none of that strikes me as sufficient grounds to want
multiple implementations. If the choices I've made don't seem good to
other people, then we should just change them and hopefully find
something everybody can live with. It's a bit like the way that
extension SQL scripts use "--" as a separator: maybe not everybody
agrees that this is the absolutely most elegant choice, but nobody's
proposing a second version of the extension mechanism just to do
something different.
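To illustrate why the ordering-and-separator choices are cosmetic rather than structural: the question boils down to a small formatting function along these lines. The `#` and `@` separators and the field layout here are invented for the example and are not the syntax pg_plan_advice actually uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Invented example syntax: an alias, an occurrence number that
 * disambiguates self-joins, and an optional plan name.  Any
 * consistent choice of ordering and separators would round-trip
 * equally well; only the user experience differs.
 */
static void
format_relation_id(char *buf, size_t buflen,
				   const char *alias, int occurrence, const char *plan_name)
{
	if (plan_name != NULL)
		snprintf(buf, buflen, "%s#%d@%s", alias, occurrence, plan_name);
	else
		snprintf(buf, buflen, "%s#%d", alias, occurrence);
}
```

Under this invented scheme, the second scan of alias `t` in a self-join would come out as `t#2`, distinct from `t#1`, which is the disambiguation property the occurrence counter exists to provide.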

> Given that, it seems more practical to put this in core from the
> start. Extensions that might want to build plan-advice-like
> functionality shouldn’t have to clone this logic and wait another
> release for something that’s already well-defined and deterministic.
> The mini-language is opinionated and belongs in contrib, but the
> identifier infrastructure just solves a fundamental problem cleanly.

It's not quite as easy to make a sharp distinction between these
things as someone might hope. Note that the lexer and parser handle
the whole mini-language, which includes parsing the relation
identifiers. That doesn't of course mean that the code to *generate*
relation identifiers couldn't be in core, and I actually had it that
way at one point, but it's not very much code and I wasn't too
impressed with how that turned out. It seemed to couple the core code
to the extension more tightly than necessary for not much real
benefit.

But that's not to say I disagree with you categorically. Suppose we
decided (and I'm not saying we should) to start showing relation
identifiers in EXPLAIN output instead of identifying things in EXPLAIN
output as we do today. Maybe we even decide to show elided subqueries
and similar as first-class parts of the EXPLAIN output, also using
relation identifier syntax. That would be a pretty significant change,
and would destabilize a WHOLE LOT of regression test outputs, but then
relation identifiers become a first-class PostgreSQL concept that
everyone who looks at EXPLAIN output will encounter and, probably,
come to understand. Then obviously the relation identifier generation
code needs to be in core, and that makes total sense because we're
actually using it for something, and arguably we've made life easier
for everyone who wants to use pg_plan_advice in the future because
they're already familiar with the identifier convention. The downside
is everyone has to get used to the new EXPLAIN output even if they
don't care about pg_plan_advice or hate it with a fiery passion.

So my point here is that there are things we can decide to do to make
some or all of this "core," but IMHO it's not just as simple as saying
"this is in, that's out". It's more about deciding what the end state
ought to look like, and how integrated this stuff ought to be into the
fabric of PostgreSQL. I started with the minimal level of integration:
little pieces of core infrastructure, all used by a giant extension.
Now we need to either decide that's where we want to settle, or decide
to push to some greater or lesser degree toward more integration.

> On the infrastructure patches (0001-0005): these look sensible. The
> range table flattening info, elided node tracking, and append node
> consolidation preserve information that's currently lost -- there's
> some additional overhead to track this, but it's fixed per-relation
> per-subquery, which seems reasonable.  The path generation hooks
> (0005) are a clear improvement: moving from global enable_* GUCs to
> per-RelOptInfo pgs_mask gives extensions the granularity they need for
> relation-specific and join-specific decisions. Yes, you need C code to
> use them, but you'd need to write C code to do something of value in
> this area anyway, and the hooks give you control that GUCs can't
> provide.
>
> Overall, I'm supportive of getting these committed once they're ready.
> contrib/pg_plan_advice is a compelling proof-of-concept for why these
> hooks are needed.

Great. I don't think there's anything terribly controversial in
0001-0004. I think the comments and so on might need improving and
there could be little mini-bugs or whatever, but basically I think
they work and I don't anticipate any major problems. However, I'd want
at least one other person to do a detailed review before committing
anything. 0005 might be a little more controversial. There's some
design choices to dislike (though I believe I've made them for good
reason) and there's a question of whether it's as complete as we want.
It might be fine to commit it the way it is and just adjust it later
if we find that something ought to be different, but it's also
possible that we should think harder about some of the choices or hold
off for a bit while other parts of this effort move forward. I'm happy
to hear opinions on the best strategy here.

> I'll try to post more specific comments once I've read this some more.

Thanks for the review so far, and that sounds great!

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Corey Huinker
Date:
On Wed, Dec 10, 2025 at 9:54 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 10, 2025 at 6:20 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > These are just high-level comments after browsing the patches and
> > reading some bits like pgpa_identifier to get myself familiarized with
> > the project.  I like that the key concept here is plan stability
> > rather than plan control, because that framing makes it easier to
> > treat this as infrastructure instead of policy.
>
> Thanks, I agree. I'm sure people will use this for plan control, but
> if you start with that, then it's really unclear what things you
> should allow to be controlled and what things not. Defining the focus
> as plan stability makes round-trip safety a priority and the scope of
> what you can request is what the planner could have generated had the
> costing come out just so. There's still some definitional questions at
> the margin, but IMHO it's much less fuzzy.

I couldn't have said this any better than Amit did. In my experience, the lack of a plan stability feature is far and away the most-cited reason for not porting to PostgreSQL. They want query plan stability first and foremost. The amount of plan tweaking they do is actually pretty minimal; once they get good-enough performance during user acceptance, they want to encase those query plans in amber, because that's what the customer signed off on. After that, they're happy to scan the performance trendlines, and only make tweaks when it's worth a change request.

> But that's not to say I disagree with you categorically. Suppose we
> decided (and I'm not saying we should) to start showing relation
> identifiers in EXPLAIN output instead of identifying things in EXPLAIN
> output as we do today. Maybe we even decide to show elided subqueries
> and similar as first-class parts of the EXPLAIN output, also using
> relation identifier syntax. That would be a pretty significant change,
> and would destabilize a WHOLE LOT of regression test outputs, but then
> relation identifiers become a first-class PostgreSQL concept that
> everyone who looks at EXPLAIN output will encounter and, probably,
> come to understand.

I think the change would be worth the destabilization, because it makes it so much easier to talk about complex query plans. Additionally, it would make it reasonable to programmatically extract portions of a plan, allowing for much more fine-grained regression tests regarding plans.

Showing the elided subqueries would be a huge benefit, making visible the optimizations that the planner is giving you "for free".
 
> On the infrastructure patches (0001-0005): these look sensible. The
> range table flattening info, elided node tracking, and append node

One thing I am curious about is that by tracking the elided nodes, would it make more sense in the long run to have the initial post-naming plan tree be immutable, and generate a separate copy minus the elided parts?

Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Dec 10, 2025 at 4:09 PM Corey Huinker <corey.huinker@gmail.com> wrote:
> I think the change would be worth the destabilization, because it makes it so much easier to talk about complex query
> plans. Additionally, it would make it reasonable to programmatically extract portions of a plan, allowing for much
> more fine-grained regression tests regarding plans.

I'll wait for more votes before thinking about doing anything about
this, because I have my doubts about whether the consensus will
actually go in favor of such a large change. Or maybe someone else
would like to try mocking it up (even if somewhat imperfectly) so we
can all see just how large an impact it makes.

> > > On the infrastructure patches (0001-0005): these look sensible. The
> > > range table flattening info, elided node tracking, and append node
>
> One thing I am curious about is that by tracking the elided nodes, would it make more sense in the long run to have
> the initial post-naming plan tree be immutable, and generate a separate copy minus the elided parts?

Probably not. Having two entire copies of the plan tree would be
pretty expensive. I think that we've bet on the right idea, namely,
that the primary consumer of plan trees should be the executor, and
the primary goal should be to create plan trees that make the executor
run fast. I believe the right approach is basically what we do today:
you're allowed to put things into the plan that aren't technically
necessary for execution, if they're useful for instrumentation and
observability purposes and they don't add an unreasonable amount of
overhead. These patches basically just extend that existing principle
to a few new things.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Fri, Dec 5, 2025 at 2:57 PM Robert Haas <robertmhaas@gmail.com> wrote:
> 014f9a831a320666bf2195949f41710f970c54ad removes the need for what was
> previously 0004, so here is a new patch series with that dropped, to
> avoid confusing cfbot or human reviewers.

Here's v6, with minor improvements over v5.

0001: Unchanged.

0002, 0003: Unchanged except for typo fixes pointed out by reviewers.

0004: I've improved the hook placement, which previously made correct
unique-semijoin handling impossible, and I improved
the associated comment about how to use the hook, based on experience
trying to actually do so.

0005: Fixed a small bug related to unique-semijoin handling (other
problems remain). Tidied things up to avoid producing non-actionable
NO_GATHER() advice in a number of cases, per some off-list feedback
from Ajaykumar Pal.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From
Jacob Champion
Date:
On Tue, Dec 9, 2025 at 11:46 AM Robert Haas <robertmhaas@gmail.com> wrote:
> By the way, if your fuzzer can also
> produces some things to add contrib/pg_plan_advice/sql for cases like
> this, that would be quite helpful. Ideally I would have caught this
> with a manually-written test case, but obviously that didn't happen.

Sure! (They'll need to be golfed down.) Here are three entries that
hit the crash, each on its own line:

> join_order(qoe((nested_l oindex_scanp_plain))se(nested_loop_plain)nested_loo/_pseq_scanlain)
> join_order(qoe((nested_loop_plain))se(nested_loop_plain)nesemij/insted_loop_plain)
> gather(gather(gar(g/ther0))gtaher(gathethga))

Something the fuzzer really likes is zero-length identifiers ("").
Maybe that's by design, but I thought I'd mention it since the
standard lexer doesn't allow that and syntax.sql doesn't exercise it.

> > It doesn't know that area is guaranteed to be non-NULL, so it can't
> > prove that ca_pointer is initialized.
>
> I don't know what to do about that. I can understand why it might be
> unable to prove that, but I don't see an obvious way to change the
> code that would make life easier. I could add Assert(area != NULL)
> before the call to pgpa_make_collected_advice() if that helps.

With USE_ASSERT_CHECKING, that should help, but I'm not sure if it
does without. (I could have sworn there was a conversation about that
at some point but I can't remember any of the keywords.) Could also
just make a dummy assignment. Or tag pg_plan_advice_dsa_area() with
__attribute__((returns_nonnull)), but that's more portability work.

--Jacob



Re: pg_plan_advice

From
Robert Haas
Date:
On Thu, Dec 11, 2025 at 8:11 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Sure! (They'll need to be golfed down.) Here are three entries that
> hit the crash, each on its own line:
>
> > join_order(qoe((nested_l oindex_scanp_plain))se(nested_loop_plain)nested_loo/_pseq_scanlain)
> > join_order(qoe((nested_loop_plain))se(nested_loop_plain)nesemij/insted_loop_plain)
> > gather(gather(gar(g/ther0))gtaher(gathethga))

At least for me, setting pg_plan_advice.advice to any of these strings
does not provoke a crash. What I discovered after a bit of
experimentation is that you get the crash if you (a) set the string to
something like this and then (b) run an EXPLAIN. Turns out, I already
had a test in syntax.sql that is sufficient to provoke the crash, so,
locally, I've added 'EXPLAIN SELECT 1' after each test case in
syntax.sql that is expected to successfully alter the value of the
GUC.

> Something the fuzzer really likes is zero-length identifiers ("").
> Maybe that's by design, but I thought I'd mention it since the
> standard lexer doesn't allow that and syntax.sql doesn't exercise it.

That's not by design. I've added a matching error check locally.

> > > It doesn't know that area is guaranteed to be non-NULL, so it can't
> > > prove that ca_pointer is initialized.
> >
> > I don't know what to do about that. I can understand why it might be
> > unable to prove that, but I don't see an obvious way to change the
> > code that would make life easier. I could add Assert(area != NULL)
> > before the call to pgpa_make_collected_advice() if that helps.
>
> With USE_ASSERT_CHECKING, that should help, but I'm not sure if it
> does without. (I could have sworn there was a conversation about that
> at some point but I can't remember any of the keywords.) Could also
> just make a dummy assignment. Or tag pg_plan_advice_dsa_area() with
> __attribute__((returns_nonnull)), but that's more portability work.

As in initialize ca_pointer to InvalidDsaPointer?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Jacob Champion
Date:
On Fri, Dec 12, 2025 at 9:36 AM Robert Haas <robertmhaas@gmail.com> wrote:
> At least for me, setting pg_plan_advice.advice to any of these strings
> does not provoke a crash. What I discovered after a bit of
> experimentation is that you get the crash if you (a) set the string to
> something like this and then (b) run an EXPLAIN.

Makes sense (this fuzzer was exercising pgpa_format_advice_target()).

> > With USE_ASSERT_CHECKING, that should help, but I'm not sure if it
> > does without. (I could have sworn there was a conversation about that
> > at some point but I can't remember any of the keywords.) Could also
> > just make a dummy assignment. Or tag pg_plan_advice_dsa_area() with
> > __attribute__((returns_nonnull)), but that's more portability work.
>
> As in initialize ca_pointer to InvalidDsaPointer?

Yeah.

Next bit of fuzzer feedback: I need the following diff in
pgpa_trove_add_to_hash() to avoid a crash when the hashtable starts to
fill up:

>     element = pgpa_trove_entry_insert(hash, key, &found);
> +   if (!found)
> +       element->indexes = NULL;
>     element->indexes = bms_add_member(element->indexes, index);

The advice string that triggered this is horrific, but I can send it
to you offline if you're morbidly curious. (I can spend time to
minimize it or I can get more fuzzer coverage, and I'd rather do the
latter right now :D)
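For readers following along, the bug class here is the usual one with simplehash-style insert APIs: a freshly inserted entry comes back with uninitialized payload fields, which must be set whenever `found` is false. A generic toy sketch (not the pg_plan_advice code; the table and all names are invented):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_entry
{
	int			key;
	int			count;			/* garbage until explicitly initialized */
	bool		used;
} toy_entry;

#define TOYSIZE 8

/*
 * Toy stand-in for a simplehash-style insert: returns the existing
 * entry with *found = true, or a fresh entry with *found = false.
 * A fresh entry's payload fields are deliberately NOT initialized,
 * mirroring the behavior behind the crash being discussed.
 */
static toy_entry *
toy_insert(toy_entry *table, int key, bool *found)
{
	for (int i = 0; i < TOYSIZE; i++)
	{
		toy_entry  *e = &table[(key + i) % TOYSIZE];

		if (e->used && e->key == key)
		{
			*found = true;
			return e;
		}
		if (!e->used)
		{
			e->used = true;
			e->key = key;
			*found = false;
			return e;
		}
	}
	return NULL;				/* table full */
}

/* The fix: initialize payload fields whenever found comes back false. */
static void
toy_add(toy_entry *table, int key)
{
	bool		found;
	toy_entry  *e = toy_insert(table, key, &found);

	if (!found)
		e->count = 0;
	e->count++;
}
```

The diff quoted above is exactly this shape: `element->indexes` must be reset to NULL for a new entry before `bms_add_member()` reads it.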

--Jacob



Re: pg_plan_advice

From
Ajay Pal
Date:
During further testing of the latest version of the plan_advice patch,
I observed that the following query generates NO_GATHER advice that is
then not matched when supplied back to the query planner:

postgres=*# set local pg_plan_advice.advice='NO_GATHER("*RESULT*")';
SET
postgres=*# explain ( plan_advice) SELECT
CAST('99999999999999999999999999999999999999999999999999' AS NUMERIC);
                QUERY PLAN
-------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=32)
 Supplied Plan Advice:
   NO_GATHER("*RESULT*") /* not matched */
 Generated Plan Advice:
   NO_GATHER("*RESULT*")
(5 rows)

Thanks
Ajay



Re: pg_plan_advice

From
Jacob Champion
Date:
On Fri, Dec 12, 2025 at 10:09 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Next bit of fuzzer feedback:

And another bit, but this time I was able to minimize into a
regression case, attached.

This comment in pgpa_identifier_matches_target() seems to be incorrect:

>     /*
>      * The identifier must specify a schema, but the target may leave the
>      * schema NULL to match anything.
>      */

But I don't know whether that's because the assumption itself is
wrong, or because a layer above hasn't filtered something out before
getting to this point.

--Jacob

Attachments

Re: pg_plan_advice

From
Robert Haas
Date:
Here's v7.

In 0001, I removed "const" from a node's struct declaration, because
Tom gave me some feedback to avoid that on another recent patch, and I
noticed I had done it here also. 0002, 0003, and 0004 are unchanged.

In 0005:

- Refactored the code to avoid issuing SEMIJOIN_NON_UNIQUE() advice in
cases where uniqueness wasn't actually considered.
- Adjusted the code not to issue NO_GATHER() advice for non-relation
RTEs. (This is the issue reported by Ajay Pal in a recent message to
this thread, which was also mentioned in an XXX in the code.)
- Reject zero-length delimited identifiers, per Jacob's email.
- Properly initialize element->indexes in pgpa_trove_add_to_hash, per
Jacob's email.
- Add gather((d d/d.d)) test case, per Jacob, and fix the related bug
in pgpa_identifier_matches_target, per Jacob's email.
- Add EXPLAIN SELECT 1 after various test cases in syntax.sql, to
improve test coverage, per analysis of why the existing test case
didn't catch a bug previously reported by Jacob.
- Added a dummy initialization to pgpa_collector.c to placate nervous
compilers, per discussion with Jacob.

I think this mostly catches me up on responding to issues reported
here, although there is one thing reported to me off-list that I
haven't dealt with yet. If there's anything reported on thread that
isn't addressed here, let me know.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From
Jakub Wartak
Date:
On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> Here's v7.
[..]

OK, so I've tested today from your branch directly, so I hope that
was also v7. Given the following q20 query:

SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey in
    (SELECT ps_suppkey
     FROM partsupp
     WHERE ps_partkey in
         (SELECT p_partkey
          FROM part
          WHERE p_name LIKE 'forest%' )
       AND ps_availqty >
         (SELECT 0.5 * sum(l_quantity)
          FROM lineitem
          WHERE l_partkey = ps_partkey
            AND l_suppkey = ps_suppkey
            AND l_shipdate >= DATE '1994-01-01'
            AND l_shipdate < DATE '1994-01-01' + INTERVAL '1' year ) )
  AND s_nationkey = n_nationkey
  AND n_name = 'CANADA'
ORDER BY s_name;

in normal conditions (w/o advice) the above query generates:

 Sort  (cost=1010985030.44..1010985030.59 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=0.42..1010985028.63 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=0.42..1010985008.29
rows=1522 width=55)
               Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               ->  Materialize  (cost=0.42..1010755994.57 rows=1973 width=4)
                     ->  Nested Loop  (cost=0.42..1010755984.71
rows=1973 width=4)
                           ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                                 Filter: ((p_name)::text ~~ 'forest%'::text)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..688053.87 rows=1 width=8)
                                 Index Cond: (ps_partkey = part.p_partkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))


 Generated Plan Advice:
   JOIN_ORDER(nation (supplier (part partsupp)))
   NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
   NESTED_LOOP_MATERIALIZE(partsupp)
   SEQ_SCAN(nation supplier part lineitem@expr_1)
   INDEX_SCAN(partsupp public.pk_partsupp)
   SEMIJOIN_NON_UNIQUE((partsupp part))
   NO_GATHER(supplier nation partsupp part lineitem@expr_1)

Please see the - I think it's confusing? -
NESTED_LOOP_PLAIN(partsupp partsupp) - that's 2x the same
string? Supplying this advice turns it into the plan below -- I've
marked the problems with [X]:

 Sort  (cost=50035755.50..50035755.66 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=12562154.32..50035753.70 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=12562154.32..50035733.36
rows=1522 width=55)
             [X] -- missing Join Filter here
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               [X] -- HJ instead of Materialize+Nested Loop below:
               ->  Hash Join  (cost=12562154.32..12567002.09 rows=1 width=4)
                     Hash Cond: (part.p_partkey = partsupp.ps_partkey)
                     ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                           Filter: ((p_name)::text ~~ 'forest%'::text)
                     ->  Hash  (cost=12562154.02..12562154.02 rows=24 width=8)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..12562154.02 rows=24 width=8)
                                 [X] -- wrong Index Cond below
(suppkey instead of partkey)
                                 Index Cond: (ps_suppkey = supplier.s_suppkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))

Supplied Plan Advice:
   SEQ_SCAN(nation) /* matched */
   SEQ_SCAN(supplier) /* matched */
   SEQ_SCAN(part) /* matched */
   SEQ_SCAN(lineitem@expr_1) /* matched */
   INDEX_SCAN(partsupp public.pk_partsupp) /* matched */
   JOIN_ORDER(nation (supplier (part partsupp))) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
   SEMIJOIN_NON_UNIQUE((partsupp part)) /* matched, conflicting */
   NO_GATHER(supplier) /* matched */
   NO_GATHER(nation) /* matched */
   NO_GATHER(partsupp) /* matched */
   NO_GATHER(part) /* matched */
   NO_GATHER(lineitem@expr_1) /* matched */

So the difference is basically between:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(partsupp
partsupp) NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which causes the wrong plan and this outcome:
    NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */

and apparently proper advice like the below, which yields better:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)
NESTED_LOOP_MATERIALIZE(partsupp) [..]';
This is not what gets generated, but it causes the good plan. However,
it also prints:
   NESTED_LOOP_PLAIN(part) /* matched, conflicting, failed */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting */
and that "failed" marker seems to be untrue?

Another idea: perhaps we could have some elog(WARNING) - but not
Assert()s - in assert-enabled builds that would alert us when
duplicated entries are detected for the same ops in
pg_plan_advice_explain_feedback()?
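A self-contained sketch of that duplicate-detection idea (a plain C scan over an array of advice strings; the real check would presumably live next to pg_plan_advice_explain_feedback() and report via elog(WARNING), and `count_duplicate_advice` is an invented name):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count pairs of identical items in a NULL-terminated array of advice
 * strings. In an assert-enabled build, a caller could emit a WARNING
 * whenever this returns nonzero, as suggested above, instead of
 * silently producing advice like NESTED_LOOP_PLAIN(partsupp) twice.
 */
static int
count_duplicate_advice(const char **items)
{
    int dups = 0;

    for (int i = 0; items[i] != NULL; i++)
        for (int j = i + 1; items[j] != NULL; j++)
            if (strcmp(items[i], items[j]) == 0)
                dups++;
    return dups;
}
```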

-J.



Re: pg_plan_advice

From
Jakub Wartak
Date:
On Wed, Dec 17, 2025 at 11:12 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
>
> On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > Here's v7.
> [..]
>[..q20..]

OK, now for the q10:

 Sort
   Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric -
lineitem.l_discount)))) DESC
   ->  Finalize GroupAggregate
         Group Key: customer.c_custkey, nation.n_name
         ->  Gather Merge
               Workers Planned: 2
               ->  Partial GroupAggregate
                     Group Key: customer.c_custkey, nation.n_name
                     ->  Sort
                           Sort Key: customer.c_custkey, nation.n_name
                           ->  Hash Join
                                 Hash Cond: (customer.c_nationkey =
nation.n_nationkey)
                                 ->  Parallel Hash Join
                                       Hash Cond: (orders.o_custkey =
customer.c_custkey)
                                       ->  Nested Loop
                                             ->  Parallel Seq Scan on orders
                                                   Filter:
((o_orderdate >= '1993-10-01'::date) AND (o_orderdate < '1994-01-01
00:00:00'::timestamp without time zone))
                                             ->  Index Scan using
lineitem_l_orderkey_idx_l_returnflag on lineitem
                                                   Index Cond:
(l_orderkey = orders.o_orderkey)
                                       ->  Parallel Hash
                                             ->  Parallel Seq Scan on customer
                                 ->  Hash
                                       ->  Seq Scan on nation
 Generated Plan Advice:
   JOIN_ORDER(orders lineitem customer nation)
   NESTED_LOOP_PLAIN(lineitem)
   HASH_JOIN(customer nation)
   SEQ_SCAN(orders customer nation)
   INDEX_SCAN(lineitem public.lineitem_l_orderkey_idx_l_returnflag)
   GATHER_MERGE((customer orders lineitem nation))

but when I set the advice, it generates a wrong NL instead of the
expected Parallel HJ (so another way to fix it would be to simply
disable PQ, yuck):

 Sort
   Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric -
lineitem.l_discount)))) DESC
   ->  Finalize GroupAggregate
         Group Key: customer.c_custkey, nation.n_name
         ->  Gather Merge
               Workers Planned: 2
               ->  Partial GroupAggregate
                     Group Key: customer.c_custkey, nation.n_name
                     ->  Sort
                           Sort Key: customer.c_custkey, nation.n_name
                           ->  Nested Loop
                                 ->  Hash Join
                                       Hash Cond:
(customer.c_nationkey = nation.n_nationkey)
                                       ->  Parallel Hash Join
                                             Hash Cond:
(orders.o_custkey = customer.c_custkey)
                                             ->  Parallel Seq Scan on orders
                                                   Filter:
((o_orderdate >= '1993-10-01'::date) AND (o_orderdate < '1994-01-01
00:00:00'::timestamp without time zone))
                                             ->  Parallel Hash
                                                   ->  Parallel Seq
Scan on customer
                                       ->  Hash
                                             ->  Seq Scan on nation
                                 ->  Index Scan using
lineitem_l_orderkey_idx_l_returnflag on lineitem
                                       Index Cond: (l_orderkey =
orders.o_orderkey)
 Supplied Plan Advice:
   SEQ_SCAN(orders) /* matched */
   SEQ_SCAN(customer) /* matched */
   SEQ_SCAN(nation) /* matched */
   INDEX_SCAN(lineitem public.lineitem_l_orderkey_idx_l_returnflag) /*
matched */
   JOIN_ORDER(orders lineitem customer nation) /* matched,
conflicting, failed */
   NESTED_LOOP_PLAIN(lineitem) /* matched, conflicting */
   HASH_JOIN(customer) /* matched, conflicting */
   HASH_JOIN(nation) /* matched, conflicting */
   GATHER_MERGE((customer orders lineitem nation)) /* matched */

So to me it looks like in Generated Plan Advice we:
- have proper HASH_JOIN(customer nation)
- but it somehow forgot to include "HASH_JOIN(orders)" to cover for
that Parallel Hash Join on (orders.o_custkey = customer.c_custkey)
with input from NL. After adding that manually, it achieves the same
input plan properly.

Please let me know if I'm wrong, I was kind of thinking Parallel is
not fully supported, but README/tests seem to state otherwise.

-J.



Re: pg_plan_advice

From
Jakub Wartak
Date:
On Wed, Dec 17, 2025 at 2:44 PM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
>
> On Wed, Dec 17, 2025 at 11:12 AM Jakub Wartak
> <jakub.wartak@enterprisedb.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > Here's v7.
> > [..]
> >[..q20..]
>
> OK, now for the q10:

Hi, this is a follow-up just to the q10.

> So to me it looks like in Generated Plan Advice we:
> - have proper HASH_JOIN(customer nation)
> - but it somehow forgot to include "HASH_JOIN(orders)" to cover for
> that Parallel Hash Join on (orders.o_custkey = customer.c_custkey)
> with input from NL. After adding that manually, it achieves the same
> input plan properly.
[..]

Well, it's been quite a ride with Q10, and I was partially wrong above:

0. My earlier report of a wrongly missing "HASH_JOIN(orders customer)"
- that part was actually okay.
1. The Incremental Sort is used in the original plan, but is still not
reflected in the generated advice.
2a. I've noticed the Memoize/Index Scan was not being respected for "nation".
2b. A seq scan was being done for "nation" instead.

So here is the total modification list I've ended up with (+ for
adding, - for removing):

+ HASH_JOIN(orders customer) -- from earlier reply
+ NESTED_LOOP_MEMOIZE(nation)
+ INDEX_SCAN(nation public.pk_nation)
- HASH_JOIN(customer nation) -- since the original plan used a NL() there
SEQ_SCAN(orders customer nation) ==> SEQ_SCAN(orders customer)

In full the best shape seems to be Q10 with pg_plan_advice.advice =
'HASH_JOIN(orders customer) JOIN_ORDER(orders lineitem customer
nation)    NESTED_LOOP_PLAIN(lineitem)    SEQ_SCAN(orders customer)
INDEX_SCAN(lineitem public.lineitem_l_orderkey_idx_l_returnflag)
GATHER_MERGE((customer orders lineitem nation))
NESTED_LOOP_MEMOIZE(nation)';

which yields:
 Sort
   Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric -
lineitem.l_discount)))) DESC
   ->  GroupAggregate
         Group Key: customer.c_custkey, nation.n_name
         ->  Gather Merge
               Workers Planned: 2
               ->  Sort
                     Sort Key: customer.c_custkey, nation.n_name
                     ->  Nested Loop
                           ->  Parallel Hash Join
                                 Hash Cond: (orders.o_custkey =
customer.c_custkey)
                                 ->  Nested Loop
                                       ->  Parallel Seq Scan on orders
                                             Filter: ((o_orderdate >=
'1993-10-01'::date) AND (o_orderdate < '1994-01-01
00:00:00'::timestamp without time zone))
                                       ->  Index Scan using
lineitem_l_orderkey_idx_l_returnflag on lineitem
                                             Index Cond: (l_orderkey =
orders.o_orderkey)
                                 ->  Parallel Hash
                                       ->  Parallel Seq Scan on customer
                           ->  Memoize
                                 Cache Key: customer.c_nationkey
                                 Cache Mode: logical
                                 ->  Index Scan using pk_nation on nation
                                       Index Cond: (n_nationkey =
customer.c_nationkey)

but that Incremental Sort *is* still missing. In the original plan we are doing:
   Incremental Sort (Sort Key: customer.c_custkey, nation.n_name,
Presorted Key: customer.c_custkey)
   <-- .... Sort(Sort Key: customer.c_custkey)

However, even with my overrides I haven't found an immediately obvious
way to force it to use an Incremental Sort on a specific field, so it
just sorts on both keys at once. Maybe this is something that should
be expressed through GATHER_MERGE()? But it's not obvious how and
where. In terms of raw performance, the two seem very similar (98ms
+/- 8ms between them).

-J.



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Dec 17, 2025 at 5:12 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
>  Sort  (cost=1010985030.44..1010985030.59 rows=61 width=51)
>    Sort Key: supplier.s_name
>    ->  Nested Loop  (cost=0.42..1010985028.63 rows=61 width=51)
>          Join Filter: (nation.n_nationkey = supplier.s_nationkey)
>          ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
>                Filter: (n_name = 'CANADA'::bpchar)
>          ->  Nested Loop Semi Join  (cost=0.42..1010985008.29
> rows=1522 width=55)
>                Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
>                ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
>                ->  Materialize  (cost=0.42..1010755994.57 rows=1973 width=4)
>                      ->  Nested Loop  (cost=0.42..1010755984.71
> rows=1973 width=4)
>                            ->  Seq Scan on part  (cost=0.00..4842.25
> rows=1469 width=4)
>                                  Filter: ((p_name)::text ~~ 'forest%'::text)
>                            ->  Index Scan using pk_partsupp on
> partsupp  (cost=0.42..688053.87 rows=1 width=8)
>                                  Index Cond: (ps_partkey = part.p_partkey)
>                                  Filter: ((ps_availqty)::numeric >
> (SubPlan expr_1))
>                                  SubPlan expr_1
>                                    ->  Aggregate
> (cost=172009.42..172009.44 rows=1 width=32)
>                                          ->  Seq Scan on lineitem
> (cost=0.00..172009.42 rows=1 width=5)
>                                                Filter: ((l_shipdate >=
> '1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
> without time zone) AND (l_partkey = partsupp.ps_partkey) AND
> (l_suppkey = partsupp.ps_suppkey))
>
>
>  Generated Plan Advice:
>    JOIN_ORDER(nation (supplier (part partsupp)))
>    NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
>    NESTED_LOOP_MATERIALIZE(partsupp)
>    SEQ_SCAN(nation supplier part lineitem@expr_1)
>    INDEX_SCAN(partsupp public.pk_partsupp)
>    SEMIJOIN_NON_UNIQUE((partsupp part))
>    NO_GATHER(supplier nation partsupp part lineitem@expr_1)

Yeah, that's not right. There are three nested loops here, so we
should have three pieces of nested loop advice.
NESTED_LOOP_MATERIALIZE(partsupp) covers the innermost nested loop.
The other two are NESTED_LOOP_PLAIN, but the advice should cover all
the tables on the inner side of the join. I think it should read:

NESTED_LOOP_PLAIN((part partsupp) (supplier part partsupp))

Ordering isn't significant here, so NESTED_LOOP_PLAIN((part supplier
partsupp) (partsupp part)) would be logically equivalent. Doesn't
matter exactly what we output here, but it shouldn't be just partsupp.

> and apparently proper advice like below which has better yield:
>     set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)

This isn't quite what you want, because this says that part should be
on the outer side of a NESTED_LOOP_PLAIN by itself and partsupp should
also be on the outer side of a NESTED_LOOP_PLAIN by itself. You need
the extra set of parentheses to indicate that the join product of
those two tables should be on the outer side of a NESTED_LOOP_PLAIN,
rather than each table individually.

What must be happening here is that either pgpa_join.c (maybe with
complicity from pgpa_walker.c) is not populating the
pgpa_plan_walker_context's join_strategies[JSTRAT_NESTED_LOOP_PLAIN]
member correctly, or else pgpa_output.c is not serializing it to text
correctly. I suspect the former is more likely, but I'm not sure
exactly what's happening.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Dec 17, 2025 at 8:44 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> OK, now for the q10:
>
>  Sort
>    Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric -
> lineitem.l_discount)))) DESC
>    ->  Finalize GroupAggregate
>          Group Key: customer.c_custkey, nation.n_name
>          ->  Gather Merge
>                Workers Planned: 2
>                ->  Partial GroupAggregate
>                      Group Key: customer.c_custkey, nation.n_name
>                      ->  Sort
>                            Sort Key: customer.c_custkey, nation.n_name
>                            ->  Hash Join
>                                  Hash Cond: (customer.c_nationkey =
> nation.n_nationkey)
>                                  ->  Parallel Hash Join
>                                        Hash Cond: (orders.o_custkey =
> customer.c_custkey)
>                                        ->  Nested Loop
>                                              ->  Parallel Seq Scan on orders
>                                                    Filter:
> ((o_orderdate >= '1993-10-01'::date) AND (o_orderdate < '1994-01-01
> 00:00:00'::timestamp without time zone))
>                                              ->  Index Scan using
> lineitem_l_orderkey_idx_l_returnflag on lineitem
>                                                    Index Cond:
> (l_orderkey = orders.o_orderkey)
>                                        ->  Parallel Hash
>                                              ->  Parallel Seq Scan on customer
>                                  ->  Hash
>                                        ->  Seq Scan on nation
>  Generated Plan Advice:
>    JOIN_ORDER(orders lineitem customer nation)
>    NESTED_LOOP_PLAIN(lineitem)
>    HASH_JOIN(customer nation)
>    SEQ_SCAN(orders customer nation)
>    INDEX_SCAN(lineitem public.lineitem_l_orderkey_idx_l_returnflag)
>    GATHER_MERGE((customer orders lineitem nation))

This looks correct to me.

> but when I set the advice, it generates a wrong NL instead of the
> expected Parallel HJ (so another way to fix it would be to simply
> disable PQ, yuck):

This is obviously bad. I'm not quite sure what happened here, but my
guess is that something prevented the JOIN_ORDER advice from being
applied cleanly and then everything went downhill from there. I wonder
if JOIN_ORDER doesn't interact properly with incremental sorts --
that's a situation for which I don't think I have existing test
coverage.

> So to me it looks like in Generated Plan Advice we:
> - have proper HASH_JOIN(customer nation)
> - but it somehow forgot to include "HASH_JOIN(orders)" to cover for
> that Parallel Hash Join on (orders.o_custkey = customer.c_custkey)
> with input from NL. After adding that manually, it achieves the same
> input plan properly.

The first table in the JOIN_ORDER() specification isn't supposed to
have a join method specification, because the join method specifier
says what appears on the inner, i.e. second, arm of the join.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Haibo Yan
Date:
Hi Robert,

Thank you very much for your work on the pg_plan_advice patch series. It is an impressive and substantial contribution, and it seems like a meaningful step forward toward addressing long-standing query plan stability issues in PostgreSQL.

While reviewing the v7 patches, I noticed a few points that I wanted to raise for discussion:
1. GEQO interaction (patch 4):
Since GEQO relies on randomized search, is there a risk that the optimizer may fail to explore the specific join order or path that is being enforced by the advice mask? In that case, could this lead to failures such as inability to construct the required join relation or excessive planning time if the desired path is not sampled?
2. Parallel query serialization (patches 1–3):
Several new fields (subrtinfos, elidedNodes, child_append_relid_sets) are added to PlannedStmt, but I did not see corresponding changes in outfuncs.c / readfuncs.c. Without serialization support, parallel workers executing subplans or Append nodes may not receive this metadata. Is this handled elsewhere, or is it something still pending?
3. Alias handling when generating advice (patch 5):
In pgpa_output_relation_name, the advice string is generated using get_rel_name(relid), which resolves to the underlying table name rather than the RTE alias. In self-join cases this could be ambiguous (e.g., two references to my_table under different aliases). Would it be more appropriate to use the RTE alias when available?
4. Minor typo (patch 4):
In src/include/nodes/relation.h, parititonwise appears to be a typo and should likely be partitionwise.

I hope these comments are helpful, and I apologize in advance if any of this is already addressed elsewhere in the series.

Best regards,
Haibo


Re: pg_plan_advice

From
Lukas Fittl
Date:
On Mon, Dec 15, 2025 at 12:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Here's v7.

I'm excited about this patch series, and in an effort to help land the infrastructure, here is a review of 0001 - 0003 to start:

For 0001, I'm not sure the following comment is correct:

> /* When recursing = true, it's an unplanned or dummy subquery. */
> rtinfo->dummy = recursing;

Later in that function we only recurse if it's a dummy subquery - in the case of an unplanned subquery (rel->subroot == NULL)
add_rtes_to_flat_rtable won't be called again (instead the relation RTEs are directly added to the finalrtable). Maybe we can
clarify that comment as "When recursing = true, it's a dummy subquery or its children.".

From my medium-level understanding of the planner, I don't think the lack of tracking unplanned subqueries
in subrtinfos is a problem, at least for the pg_plan_advice type use cases.

---

For 0002:

It might be helpful to clarify in a comment that ElidedNode's plan_node_id represents the surviving node, not that of the elided node.

I also noticed that this currently doesn't support cases where multiple nodes are elided, e.g. with multi-level table partitioning:

CREATE TABLE pt (l1 date, l2 text) PARTITION BY RANGE (l1);
CREATE TABLE pt_202512 PARTITION OF pt FOR VALUES FROM ('2025-12-01') TO ('2026-01-01') PARTITION BY LIST (l2);
CREATE TABLE pt_202512_TEST PARTITION OF pt_202512 FOR VALUES IN ('TEST');

EXPLAIN (RANGE_TABLE) SELECT * FROM pt WHERE l1 = '2025-12-15' AND l2 = 'TEST';

                            QUERY PLAN                            
-------------------------------------------------------------------
 Seq Scan on pt_202512_test pt  (cost=0.00..29.05 rows=1 width=36)
   Filter: ((l1 = '2025-12-15'::date) AND (l2 = 'TEST'::text))
   Scan RTI: 3
   Elided Node Type: Append
   Elided Node RTIs: 1    <=== This is missing RTI 2
 RTI 1 (relation, inherited, in-from-clause):
   Relation: pt
 RTI 2 (relation, inherited, in-from-clause):
   Relation: pt_202512
 RTI 3 (relation, in-from-clause):
   Relation: pt_202512_test
 Unprunable RTIs: 1 2 3

In a quick test, adding child_append_relid_sets (from 0003) to the relids being passed to record_elided_node fixes
that. Presumably the case of partitionwise join relids doesn't matter, because that would prevent it being elided.

---

For 0003:

I also find the "cars" variable suffix a bit hard to understand, but I'm not sure a comment next to the variables is that useful.
Separately, the noise generated by all the additional "_cars" variables isn't great.

I wonder a little bit if we couldn't introduce a better abstraction here, e.g. a struct "AppendPathInput" that contains the
two related lists, and gets populated by accumulate_append_subpath/get_singleton_append_subpath and then
passed to create_append_path as a single argument.

---

Note that 0005 needs a rebase, since 48d4a1423d2e92d10077365532d92e059ba2eb2e changed the GetNamedDSMSegment API.

You may also want to move the CF entry to the PG19-4 commitfest so CFbot runs again.

Thanks,
Lukas
 
--
Lukas Fittl

Re: pg_plan_advice

From
Robert Haas
Date:
On Mon, Dec 29, 2025 at 8:15 PM Lukas Fittl <lukas@fittl.com> wrote:
> For 0001, I'm not sure the following comment is correct:
>
> > /* When recursing = true, it's an unplanned or dummy subquery. */
> > rtinfo->dummy = recursing;
>
> Later in that function we only recurse if it's a dummy subquery - in the case of an unplanned subquery (rel->subroot == NULL)
> add_rtes_to_flat_rtable won't be called again (instead the relation RTEs are directly added to the finalrtable). Maybe we can
> clarify that comment as "When recursing = true, it's a dummy subquery or its children.".

Presumably, a child of an unplanned or dummy subquery will also be
unplanned or dummy, so I'm not sure I understand the need to clarify
here.

> From my medium-level understanding of the planner, I don't think the lack of tracking unplanned subqueries
> in subrtinfos is a problem, at least for the pg_plan_advice type use cases.

I don't think so, either. I believe that anything that falls into this
category is something that is not actually going to be reflected in
the final plan tree, but we can't lose track of it completely because
it can matter for purposes like locking or invalidation. Since plan
advice only targets things that appear in the final plan tree, it
shouldn't care. If we did want to care, e.g. to emit advice like
WE_ARE_EXPECTING_THIS_TO_BE_DUMMY(whatever_table), we'd need more than
an rtoffset per subquery; we'd have to map each RTI individually,
because some RTIs are tossed completely in the "dummy" case,
meaning that the rtoffset isn't constant for the whole subquery.
AFAICT, this is not an issue because we need not care about the dummy
subqueries at all. The only reason I included the SubPlanRTInfo at all
for this case is that the previous SubPlanRTInfo might be for a
non-dummy subquery, and some code might want to look at the next entry
in the list to see where the portion of the range table belonging to
that previous subquery ends. This lets you do that.

> For 0002:
>
> It might be helpful to clarify in a comment that ElidedNode's plan_node_id is that of the surviving node, not that of
> the elided node.

Good point. I'll add this comment:

+ *
+ * plan_node_id is that of the surviving plan node, the sole child of the
+ * one which was elided.

> I also noticed that this currently doesn't support cases where multiple nodes are elided, e.g. with multi-level table
partitioning:
>
> CREATE TABLE pt (l1 date, l2 text) PARTITION BY RANGE (l1);
> CREATE TABLE pt_202512 PARTITION OF pt FOR VALUES FROM ('2025-12-01') TO ('2026-01-01') PARTITION BY LIST (l2);
> CREATE TABLE pt_202512_TEST PARTITION OF pt_202512 FOR VALUES IN ('TEST');
>
> EXPLAIN (RANGE_TABLE) SELECT * FROM pt WHERE l1 = '2025-12-15' AND l2 = 'TEST';
>
>                             QUERY PLAN
> -------------------------------------------------------------------
>  Seq Scan on pt_202512_test pt  (cost=0.00..29.05 rows=1 width=36)
>    Filter: ((l1 = '2025-12-15'::date) AND (l2 = 'TEST'::text))
>    Scan RTI: 3
>    Elided Node Type: Append
>    Elided Node RTIs: 1    <=== This is missing RTI 2
>  RTI 1 (relation, inherited, in-from-clause):
>    Relation: pt
>  RTI 2 (relation, inherited, in-from-clause):
>    Relation: pt_202512
>  RTI 3 (relation, in-from-clause):
>    Relation: pt_202512_test
>  Unprunable RTIs: 1 2 3
>
> In a quick test, adding child_append_relid_sets (from 0003) to the relids being passed to record_elided_node fixes
> that. Presumably the case of partitionwise join relids doesn't matter, because that would prevent it being elided.

I'm not really sure there's a problem here. We definitely do not want
to end up with something like "Elided Node RTIs: 1 2". What I've found
experimentally is that it's often important to preserve relid sets,
but you need to preserve them as sets, not individually. So there
could be an argument that we somehow want to preserve both {1} and {2}
here, but that's not equivalent to {1,2}, which looks like a
partitionwise join between relid 1 and relid 2. But it isn't
especially clear to me that we actually need to preserve RTI 2 here.
One reason why preserving RTIs is important is so that as we descend a
join tree, we can find the RTIs that the optimizer thought it was
joining, but that only requires finding RTI 1, not RTI 2. Another
reason why preserving RTIs is important is so that we can use relation
identifiers to describe planning decisions made with respect to those
RTIs, but that doesn't apply here because partition expansion just
always happens.

Of course, it's quite possible that there are reasons unrelated to
this patch set why this information would be good to preserve, but if
we want to do it, we're going to have to adjust the data
representation somehow. We'd either need to give the ElidedNode a
"cars" representation instead of a single RTI set, or we'd need to
have some separate way of representing this. I hesitate a little bit
to design something without a use case in mind, but maybe you have
one?

> For 0003:
>
> I also find the "cars" variable suffix a bit hard to understand, but not sure a comment next to the variables is that
> useful.
> Separately, the noise generated by all the additional "_cars" variables isn't great.
>
> I wonder a little bit if we couldn't introduce a better abstraction here, e.g. a struct "AppendPathInput" that contains the
> two related lists, and gets populated by accumulate_append_subpath/get_singleton_append_subpath and then
> passed to create_append_path as a single argument.

I spent some time thinking about this today and haven't been quite able
to come up with something that I like. The problem is that
pa_partial_subpaths and pa_nonpartial_subpaths share a single
child_append_relid_sets variable, namely pa_subpath_cars, and
accumulate_append_subpath gets called with that as the last argument
and different things for the previous two. One thing I tried was
making the AppendPathInput struct contain three lists rather than two,
but then accumulate_append_subpath() needs an argument that makes it
work in one of three different modes:

Mode 1: normal -- add everything to the "normal" list
Mode 2: building parallel-aware append with partial path -- add things
to the "normal" list except for parallel-aware appends which need to
be split between the normal and special lists
Mode 3: building parallel-aware append with non-partial path -- add
things to the "special" list

I also tried splitting up accumulate_append_subpath() into two
functions, thinking that maybe I could segregate the parallel-append
handling from the more normal cases. This seems somewhat appealing in
the sense that having accumulate_append_subpath() hold a bunch of
extra logic that only one call site needs isn't very nice, but
changing it doesn't really seem to help with the problem that we have
two subpath lists sharing one cars list in this case. I'll try to find
some more time to think about this, but if you have any ideas
meanwhile, I'd be happy to hear them.

> Note that 0005 needs a rebase, since 48d4a1423d2e92d10077365532d92e059ba2eb2e changed the GetNamedDSMSegment API.

I'll fix this. I don't love the way that commit made the callback and
the callback arg non-consecutive arguments.

> You may also want to move the CF entry to the PG19-4 commitfest so CFbot runs again.

It seems that I cannot move it to 19-4. I moved it to 19-Final.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Mon, Dec 29, 2025 at 6:34 PM Haibo Yan <tristan.yim@gmail.com> wrote:
> 1. GEQO interaction (patch 4):
> Since GEQO relies on randomized search, is there a risk that the optimizer may fail to explore the specific join
> order or path that is being enforced by the advice mask? In that case, could this lead to failures such as inability to
> construct the required join relation or excessive planning time if the desired path is not sampled?

The interaction of this feature with GEQO definitely needs more study.
If you have some time to work on this, I think testing and reporting
results would be quite useful. However, I don't think we should ever
get planner failure, and I'm doubtful about excessive planning time as
well. The effect of plan advice is to disable some paths just as if
enable_<whatever> were set to false, so if you provide very specific
advice while planning with GEQO, I think you might just end up with a
disabled path that doesn't account for the advice. However, this
should be checked, and I haven't gotten there yet. I'll add an XXX to
the README to make sure this doesn't get forgotten.

> 2. Parallel query serialization (patches 1–3):
> Several new fields (subrtinfos, elidedNodes, child_append_relid_sets) are added to PlannedStmt, but I did not see
> corresponding changes in outfuncs.c / readfuncs.c. Without serialization support, parallel workers executing subplans or
> Append nodes may not receive this metadata. Is this handled elsewhere, or is it something still pending?

I believe that gen_node_support.pl should take care of this
automatically unless the node type is flagged as
pg_node_attr(custom_read_write).

> 3. Alias handling when generating advice (patch 5):
> In pgpa_output_relation_name, the advice string is generated using get_rel_name(relid), which resolves to the
> underlying table name rather than the RTE alias. In self-join cases this could be ambiguous (e.g., my_table vs
> my_table). Would it be more appropriate to use the RTE alias when available?

No. That function is only used for indexes.

> 4. Minor typo (patch 4):
> In src/include/nodes/relation.h, parititonwise appears to be a typo and should likely be partitionwise.

Will fix, thanks.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Lukas Fittl
Date:
On Mon, Dec 29, 2025 at 5:15 PM Lukas Fittl <lukas@fittl.com> wrote:
On Mon, Dec 15, 2025 at 12:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Here's v7.

I'm excited about this patch series, and in an effort to help land the infrastructure, here is a review of 0001 - 0003 to start:

For the review of 0004, I decided to spend a few days to test if the plan generation strategy logic will
work for pg_hint_plan, as an existing extension in the ecosystem that is widely used, but functions today
by virtue of modifying planner GUCs whilst the planner is running.

First of all, as is expected, the extension completely stops working with 0004 in place - because we now
only read the GUCs at planner start, the mechanism of modifying them in the middle doesn't work. I don't
think we can avoid this.

That said, good news: After a bunch of iterations, I get a clean pass on the pg_hint_plan regression tests,
whilst completely dropping its copying of core code and hackish re-run of set_plain_rel_pathlist. See [0]
for a draft PR (on my own fork of pg_hint_plan) with individual patches that explain some regression test
differences.

Adding Michael in CC, since he's been thankfully maintaining pg_hint_plan over the years, and I think if
0004 gets merged that should significantly reduce the maintenance burden, independently of what happens
with pg_plan_advice - so his input would be useful here.

The biggest change in the regression test output was due to how the "Parallel" hint worked in pg_hint_plan
(basically it was setting parallel_*_cost to zero, and then messed with the gucs that factor into
compute_parallel_worker) -- I think the only sensible thing to do is to change that in pg_hint_plan, and
instead rely on rejecting non-partial paths with PGS_CONSIDER_NONPARTIAL if "hard" enforcement of
parallelism is requested. That caused some minor plan changes, but I think they can still be argued to be
matching the user's intent of "make a scan involving this relation parallel".

There were two bugs in 0004 that I had to fix to make this work:

In cost_index, we are checking "path->path.parallel_workers == 0", but parallel_workers only gets
set later in the function, causing the PGS_CONSIDER_NONPARTIAL mask to not be applied. Replacing
this with checking the "partial_path" argument instead makes it work.

In cost_samplescan, we set the PGS_CONSIDER_NONPARTIAL mask if it's a non-partial path, but that
causes Sample Scans to always be disabled when setting PGS_CONSIDER_NONPARTIAL on the
relation. I think we could simply drop that check, since we never generate partial sample scan paths.

Otherwise 0004 looks good to me, and the mechanism of working with mask values felt natural to me,
especially in contrast with existing ways to achieve something similar. I did not test partitionwise joins,
since pg_hint_plan doesn't cover them today.
Lukas

--
Lukas Fittl

Re: pg_plan_advice

From
Haibo Yan
Date:
>> 1. GEQO interaction (patch 4):
>> Since GEQO relies on randomized search, is there a risk that the optimizer may fail to explore the specific join order or path that is being enforced by the advice mask? In that case, could this lead to failures such as inability to construct the required join relation or excessive planning time if the desired path is not sampled?

> The interaction of this feature with GEQO definitely needs more study.
> If you have some time to work on this, I think testing and reporting
> results would be quite useful. However, I don't think we should ever
> get planner failure, and I'm doubtful about excessive planning time as
> well. The effect of plan advice is to disable some paths just as if
> enable_<whatever> were set to false, so if you provide very specific
> advice while planning with GEQO, I think you might just end up with a
> disabled path that doesn't account for the advice. However, this
> should be checked, and I haven't gotten there yet. I'll add an XXX to
> the README to make sure this doesn't get forgotten.

I conducted extensive tests today using randomized advice strings to challenge pg_plan_advice under GEQO pressure. The results strongly support your hypothesis: for standard Left-Deep trees (which GEQO natively supports), the interaction is stable and efficient.

I executed a stress test involving 100,000 iterations (100 random join structures x 1000 random seeds). The planning time remained low, and no planning failures occurred for valid topology advice.

Observation on Bushy Plans: I did identify one anomaly regarding "Bushy Plans" (e.g., ((t1 t2) (t3 t4))). Since PostgreSQL's GEQO implementation is strictly Left-Deep and cannot generate Bushy trees, if a user manually forges a Bushy Plan advice:

It does not cause a planner crash (e.g., "failed to construct join relation").

Instead, the planner seems to silently ignore the structural constraint of the advice and falls back to a path GEQO can actually find.

I believe this behavior is acceptable because pg_plan_advice is intended to stabilize plans that the optimizer can generate. Since GEQO cannot generate Bushy plans, users should not be supplying them.

Script
-----------------------------------------------------------------------------------
/*
 * GEQO Stress Test for pg_plan_advice
 * -----------------------------------
 * Methodology:
 * 1. Generates 100 random "Left-Deep" join topologies (t1 joining t2..t100 in random orders).
 * 2. This simulates valid advice that GEQO is capable of producing.
 * 3. For each topology, runs 1000 iterations with random GEQO seeds.
 * 4. Measures success rate and planning time overhead.
 */
DO $$
DECLARE
    v_jo       TEXT;
    v_jo_rest  TEXT;
    v_nl       TEXT;
    v_scan     TEXT;
    v_ng       TEXT;
    v_adv      TEXT;
    v_sql      TEXT;
    v_seed     FLOAT;
    v_ok       INT := 0;
    v_err      INT := 0;
    v_msg      TEXT;
    k          INT;
    i          INT;
    j          INT;
    v_ts1      timestamp;
    v_ts2      timestamp;
    v_cur_ms   numeric;
    v_total_ms numeric := 0;
    v_max_ms   numeric := 0;
BEGIN
    -- Pre-generate static parts of the advice to save time
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_nl   FROM generate_series(2,100) n;
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_scan FROM generate_series(1,100) n;
    SELECT string_agg('t'||n, ' ' ORDER BY n) INTO v_ng   FROM generate_series(1,100) n;

    -- Construct the SQL: t1 JOIN t2 JOIN t3 ... JOIN t100
    v_sql := 'EXPLAIN (COSTS OFF) SELECT count(*) FROM t1';
    FOR j IN 2..100 LOOP
        v_sql := v_sql || ' JOIN t' || j || ' ON t1.id=t' || j || '.id';
    END LOOP;

    -- Configure GEQO for stress testing (force it ON, low effort/pool)
    PERFORM set_config('geqo', 'on', false);
    PERFORM set_config('geqo_threshold', '12', false);
    PERFORM set_config('geqo_effort', '1', false);
    PERFORM set_config('geqo_pool_size', '0', false);

    RAISE NOTICE 'Starting Stress Test: 100 Outer Loops (Random Plans) x 1000 Inner Loops (Random Seeds)...';

    -- Outer Loop: Generate 100 different valid Advice structures
    FOR k IN 1..100 LOOP
        -- Randomize the join order of t2..t100 to simulate different Left-Deep trees
        SELECT string_agg('t'||n, ' ' ORDER BY random()) INTO v_jo_rest FROM generate_series(2,100) n;
        v_jo := 't1 ' || v_jo_rest;

        v_adv := 'JOIN_ORDER(' || v_jo || ') ' ||
                 'NESTED_LOOP_PLAIN(' || v_nl || ') ' ||
                 'SEQ_SCAN(' || v_scan || ') ' ||
                 'NO_GATHER(' || v_ng || ')';

        PERFORM set_config('pg_plan_advice.advice', v_adv, false);

        -- Inner Loop: Test the specific advice against 1000 random GEQO seeds
        FOR i IN 1..1000 LOOP
            v_seed := random();
            PERFORM set_config('geqo_seed', v_seed::text, false);

            BEGIN
                v_ts1 := clock_timestamp();
               
                EXECUTE v_sql;
               
                v_ts2 := clock_timestamp();

                v_cur_ms := EXTRACT(EPOCH FROM (v_ts2 - v_ts1)) * 1000;
                v_total_ms := v_total_ms + v_cur_ms;
               
                IF v_cur_ms > v_max_ms THEN
                    v_max_ms := v_cur_ms;
                END IF;

                v_ok := v_ok + 1;

            EXCEPTION WHEN OTHERS THEN
                GET STACKED DIAGNOSTICS v_msg = MESSAGE_TEXT;
                v_err := v_err + 1;
                RAISE WARNING 'Outer % / Inner % Crashed! Seed: %, Err: %', k, i, v_seed, v_msg;
            END;
        END LOOP;
       
        RAISE NOTICE 'Batch %/100 completed.', k;
    END LOOP;

    RAISE NOTICE '---------------------------';
    RAISE NOTICE 'Total Scenarios: 100,000';
    RAISE NOTICE 'Success:         %', v_ok;
    RAISE NOTICE 'Failed:          %', v_err;
    RAISE NOTICE 'Total Time:      % ms', round(v_total_ms, 2);
    RAISE NOTICE 'Avg Time:        % ms', round(v_total_ms / (v_ok + v_err + 0.0001), 2);
    RAISE NOTICE 'Max Time:        % ms', round(v_max_ms, 2);
    RAISE NOTICE '---------------------------';

    IF v_err > 0 THEN
        RAISE NOTICE 'CONCLUSION: Conflict found.';
    ELSE
        RAISE NOTICE 'CONCLUSION: No errors found.';
    END IF;
END $$;
-----------------------------------------------------------------------------------



Re: pg_plan_advice

From
Jacob Champion
Date:
On Mon, Dec 15, 2025 at 12:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> - Add gather((d d/d.d)) test case, per Jacob, and fix the related bug
> in pgpa_identifier_matches_target, per Jacob's email.

I think this fix affected the ability to omit the partition schema in
advice strings. Attached is a quick test that shows it on my machine,
plus an attempted fix that mashes together the v6 and v7 approaches.
(I have diffs in my generated plan advice compared to what's in
v7-0005, so I'm not sure if my .out file is correct.)

--Jacob

Attachments

Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Jan 7, 2026 at 2:04 AM Lukas Fittl <lukas@fittl.com> wrote:
> That said, good news: After a bunch of iterations, I get a clean pass on the pg_hint_plan regression tests,
> whilst completely dropping its copying of core code and hackish re-run of set_plain_rel_pathlist. See [0]
> for a draft PR (on my own fork of pg_hint_plan) with individual patches that explain some regression test
> differences.

That sounds AWESOME.

> The biggest change in the regression test output was due to how the "Parallel" hint worked in pg_hint_plan
> (basically it was setting parallel_*_cost to zero, and then messed with the gucs that factor into
> compute_parallel_worker) -- I think the only sensible thing to do is to change that in pg_hint_plan, and
> instead rely on rejecting non-partial paths with PGS_CONSIDER_NONPARTIAL if "hard" enforcement of
> parallelism is requested. That caused some minor plan changes, but I think they can still be argued to be
> matching the user's intent of "make a scan involving this relation parallel".

Cool. I'm sort of curious what changed, but maybe it's not important
enough to spend time discussing right now.

> There were two bugs in 0004 that I had to fix to make this work:
>
> In cost_index, we are checking "path->path.parallel_workers == 0", but parallel_workers only gets
> set later in the function, causing the PGS_CONSIDER_NONPARTIAL mask to not be applied. Replacing
> this with checking the "partial_path" argument instead makes it work.

I agree that this is a bug. I'm thinking this might be the appropriate fix:

     enable_mask = (indexonly ? PGS_INDEXONLYSCAN : PGS_INDEXSCAN)
-        | (path->path.parallel_workers == 0 ? PGS_CONSIDER_NONPARTIAL : 0);
+        | (partial_path ? 0 : PGS_CONSIDER_NONPARTIAL);

> In cost_samplescan, we set the PGS_CONSIDER_NONPARTIAL mask if it's a non-partial path, but that
> causes Sample Scans to always be disabled when setting PGS_CONSIDER_NONPARTIAL on the
> relation. I think we could simply drop that check, since we never generate partial sample scan paths.

This one is less obvious to me. I mean, if PGS_CONSIDER_NONPARTIAL
lets us consider non-partial plans, and a sample scan is a non-partial
plan, then shouldn't the flag need to be set in order for us to
consider it? If not, maybe we need to rethink the name or the
semantics of that bit in some way.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Lukas Fittl
Date:
On Thu, Jan 8, 2026 at 8:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > The biggest change in the regression test output was due to how the "Parallel" hint worked in pg_hint_plan
> > (basically it was setting parallel_*_cost to zero, and then messed with the gucs that factor into
> > compute_parallel_worker) -- I think the only sensible thing to do is to change that in pg_hint_plan, and
> > instead rely on rejecting non-partial paths with PGS_CONSIDER_NONPARTIAL if "hard" enforcement of
> > parallelism is requested. That caused some minor plan changes, but I think they can still be argued to be
> > matching the user's intent of "make a scan involving this relation parallel".
>
> Cool. I'm sort of curious what changed, but maybe it's not important
> enough to spend time discussing right now.

In my assessment there were roughly four kinds of changes on the
regression test outputs for pg_hint_plan:

1) The order of operations (e.g. what's the inner/outer rel in a
parallel plan) - I think that's probably caused by the previous zero
cost confusing the planner
2) The placement of the Gather node in the case of joins (it's now on
top of the join in more cases, vs below it) - I think that's also
caused by the previous zero cost logic
3) Parallelism wasn't enforced before, but is enforced now (e.g.
sequential scans on empty relations didn't get a Gather node
previously)
4) Worker count changed because rel_parallel_workers is still limited
by max_parallel_workers_per_gather, and so my patched version now sets
that for the whole query to the highest of the workers requested in
Parallel hints (vs before it was able to keep that to just what a plan
node needed)

> > There were two bugs in 0004 that I had to fix to make this work:
> >
> > In cost_index, we are checking "path->path.parallel_workers == 0", but parallel_workers only gets
> > set later in the function, causing the PGS_CONSIDER_NONPARTIAL mask to not be applied. Replacing
> > this with checking the "partial_path" argument instead makes it work.
>
> I agree that this is a bug. I'm thinking this might be the appropriate fix:
>
>      enable_mask = (indexonly ? PGS_INDEXONLYSCAN : PGS_INDEXSCAN)
> -        | (path->path.parallel_workers == 0 ? PGS_CONSIDER_NONPARTIAL : 0);
> +        | (partial_path ? 0 : PGS_CONSIDER_NONPARTIAL);

Yup, that's exactly the change I had locally that worked as expected.

> > In cost_samplescan, we set the PGS_CONSIDER_NONPARTIAL mask if it's a non-partial path, but that
> > causes Sample Scans to always be disabled when setting PGS_CONSIDER_NONPARTIAL on the
> > relation. I think we could simply drop that check, since we never generate partial sample scan paths.
>
> This one is less obvious to me. I mean, if PGS_CONSIDER_NONPARTIAL
> lets us consider non-partial plans, and a sample scan is a non-partial
> plan, then shouldn't the flag need to be set in order for us to
> consider it? If not, maybe we need to rethink the name or the
> semantics of that bit in some way.

Yeah, I would agree with you that this is inconsistent with the flag's
name - but on the flip side, it's difficult for the caller to conditionally
set the flag (which you'd have to do to avoid a "Disabled" showing in
the plan), since we're setting it on the RelOptInfo (do we know if the
scan is a sample scan at that point?).

I wonder if it's worth considering inverting the flag, i.e.
"PGS_PREFER_PARTIAL" - that way it's clear that in case of paths that
can't be partial, we use the non-partial version without a penalty.

PS: I realized my prior mails were not plain text and had bad word
wrap (thanks Gmail-based workflow!) - hopefully this one is better :)

Thanks,
Lukas

--
Lukas Fittl



Re: pg_plan_advice

From
Robert Haas
Date:
On Thu, Jan 8, 2026 at 11:31 AM Lukas Fittl <lukas@fittl.com> wrote:
> Yeah, I would agree with you that is inconsistent with the flag's name
> - but on the flip side, its difficult for the caller to conditionally
> set the flag (which you'd have to do to avoid a "Disabled" showing in
> the plan), since we're setting it on the RelOptInfo (do we know if the
> scan is a sample scan at that point?).

How about checking rte->tablesample, as set_rel_pathlist does?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Jan 7, 2026 at 6:11 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> On Mon, Dec 15, 2025 at 12:06 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > - Add gather((d d/d.d)) test case, per Jacob, and fix the related bug
> > in pgpa_identifier_matches_target, per Jacob's email.
>
> I think this fix affected the ability to omit the partition schema in
> advice strings. Attached is a quick test that shows it on my machine,
> plus an attempted fix that mashes together the v6 and v7 approaches.
> (I have diffs in my generated plan advice compared to what's in
> v7-0005, so I'm not sure if my .out file is correct.)

Right, so I also am seeing some instability in the plans for some of
the test cases. I've attempted to address that in v8 by making the
three partitioned tables have unequal row counts. I also adapted your
test case and adopted your code fix. Unfortunately, I haven't been
able to respond to everyone's reports about v7 yet, partly because of
Christmas vacation, partly because of the number of reports, and
partly because, uh, I'm slow I guess. But here's the list of things
that have changed:

0001, 0003: No changes.

0002: Comment update.

0004: Comment update, bug fix to cost_index() per comment from Lukas.

- Added an XXX to the README to highlight the need for more GEQO investigation.
- Regression test expected output changes.
- Make partitionwise_join test tables different sizes in the hopes of
stabilizing test results.
- Add test case for forcing partitionwise join order with and without
a schema specification.
- Add new GUC pg_plan_advice.trace_mask because I've written similar
debugging code too many times already.
- Replace reference to inner_beneath_any_gather[i] with
inner_beneath_any_gather[k] per comment from Jakub.
- Add logic to pgpa_planner_apply_join_path_advice taking into account
that PARTITIONWISE() implicitly constrains the join order.
- Rebased.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From
Lukas Fittl
Date:
On Thu, Jan 8, 2026 at 10:22 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 0004: Comment update, bug fix to cost_index() per comment from Lukas.

Thanks! I've tested this and this works as expected on current master
with the updated pg_hint_plan code, and checking for tablesample in
the code that sets the mask.

On Thu, Jan 8, 2026 at 8:38 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jan 8, 2026 at 11:31 AM Lukas Fittl <lukas@fittl.com> wrote:
> > Yeah, I would agree with you that is inconsistent with the flag's name
> > - but on the flip side, its difficult for the caller to conditionally
> > set the flag (which you'd have to do to avoid a "Disabled" showing in
> > the plan), since we're setting it on the RelOptInfo (do we know if the
> > scan is a sample scan at that point?).
>
> How about checking rte->tablesample, as set_rel_pathlist does?

Yeah, that works - it's a bit inconvenient for two reasons, but I don't
think that warrants a redesign:

1) get_relation_info_hook doesn't get a RangeTblEntry passed (like
set_rel_pathlist_hook), but that's solvable by looking it up via
simple_rte_array
2) It requires maintaining a special case in the extension's "make it
parallel" logic, rather than the planner being authoritative (i.e. if
we add more special cases in the future, each extension will have to
be updated to reflect them)

Thanks,
Lukas

--
Lukas Fittl



Re: pg_plan_advice

From
Robert Haas
Date:
On Thu, Jan 8, 2026 at 2:14 PM Lukas Fittl <lukas@fittl.com> wrote:
> On Thu, Jan 8, 2026 at 10:22 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > 0004: Comment update, bug fix to cost_index() per comment from Lukas.
>
> Thanks! I've tested this and it works as expected on current master
> with the updated pg_hint_plan code, which now checks for tablesample
> in the code that sets the mask.

Nice!

> > How about checking rte->tablesample, as set_rel_pathlist does?
>
> Yeah, that works - it's a bit inconvenient for two reasons, but I don't
> think that warrants a redesign:
>
> 1) get_relation_info_hook doesn't get a RangeTblEntry passed (like
> set_rel_pathlist_hook), but that's solvable by looking it up via
> simple_rte_array

That's a pretty normal thing to have to do in this kind of code, IMHO.

> 2) It requires maintaining a special case in the logic that says "make
> it parallel", vs the planner that is authoritative (i.e. if we add
> more special cases in the future, each extension will have to be
> updated to reflect that)

IMHO, this is a policy decision by the extension and so the logic
belongs in the extension. pg_plan_advice has no similar exception,
because it made a different policy decision.

Overall, I feel that your experiment here is a pretty compelling
argument for committing 0004 (the pgs_mask stuff). I'm not sure it's
quite time to do that yet, because maybe there are still some design
changes we want to consider for it, or maybe somebody else wants to
review first. But even if that patch did nothing other than get rid of
lots of complexity and copy-paste in pg_hint_plan, that would be
enough to justify this infrastructure. The fact that basically works
as intended for both pg_plan_advice and pg_hint_plan, despite them
having no common code, is a really good sign, IMHO.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Thu, Jan 8, 2026 at 11:31 AM Lukas Fittl <lukas@fittl.com> wrote:
> In my assessment there were roughly four kinds of changes on the
> regression test outputs for pg_hint_plan:
>
> 1) The order of operations (e.g. what's the inner/outer rel in a
> parallel plan) - I think that's probably caused by the previous zero
> cost confusing the planner
> 2) The placement of the Gather node in the case of joins (its now on
> top of the join in more cases, vs below it) - I think that's also
> caused by the previous zero cost logic
> 3) Parallelism wasn't enforced before, but is enforced now (e.g.
> sequential scans on empty relations didn't get a Gather node
> previously)
> 4) Worker count changed because rel_parallel_workers is still limited
> by max_parallel_workers_per_gather, and so my patched version now sets
> that for the whole query to the highest of the workers requested in
> Parallel hints (vs before it was able to keep that to just what a plan
> node needed)

Ideally, if you say "hey, I want to use parallel query here," you
would get the best plan that the planner knows how to construct that
happens to use parallelism. #2 sounds to me like you weren't really
getting that, because you were encouraging parallelism via costing
rather than disabling non-parallel paths. #3 also sounds that way, because you
asked for parallelism and didn't get it. So I would tend to view those
changes as improvements.

The others are a little trickier. I don't think I quite understand
what this patch changed with respect to #4. I'm guessing that maybe
what's happening here is that this isn't really due to anything in the
patch set, but is rather due to relying on pgs_mask rather than
copy-pasting lots of code, which maybe incidentally removed the
ability to tweak something that pg_hint_plan was previously tweaking.
Maybe you want to think about writing a patch to go with 0004 that
addresses that gap specifically?

I'm actually kind of surprised that you didn't run into a similar
problem with the Rows() hint, for which 0004 also doesn't provide
infrastructure. If there's no problem, cool, but if there's a problem
there you haven't detected yet, maybe we should try to plug that gap,
too.

I don't quite know what to say about #1. If your theory about it being
due to the zero costs is correct, that's another example of the
current implementation not really being able to achieve the ideal
behavior, and the patch making it better. If there's something else
going on there, then I don't know.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Robert Haas
Date:
On Wed, Jan 7, 2026 at 5:47 PM Haibo Yan <tristan.yim@gmail.com> wrote:
> Instead, the planner seems to silently ignore the structural constraint of the advice and falls back to a path GEQO can actually find.

Does the plan end up disabled in that case?

> I believe this behavior is acceptable because pg_plan_advice is intended to stabilize plans that the optimizer can generate. Since GEQO cannot generate Bushy plans, users should not be supplying them.

Right, I agree. A core principle here is that you can only nudge the
planner towards a plan it would have considered anyway. In the case of
GEQO, there is some randomness to which plans are considered. Your
advice will only be reliably taken into account if it applies to
elements that must be part of the final plan. For instance, if you
advise the use of a sequential scan or an index scan, that should
work, because that relation has to be scanned somehow. Advice on a
join method should almost always work, since it can apply to any
non-leading table. Of course, you also won't be able to advise an
infeasible join method, but that would be true without GEQO, too.
Advice on the join order is going to be iffy when using GEQO -- if a
compatible join order is highly likely to be considered, e.g. because
you specify something like JOIN_ORDER(just_one) that only sets the
driving table -- then it'll probably work, but if you give a complete
join order specification, it probably won't. If you want to avoid
that, you can adjust geqo_threshold.

Thanks for looking into this.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_plan_advice

From
Lukas Fittl
Date:
On Thu, Jan 8, 2026 at 11:37 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Ideally, if you say "hey, I want to use parallel query here," you
> would get the best plan that the planner knows how to construct that
> happens to use parallelism. #2 sounds to me like you weren't really
> getting that, because you were making parallelism on cost rather than
> disabling non-parallel paths. #3 also sounds that way, because you
> asked for parallelism and didn't get it. So I would tend to view those
> changes as improvements.

Agreed from my perspective.

> The others are a little trickier. I don't think I quite understand
> what this patch changed with respect to #4. I'm guessing that maybe
> what's happening here is that this isn't really due to anything in the
> patch set, but is rather due to relying on pgs_mask rather than
> copy-pasting lots of code, which maybe incidentally removed the
> ability to tweak something that pg_hint_plan was previously tweaking.
> Maybe you want to think about writing a patch to go with 0004 that
> addresses that gap specifically?

Yeah, I don't think this needs to be reviewed in detail right now, but
I do think there is potential to improve worker count overrides in
core code. I agree this can be done as a follow-up to 0004 - and I
don't have a good sense yet for what infrastructure for this could
look like.

pg_hint_plan has some existing logic to copy parallel worker counts
into other paths (e.g. for joins), and I kept that for now in that
draft patch [0] to reduce further test output differences.

> I'm actually kind of surprised that you didn't run into a similar
> problem with the Rows() hint, for which 0004 also doesn't provide
> infrastructure. If there's no problem, cool, but if there's a problem
> there you haven't detected yet, maybe we should try to plug that gap,
> too.

Yeah, 0004 doesn't provide infrastructure for that directly. However,
it does add joinrel_setup_hook which enables modifying joinrel->rows
at the right time without copying core code. Previously pg_hint_plan
modified the row estimates for joins through copied core code [1], but
0004 lets it affect join rels through the new hook [2].

> I don't quite know what to say about #1. If your theory about it being
> due to the zero costs is correct, that's another example of the
> current implementation not really being able to achieve the ideal
> behavior, and the patch making it better. If there's something else
> going on there, then I don't know.

Yeah, I don't think we need to worry about this in the context of
applying 0004, and it's something that Michael or other pg_hint_plan
maintainers can assess when updating it.

It's worth noting for clarity that, whilst 0004 requires extensions
that modify certain GUCs during the planning process to instead use
the PGS mask, max_parallel_workers_per_gather is not one of those
GUCs. I was driven to also do the parallel hint changes because I
wanted to prove that pg_hint_plan can now function without any copying
of core code, but technically all the parallel regression test changes
can be avoided by keeping the prior zero parallel_*_cost mechanism and
modifying max_parallel_workers_per_gather during planning as before,
whilst still updating the scan/join hints to use the PGS mask logic.

Thanks,
Lukas

[0]: https://github.com/lfittl/pg_hint_plan/blob/postgres-19-with-plan-generation-strategies/pg_hint_plan.c#L4602
[1]: https://github.com/ossc-db/pg_hint_plan/blob/master/make_join_rel.c#L126
[2]: https://github.com/lfittl/pg_hint_plan/blob/postgres-19-with-plan-generation-strategies/pg_hint_plan.c#L4372

--
Lukas Fittl



Re: pg_plan_advice

From
Robert Haas
Date:
On Thu, Dec 18, 2025 at 8:36 AM Robert Haas <robertmhaas@gmail.com> wrote:
> What must be happening here is that either pgpa_join.c (maybe with
> complicity from pgpa_walker.c) is not populating the
> pgpa_plan_walker_context's join_strategies[JSTRAT_NESTED_LOOP_PLAIN]
> member correctly, or else pgpa_output.c is not serializing it to text
> correctly. I suspect the former is more likely, but I'm not sure
> exactly what's happening.

I think I see the problem: pgpa_process_unrolled_join() returns a set
called "all_relids" but it only returns the union of the inner relid
sets, not including the outer relid set. In your example, we want to
get:

NESTED_LOOP_PLAIN((part partsupp) (supplier part partsupp))

But the join order is:

JOIN_ORDER(nation (supplier (part partsupp)))

So every table is the outer table of some unrolled join, except for
the innermost table, which is partsupp. So all the others get omitted
from the output, and we get the output you saw:

NESTED_LOOP_PLAIN(partsupp partsupp)

Proposed fix attached.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: pg_plan_advice

From
Haibo Yan
Date:
> Does the plan end up disabled in that case?

I understand that, except for join order advice, which might not be
honored (due to GEQO's randomness), other forms of advice, like scan
types or join methods, remain effective, since those operations are
inevitable parts of the plan regardless of the specific tree structure.

Thanks for the explanation regarding the design philosophy. It
clarifies that the primary goal is stabilizing plans within the
optimizer's valid search space rather than forcing impossible paths. I
will keep the geqo_threshold adjustment in mind if strict structure
enforcement is ever needed.
Regards
Haibo

On Thu, Jan 8, 2026 at 11:53 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jan 7, 2026 at 5:47 PM Haibo Yan <tristan.yim@gmail.com> wrote:
> Instead, the planner seems to silently ignore the structural constraint of the advice and falls back to a path GEQO can actually find.

Does the plan end up disabled in that case?

> I believe this behavior is acceptable because pg_plan_advice is intended to stabilize plans that the optimizer can generate. Since GEQO cannot generate Bushy plans, users should not be supplying them.

Right, I agree. A core principle here is that you can only nudge the
planner towards a plan it would have considered anyway. In the case of
GEQO, there is some randomness to which plans are considered. Your
advice will only be reliably taken into account if it applies to
elements that must be part of the final plan. For instance, if you
advise the use of a sequential scan or an index scan, that should
work, because that relation has to be scanned somehow. Advice on a
join method should almost always work, since it can apply to any
non-leading table. Of course, you also won't be able to advise an
infeasible join method, but that would be true without GEQO, too.
Advice on the join order is going to be iffy when using GEQO -- if a
compatible join order is highly likely to be considered, e.g. because
you specify something like JOIN_ORDER(just_one) that only sets the
driving table -- then it'll probably work, but if you give a complete
join order specification, it probably won't. If you want to avoid
that, you can adjust geqo_threshold.

Thanks for looking into this.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg_plan_advice

From
Michael Paquier
Date:
On Thu, Jan 08, 2026 at 11:07:31AM -0500, Robert Haas wrote:
> On Wed, Jan 7, 2026 at 2:04 AM Lukas Fittl <lukas@fittl.com> wrote:
>> That said, good news: After a bunch of iterations, I get a clean
>> pass on the pg_hint_plan regression tests, whilst completely
>> dropping its copying of core code and hackish re-run of
>> set_plain_rel_pathlist. See [0] for a draft PR (on my own fork of
>> pg_hint_plan) with individual patches that explain some regression
>> test differences.
>
> That sounds AWESOME.

So you are telling me that I can commit code that deletes code.  Count
me in.  The project has some merge requests that I've been holding off
on a bit due to what's happening here and because I did not really look
at the internals that have changed.  It's great to see that you have
begun an investigation, Lukas.

>> The biggest change in the regression test output was due to how the
>> "Parallel" hint worked in pg_hint_plan (basically it was setting
>> parallel_*_cost to zero, and then messed with the gucs that factor
>> into compute_parallel_worker) -- I think the only sensible thing to
>> do is to change that in pg_hint_plan, and instead rely on rejecting
>> non-partial paths with PGS_CONSIDER_NONPARTIAL if "hard"
>> enforcement of parallelism is requested. That caused some minor
>> plan changes, but I think they can still be argued to be matching
>> the user's intent of "make a scan involving this relation
>> parallel".
>
> Cool. I'm sort of curious what changed, but maybe it's not important
> enough to spend time discussing right now.

I suspect that this is going to be an incremental integration process,
and it smells to me that it is going to require more than one major
release before being able to remove the whole set of hacks that
pg_hint_plan has been using, particularly with the GUCs, the costing
and the forced update of the backend routines, which is an ugly
historical hack.  Saying that, I would need to look at the plan
outputs to be sure, perhaps we would be OK even with slight changes.
These happen every year, because the plans tested are complex enough
that some of the sub-paths are changed, but the hints still work
properly.  This year for v19 we have at least the changes in the
expression names.
--
Michael

Attachments

Re: pg_plan_advice

From
Jakub Wartak
Date:
On Thu, Jan 8, 2026 at 7:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
[..]
> I've attempted to address that in v8 [..snip]
[..]

The full TPC-H query set still showed some issues with v8 (a wrong
plan after using advice for q20, and two "partially matched" results
for q9 and q2). However, after applying that tiny patch from [1], all
of the problems go away, and for the TPC-H query set there are no
failures anymore (yay!). By failure I mean a different plan when using
advice and/or any partial match or failure to apply advice.

So, I've switched to a more realistic test for each TPC-H query:
1) ensure we have valid stats, gather the plan, save the advice
2) clear stats using pg_clear_relation_stats(), apply the advice and
check if we have the exact same plan

The only thing this has revealed is what appears to be a tiny problem
with placing GroupAggregates - or am I wrong, or is that a known
limitation? (The original plan shows "Partial GroupAggregate" while
the one using advice is not aware of the need to use it; yet
contrib/pg_plan_advices/README in "Future Work" indicates it is out of
scope for now, right?)

--- /tmp/plan
+++ /tmp/planadviced
Sort
   Sort Key: (sum((lineitem.l_extendedprice * ('1'::numeric -
lineitem.l_discount)))) DESC
-  ->  Finalize GroupAggregate
+  ->  GroupAggregate
         Group Key: nation.n_name
         ->  Gather Merge
               Workers Planned: 2
-              ->  Partial GroupAggregate
-                    Group Key: nation.n_name
-                    ->  Sort
-                          Sort Key: nation.n_name
-                          ->  Hash Join
-                                Hash Cond: ((lineitem.l_suppkey =
supplier.s_suppkey) AND (customer.c_nationkey = supplier.s_nationkey))
+              ->  Sort
+                    Sort Key: nation.n_name
+                    ->  Hash Join
+                          Hash Cond: ((lineitem.l_suppkey =
supplier.s_suppkey) AND (customer.c_nationkey = supplier.s_nationkey))

-J.

[1] - https://www.postgresql.org/message-id/CA%2BTgmoZzBkd1BG8qusicUjme0kZuT8konQM_rcr0gMXs-TpK7A%40mail.gmail.com