Обсуждение: PATCH: generate fractional cheapest paths in generate_orderedappend_path

Поиск
Список
Период
Сортировка

PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
Hi,

This patch is a WIP fix for the issue described in [1], where the
planner picks a more expensive plan with partition-wise joins enabled,
and disabling this option produces a cheaper plan. That's a bit strange
because with the option disabled we consider *fewer* plans, so we should
not be able to generate a cheaper plan.

The problem lies in generate_orderedappend_paths(), which builds two
types of append paths - with minimal startup cost, and with minimal
total cost. That however does not work for queries with LIMIT, which
also need to consider at fractional cost, but the path interesting from
this perspective may be eliminated by other paths.

Consider three paths (this comes from the reproducer shared in [1]):

  A: nestloop_path   startup 0.585000    total 35708.292500
  B: nestloop_path   startup 0.292500    total 150004297.292500
  C: mergejoin_path  startup 9748.112737 total 14102.092737

With some reasonable LIMIT value (e.g. 5% of the data), we really want
to pick path A. But that path is dominated both in startup cost (by B)
and total cost (path C). Hence generate_orderedappend_paths() will
ignore it, and we'll generate a more expensive LIMIT plan.

In [2] Tom proposed to modify generate_orderedappend_paths() to also
consider the fractional cheapest_path, just like we do for startup and
total costs. The attached patch does exactly that, and it does seem to
do the trick.

There are some loose ends, though:

1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
clear whether to default to cheapest_startup or cheapest_total. We might
also consider an incremental sort, in which case the startup cost
matters (for Sort it does not). So we may need an enhanced version of
get_cheapest_fractional_path_for_pathkeys that generates such paths.

2) Same for the cheapest_total - maybe there's a partially sorted path,
and using it with incremental sort on top would be better than using
cheapest_total_path + sort.

3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry
about require_parallel_safe, just like the other functions nearby.

I'd argue that this patch does not need to add the Incremental Sort
capabilities in (1) and (2) - it's just another place where we decided
not to consider this sort variant yet.

I'm not sure how much this depends on partition-wise join, and why
disabling it generates the right plan. The reproducer uses that, but
AFAICS generate_orderedappend_paths() works like this since 2010 (commit
11cad29c915). I'd bet the issue exists since then and it's possible to
get similar cases even without partition-wise join.

I can reproduce it on PostgreSQL 12, though (which however supports
partition-wise join).

Not sure whether this should be backpatched. We didn't get any reports
until now, so it doesn't seem like a pressing issue. OTOH most users
won't actually notice this, they'll just get worse plans without
realizing there's a better option.


regards

[1]
https://www.postgresql.org/message-id/011937a3-7427-b99f-13f1-c07a127cf94c%40enterprisedb.com

[2] https://www.postgresql.org/message-id/4006636.1618577893%40sss.pgh.pa.us

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:

Hi,


thanks for looking into it!


For some reason the patch doesn't apply at my end, could you repost one based at the master?


> 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
> clear whether to default to cheapest_startup or cheapest_total. We might
> also consider an incremental sort, in which case the startup cost
> matters (for Sort it does not). So we may need an enhanced version of
> get_cheapest_fractional_path_for_pathkeys that generates such paths.
>
> 2) Same for the cheapest_total - maybe there's a partially sorted path,
> and using it with incremental sort on top would be better than using
> cheapest_total_path + sort.

> I'd argue that this patch does not need to add the Incremental Sort
> capabilities in (1) and (2) - it's just another place where we decided
> not to consider this sort variant yet.

I'd say your reasoning is sound. If I'd want to get better partial costs for incremental sorts, I'd look at get_cheapest_fractional_path first. That sounds more important than generate_orderedappend_paths. Either way I'd say that is a completely separate issue and I think that should be looked at separately.


>3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry

> about require_parallel_safe, just like the other functions nearby.

I think it should. We have a ParallelAppend node after all.
I'm not really familiar with the way get_cheapest_fractional_path_for_pathkeys is used, but a quick search suggests to me, that build_minmax_path was thus far the only one using it. And minmax paths are never parallel safe anyway. I think that is the reason it doesn't do that already.


Regards

Arne


From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Saturday, April 17, 2021 1:52:19 AM
To: pgsql-hackers
Subject: PATCH: generate fractional cheapest paths in generate_orderedappend_path
 
Hi,

This patch is a WIP fix for the issue described in [1], where the
planner picks a more expensive plan with partition-wise joins enabled,
and disabling this option produces a cheaper plan. That's a bit strange
because with the option disabled we consider *fewer* plans, so we should
not be able to generate a cheaper plan.

The problem lies in generate_orderedappend_paths(), which builds two
types of append paths - with minimal startup cost, and with minimal
total cost. That however does not work for queries with LIMIT, which
also need to consider at fractional cost, but the path interesting from
this perspective may be eliminated by other paths.

Consider three paths (this comes from the reproducer shared in [1]):

  A: nestloop_path   startup 0.585000    total 35708.292500
  B: nestloop_path   startup 0.292500    total 150004297.292500
  C: mergejoin_path  startup 9748.112737 total 14102.092737

With some reasonable LIMIT value (e.g. 5% of the data), we really want
to pick path A. But that path is dominated both in startup cost (by B)
and total cost (path C). Hence generate_orderedappend_paths() will
ignore it, and we'll generate a more expensive LIMIT plan.

In [2] Tom proposed to modify generate_orderedappend_paths() to also
consider the fractional cheapest_path, just like we do for startup and
total costs. The attached patch does exactly that, and it does seem to
do the trick.

There are some loose ends, though:

1) If get_cheapest_fractional_path_for_pathkeys returns NULL, it's not
clear whether to default to cheapest_startup or cheapest_total. We might
also consider an incremental sort, in which case the startup cost
matters (for Sort it does not). So we may need an enhanced version of
get_cheapest_fractional_path_for_pathkeys that generates such paths.

2) Same for the cheapest_total - maybe there's a partially sorted path,
and using it with incremental sort on top would be better than using
cheapest_total_path + sort.

3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry
about require_parallel_safe, just like the other functions nearby.

I'd argue that this patch does not need to add the Incremental Sort
capabilities in (1) and (2) - it's just another place where we decided
not to consider this sort variant yet.

I'm not sure how much this depends on partition-wise join, and why
disabling it generates the right plan. The reproducer uses that, but
AFAICS generate_orderedappend_paths() works like this since 2010 (commit
11cad29c915). I'd bet the issue exists since then and it's possible to
get similar cases even without partition-wise join.

I can reproduce it on PostgreSQL 12, though (which however supports
partition-wise join).

Not sure whether this should be backpatched. We didn't get any reports
until now, so it doesn't seem like a pressing issue. OTOH most users
won't actually notice this, they'll just get worse plans without
realizing there's a better option.


regards

[1]
https://www.postgresql.org/message-id/011937a3-7427-b99f-13f1-c07a127cf94c%40enterprisedb.com

[2] https://www.postgresql.org/message-id/4006636.1618577893%40sss.pgh.pa.us

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:

Hi,


I haven't tested the parallel case, but I think we should sort out (3) get_cheapest_fractional_path_for_pathkeys as mentioned above.


I am lost about the comment regarding startup_new_fractional. Could you elaborate what you mean by that?


Apart from that, I'd argue for a small test case. I attached a slimmed down case of what we were trying to fix. It might be worth to integrate that with an existing test, since more than a third of the time seems to be consumed by the creation and attachment of partitions.


Regards
Arne


Вложения

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
Hi,

On 6/3/21 7:17 PM, Arne Roland wrote:
> Hi,
> 
> 
> I haven't tested the parallel case, but I think we should sort out (3)
> get_cheapest_fractional_path_for_pathkeys as mentioned above.
> 

Not sure what you refer to by "above" - it's probably better to reply
in-line to existing message, which makes it much cleared.

> 
> I am lost about the comment regarding startup_new_fractional. Could you
> elaborate what you mean by that?
> 

Not sure what this refers to either - there's no startup_new_fractional
in my message and 'git grep startup_new_fractional' returns nothing.

> 
> Apart from that, I'd argue for a small test case. I attached a slimmed
> down case of what we were trying to fix. It might be worth to integrate
> that with an existing test, since more than a third of the time seems to
> be consumed by the creation and attachment of partitions.
> 

Maybe, if there's a suitable table to reuse, we can do that. But I don't
think it matters it takes ~1/3 of the time to attach the partitions.
What's more important is whether it measurably slows down the test
suite, and I don't think that's an issue.

In any case, this seems a bit premature - we need something to test the
patch etc. We can worry about how expensive the test is much later.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Zhihong Yu
Дата:


On Thu, Jun 3, 2021 at 11:12 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,

On 6/3/21 7:17 PM, Arne Roland wrote:
> Hi,
>
>
> I haven't tested the parallel case, but I think we should sort out (3)
> get_cheapest_fractional_path_for_pathkeys as mentioned above.
>

Not sure what you refer to by "above" - it's probably better to reply
in-line to existing message, which makes it much cleared.

>
> I am lost about the comment regarding startup_new_fractional. Could you
> elaborate what you mean by that?
>

Not sure what this refers to either - there's no startup_new_fractional
in my message and 'git grep startup_new_fractional' returns nothing.

>
> Apart from that, I'd argue for a small test case. I attached a slimmed
> down case of what we were trying to fix. It might be worth to integrate
> that with an existing test, since more than a third of the time seems to
> be consumed by the creation and attachment of partitions.
>

Maybe, if there's a suitable table to reuse, we can do that. But I don't
think it matters it takes ~1/3 of the time to attach the partitions.
What's more important is whether it measurably slows down the test
suite, and I don't think that's an issue.

In any case, this seems a bit premature - we need something to test the
patch etc. We can worry about how expensive the test is much later.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Hi,
In REL_11_STABLE branch, a search revealed the following:

src/backend/optimizer/path/pathkeys.c: * get_cheapest_fractional_path_for_pathkeys
src/backend/optimizer/path/pathkeys.c:get_cheapest_fractional_path_for_pathkeys(List *paths,
src/backend/optimizer/plan/planagg.c:           get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
src/include/optimizer/paths.h:extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths, 

It seems this function has been refactored out in subsequent releases.

FYI

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Zhihong Yu
Дата:


On Thu, Jun 3, 2021 at 1:50 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Thu, Jun 3, 2021 at 11:12 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
Hi,

On 6/3/21 7:17 PM, Arne Roland wrote:
> Hi,
>
>
> I haven't tested the parallel case, but I think we should sort out (3)
> get_cheapest_fractional_path_for_pathkeys as mentioned above.
>

Not sure what you refer to by "above" - it's probably better to reply
in-line to existing message, which makes it much cleared.

>
> I am lost about the comment regarding startup_new_fractional. Could you
> elaborate what you mean by that?
>

Not sure what this refers to either - there's no startup_new_fractional
in my message and 'git grep startup_new_fractional' returns nothing.

>
> Apart from that, I'd argue for a small test case. I attached a slimmed
> down case of what we were trying to fix. It might be worth to integrate
> that with an existing test, since more than a third of the time seems to
> be consumed by the creation and attachment of partitions.
>

Maybe, if there's a suitable table to reuse, we can do that. But I don't
think it matters it takes ~1/3 of the time to attach the partitions.
What's more important is whether it measurably slows down the test
suite, and I don't think that's an issue.

In any case, this seems a bit premature - we need something to test the
patch etc. We can worry about how expensive the test is much later.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Hi,
In REL_11_STABLE branch, a search revealed the following:

src/backend/optimizer/path/pathkeys.c: * get_cheapest_fractional_path_for_pathkeys
src/backend/optimizer/path/pathkeys.c:get_cheapest_fractional_path_for_pathkeys(List *paths,
src/backend/optimizer/plan/planagg.c:           get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
src/include/optimizer/paths.h:extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths, 

It seems this function has been refactored out in subsequent releases.

FYI

Sent a bit too soon.

The above function still exists.
But startup_new_fractional was nowhere to be found.

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
On 6/3/21 10:52 PM, Zhihong Yu wrote:
> ...
> 
> Sent a bit too soon.
> 
> The above function still exists.
> But startup_new_fractional was nowhere to be found.

Actually, there are two comments

    /* XXX maybe we should have startup_new_fractional? */

in the patch I posted - I completely forgot about that. But I think
that's a typo, I think - it should be

    /* XXX maybe we should have startup_neq_fractional? */

and the new flag would work similarly to startup_neq_total, i.e. it's
pointless to add paths where startup == fractional cost.

At least I think that was the idea when I wrote the patch, it way too
long ago.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:

Hi,


thanks for the quick reply!


From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Thursday, June 3, 2021 20:11
To: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

> I haven't tested the parallel case, but I think we should sort out (3)
> get_cheapest_fractional_path_for_pathkeys as mentioned above.
>

Not sure what you refer to by "above" - it's probably better to reply
in-line to existing message, which makes it much cleared.

I was referring to one message above. I thought the thread was still short enough. Apparently to much time has passed. Sorry, I hope this mail is better. I was referring to my post from April:


From: Arne Roland
Sent: Monday, April 26, 2021 13:00
To: Tomas Vondra; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

>3) Not sure if get_cheapest_fractional_path_for_pathkeys should worry

> about require_parallel_safe, just like the other functions nearby.

I think it should. We have a ParallelAppend node after all.
I'm not really familiar with the way get_cheapest_fractional_path_for_pathkeys is used, but a quick search suggests to me, that build_minmax_path was thus far the only one using it. And minmax paths are never parallel safe anyway. I think that is the reason it doesn't do that already.


From: Zhihong Yu <zyu@yugabyte.com>
Sent: Thursday, June 3, 2021 22:50
To: Tomas Vondra
Cc: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
 

Hi,
In REL_11_STABLE branch, a search revealed the following:

src/backend/optimizer/path/pathkeys.c: * get_cheapest_fractional_path_for_pathkeys
src/backend/optimizer/path/pathkeys.c:get_cheapest_fractional_path_for_pathkeys(List *paths,
src/backend/optimizer/plan/planagg.c:           get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
src/include/optimizer/paths.h:extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths, 

It seems this function has been refactored out in subsequent releases.

FYI

Thanks for the info!
I doubt there is any interest to back patch this anywhere. My most ambitious dream would be getting this into pg 14.

I think, we only care about a parallel safety aware variant anyways, which afaict never existed.


From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Thursday, June 3, 2021 22:57
To: Zhihong Yu
Cc: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
Actually, there are two comments

        /* XXX maybe we should have startup_new_fractional? */

in the patch I posted - I completely forgot about that. But I think
that's a typo, I think - it should be

        /* XXX maybe we should have startup_neq_fractional? */

and the new flag would work similarly to startup_neq_total, i.e. it's
pointless to add paths where startup == fractional cost.

At least I think that was the idea when I wrote the patch, it way too
long ago.

Sorry, I almost forgot about this myself. I only got reminded upon seeing that again with different queries/tables.
Just to be sure I get this correctly: You mean startup_gt_fractional (cost) as an additional condition, right?

Regards
Arne

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:

Hi Tomas,


I don't think there is much work left to do here.

Did you have a look at the test case? Did it make sense to you? 

And I am sorry. I had another look at this and it seems I was confused (again).

From: Arne Roland
Sent: Monday, April 26, 2021 13:00
To: Tomas Vondra; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

> I think it should. We have a ParallelAppend node after all.
> I'm not really familiar with the way get_cheapest_fractional_path_for_pathkeys is used, but a quick search suggests to
> me, that build_minmax_path was thus far the only one using it. And minmax paths are never parallel safe anyway. I think that is the reason it doesn't do that already.

The whole segment were are talking about obviously assumes require_parallel_safe is not needed. I wasn't aware that in set_append_rel_size. And I just realized there is a great comment explaining why it rightfully does so:
         /*  
         * If any live child is not parallel-safe, treat the whole appendrel
         * as not parallel-safe.  In future we might be able to generate plans
         * in which some children are farmed out to workers while others are
         * not; but we don't have that today, so it's a waste to consider
         * partial paths anywhere in the appendrel unless it's all safe.
         * (Child rels visited before this one will be unmarked in
         * set_append_rel_pathlist().)
         */
So afaik we don't need to think further about this.


From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Thursday, June 3, 2021 22:57
To: Zhihong Yu
Cc: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
> Actually, there are two comments
>
>        /* XXX maybe we should have startup_new_fractional? */
>
> in the patch I posted - I completely forgot about that. But I think
> that's a typo, I think - it should be
>
>         /* XXX maybe we should have startup_neq_fractional? */
>
> and the new flag would work similarly to startup_neq_total, i.e. it's
> pointless to add paths where startup == fractional cost.
>
> At least I think that was the idea when I wrote the patch, it way too
> long ago.

> Sorry, I almost forgot about this myself. I only got reminded upon seeing that again with different queries/tables.
> Just to be sure I get this correctly: You mean startup_gt_fractional (cost) as an additional condition, right?

Could you clarify that for me?

Regards
Arne

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:

Afaiac we should add a simple testcase here, like I suggested in 477344d5f17c4a8e95d3a5bb6642718a. Apart from that I am not sure there is work to be done here.


Am I wrong?


Regards

Arne


From: Arne Roland <A.Roland@index.de>
Sent: Saturday, June 26, 2021 5:50:49 PM
To: Tomas Vondra
Cc: pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
 

Hi Tomas,


I don't think there is much work left to do here.

Did you have a look at the test case? Did it make sense to you? 

And I am sorry. I had another look at this and it seems I was confused (again).

From: Arne Roland
Sent: Monday, April 26, 2021 13:00
To: Tomas Vondra; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

> I think it should. We have a ParallelAppend node after all.
> I'm not really familiar with the way get_cheapest_fractional_path_for_pathkeys is used, but a quick search suggests to
> me, that build_minmax_path was thus far the only one using it. And minmax paths are never parallel safe anyway. I think that is the reason it doesn't do that already.

The whole segment were are talking about obviously assumes require_parallel_safe is not needed. I wasn't aware that in set_append_rel_size. And I just realized there is a great comment explaining why it rightfully does so:
         /*  
         * If any live child is not parallel-safe, treat the whole appendrel
         * as not parallel-safe.  In future we might be able to generate plans
         * in which some children are farmed out to workers while others are
         * not; but we don't have that today, so it's a waste to consider
         * partial paths anywhere in the appendrel unless it's all safe.
         * (Child rels visited before this one will be unmarked in
         * set_append_rel_pathlist().)
         */
So afaik we don't need to think further about this.


From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Thursday, June 3, 2021 22:57
To: Zhihong Yu
Cc: Arne Roland; pgsql-hackers
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
> Actually, there are two comments
>
>        /* XXX maybe we should have startup_new_fractional? */
>
> in the patch I posted - I completely forgot about that. But I think
> that's a typo, I think - it should be
>
>         /* XXX maybe we should have startup_neq_fractional? */
>
> and the new flag would work similarly to startup_neq_total, i.e. it's
> pointless to add paths where startup == fractional cost.
>
> At least I think that was the idea when I wrote the patch, it way too
> long ago.

> Sorry, I almost forgot about this myself. I only got reminded upon seeing that again with different queries/tables.
> Just to be sure I get this correctly: You mean startup_gt_fractional (cost) as an additional condition, right?

Could you clarify that for me?

Regards
Arne

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
Hi,

On 12/2/21 15:58, Arne Roland wrote:
> Afaiac we should add a simple testcase here, like I suggested in 
> 477344d5f17c4a8e95d3a5bb6642718a 
> <https://www.postgresql.org/message-id/477344d5f17c4a8e95d3a5bb6642718a%40index.de>. 
> Apart from that I am not sure there is work to be done here.
> 

Well, I mentioned three open questions in my first message, and I don't 
think we've really addressed them yet. We've agreed we don't need to add 
the incremental sort here, but that leaves


1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we 
default to cheapest_startup or cheapest_total?

2) Should get_cheapest_fractional_path_for_pathkeys worry about 
require_parallel_safe? I think yes, but we need to update the patch.

I'll take a closer look next week, once I get home from NYC, and I'll 
submit an improved version for the January CF.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:
Hi,

thanks for the reply!

From: Tomas Vondra <tomas.vondra@enterprisedb.com>
Sent: Thursday, December 2, 2021 20:58
Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
> [...]
> Well, I mentioned three open questions in my first message, and I don't
> think we've really addressed them yet. We've agreed we don't need to add
> the incremental sort here, but that leaves
>
>
> 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we
> default to cheapest_startup or cheapest_total?

I think it's reasonable to use cheapest_total like we are doing now. I hardly see any reason to change it.
The incremental sort case you mentioned, seems like the only case that plan might be interesting. If we really want that to happen, we probably should check for that separately, i.e. having startup_fractional. Even though this is a fairly special case as it's mostly relevant for partitionwise joins, I'm still not convinced it's worth the cpu cycles. The point is that in most cases factional and startup_fractional will be the same anyways.
And I suspect, even if they aren't, solving that from an application developers point of view, is in most cases not that difficult. One can usually match the pathkey. I personally had a lot of real world issues with missing fractional paths using primary keys. I think it's worth noting that everything will likely match the partition keys anyways, because otherwise there is no chance of doing a merge append.
If I am not mistaken, in case we end up doing a full sort, the cheapest_total path should be completely sufficient.

> 2) Should get_cheapest_fractional_path_for_pathkeys worry about
> require_parallel_safe? I think yes, but we need to update the patch.

I admit, that such a version of get_cheapest_fractional_path_for_pathkeys could be consistent with other functions. And I was confused about that before. But I am not sure what to use require_parallel_safe for. build_minmax_path doesn't care, they are never parallel safe. And afaict generate_orderedappend_paths cares neither, it considers all plans. For instance when it calls get_cheapest_path_for_pathkeys, it sets require_parallel_safe just to false as well.

> I'll take a closer look next week, once I get home from NYC, and I'll
> submit an improved version for the January CF.

Thank you for your work! The current patch, apart from the comments/regression tests, seems pretty reasonable to me.

Regards
Arne

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:

On 12/10/21 00:51, Arne Roland wrote:
> Hi,
> 
> thanks for the reply!
> 
> From: Tomas Vondra <tomas.vondra@enterprisedb.com>
> Sent: Thursday, December 2, 2021 20:58
> Subject: Re: PATCH: generate fractional cheapest paths in 
> generate_orderedappend_path
>  > [...]
>  > Well, I mentioned three open questions in my first message, and I don't
>  > think we've really addressed them yet. We've agreed we don't need to add
>  > the incremental sort here, but that leaves
>  >
>  >
>  > 1) If get_cheapest_fractional_path_for_pathkeys returns NULL, should we
>  > default to cheapest_startup or cheapest_total?
> 
> I think it's reasonable to use cheapest_total like we are doing now. I 
> hardly see any reason to change it.

True, it's a reasonable first step.

Either we generate the same plan as today (with cheapest_total), or 
maybe a better one (if we find a cheaper fractional path with matching 
pathkeys). It's true we could do better, but that's life - it's not like 
we consider every possible path everywhere.

> The incremental sort case you mentioned, seems like the only case that 
> plan might be interesting. If we really want that to happen, we probably 
> should check for that separately, i.e. having startup_fractional. Even 
> though this is a fairly special case as it's mostly relevant for 
> partitionwise joins, I'm still not convinced it's worth the cpu cycles. 
> The point is that in most cases factional and startup_fractional will be 
> the same anyways.

I don't think we can simply check for startup_fractional (although, I'm 
not sure I fully understand what would that be). I think what we should 
really do in this case is walking all the paths, ensuring it's properly 
sorted (with either full or incremental sort), and then pick the 
cheapest fractional path from these sorted paths. But I agree that seems 
pretty expensive.

> And I suspect, even if they aren't, solving that from an application 
> developers point of view, is in most cases not that difficult. One can 
> usually match the pathkey. I personally had a lot of real world issues 
> with missing fractional paths using primary keys. I think it's worth 
> noting that everything will likely match the partition keys anyways, 
> because otherwise there is no chance of doing a merge append.

Yeah, I think you're right.

> If I am not mistaken, in case we end up doing a full sort, the 
> cheapest_total path should be completely sufficient.
> 

Definitely true.

>  > 2) Should get_cheapest_fractional_path_for_pathkeys worry about
>  > require_parallel_safe? I think yes, but we need to update the patch.
> 
> I admit, that such a version of 
> get_cheapest_fractional_path_for_pathkeys could be consistent with other 
> functions. And I was confused about that before. But I am not sure what 
> to use require_parallel_safe for. build_minmax_path doesn't care, they 
> are never parallel safe. And afaict generate_orderedappend_paths cares 
> neither, it considers all plans. For instance when it calls 
> get_cheapest_path_for_pathkeys, it sets require_parallel_safe just to 
> false as well.
> 

True as well. It's looks a bit strange, but you're right neither place 
cares about parallel safety.

>  > I'll take a closer look next week, once I get home from NYC, and I'll
>  > submit an improved version for the January CF.
> 
> Thank you for your work! The current patch, apart from the 
> comments/regression tests, seems pretty reasonable to me.
> 

Attached is a cleaned-up patch, with a simple regression test. I'll mark 
this as RFC and get it committed in a couple days.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
Pushed, after clarifying the comments a bit.

I also looked into what would it take to consider incremental paths, and 
in principle it doesn't seem all that complicated. The attached PoC 
patch extends get_cheapest_fractional_path_for_pathkeys() to optionally 
build incremental sort on paths if needed. There are two GUCs that make 
experimenting simpler:

* enable_fractional_paths - disables fractional paths entirely, i.e. we 
get behavior without the part I already pushed

* enable_fractional_incremental_paths - disables the incremental sort 
part added by the attached patch

With this, I get this plan (see the test in partitionwise_join.sql)

test=# EXPLAIN (COSTS OFF)
test-# SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id1, id2) 
ORDER BY id1 ASC, id2 ASC LIMIT 10;
                                   QUERY PLAN 

------------------------------------------------------------------------------
  Limit
    ->  Merge Left Join
          Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
          ->  Append
                ->  Index Only Scan using fract_t0_id1_id2_idx on
                                          fract_t0 x_1
                ->  Incremental Sort
                      Sort Key: x_2.id1, x_2.id2
                      Presorted Key: x_2.id1
                      ->  Index Scan using fract_t1_pkey on fract_t1 x_2
          ->  Materialize
                ->  Append
                      ->  Incremental Sort
                            Sort Key: y_1.id1, y_1.id2
                            Presorted Key: y_1.id1
                            ->  Index Scan using fract_t0_pkey on
                                                 fract_t0 y_1
                                  Index Cond: (id1 = id1)
                                  Filter: (id2 = id2)
                      ->  Incremental Sort
                            Sort Key: y_2.id1, y_2.id2
                            Presorted Key: y_2.id1
                            ->  Index Scan using fract_t1_pkey on
                                                 fract_t1 y_2
                                  Index Cond: (id1 = id1)
                                  Filter: (id2 = id2)
(23 rows)

instead of

                                   QUERY PLAN 

------------------------------------------------------------------------------
  Limit
    ->  Incremental Sort
          Sort Key: x.id1, x.id2
          Presorted Key: x.id1
          ->  Merge Left Join
                Merge Cond: (x.id1 = y.id1)
                Join Filter: (x.id2 = y.id2)
                ->  Append
                      ->  Index Scan using fract_t0_pkey on fract_t0 x_1
                      ->  Index Scan using fract_t1_pkey on fract_t1 x_2
                ->  Materialize
                      ->  Append
                            ->  Index Scan using fract_t0_pkey on
                                                 fract_t0 y_1
                            ->  Index Scan using fract_t1_pkey on
                                                 fract_t1 y_2
(14 rows)

i.e. the incremental sorts moved below the merge join (and the cost is 
lower, but that's not shown here).

So that seems reasonable, but there's a couple issues too:

1) Planning works (hence we can run explain), but execution fails 
because of segfault in CheckVarSlotCompatibility - it gets NULL slot for 
some reason. I haven't looked into the details, but I'd guess we need to 
pass a different rel to create_incrementalsort_path, not childrel.

2) enable_partitionwisejoin=on seems to cause some confusion, because it 
results in picking a different plan with higher cost. But that's clearly 
not because of this patch.

3) I'm still a bit skeptical about the cost of this implementation - it 
builds the incrementalsort path, just to throw most of the paths away. 
It'd be much better to just calculate the cost, which should be much 
cheaper, and only do the heavy work for the one "best" path.

4) One thing I did not realize before is what pathkeys we even consider 
here. Imagine you have two tables:

    CREATE TABLE t1 (a int, b int, primary key (a));
    CREATE TABLE t2 (a int, b int, primary key (a));

and query

    SELECT * FROM t1 JOIN t2 USING (a,b);

It seems reasonable to also consider pathkeys (a,b) because that's make 
e.g. mergejoin much cheaper, right? But no, we'll not do that - we only 
consider pathkeys that at least one child relation has, so in this case 
it's just (a). Which means we'll never consider incremental sort for 
this particular example.

It's a bit strange, because it's enough to create index on (a,b) for 
just one of the relations, and it'll suddenly consider building 
incremental sort on both sides.


I don't plan to pursue this further at this point, so I pushed the first 
part because that's a fairly simple improvement over what we have now. 
But it seems like a fairly promising area for improvements.

Also, the non-intuitive effects of enabling partition-wise joins (i.e. 
picking plans with higher estimated costs) is something worth exploring, 
I guess. But that's a separate issue.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
FWIW this is now marked as committed. I've created a separate entry in 
the next CF for the incremental sort part.


https://commitfest.postgresql.org/37/3513/

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Arne Roland
Дата:
 Hi!

> From: Tomas Vondra <tomas.vondra@enterprisedb.com>
> Subject: Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path
>  
> test-# SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id1, id2)
> ORDER BY id1 ASC, id2 ASC LIMIT 10;
>                                    QUERY PLAN
>
> ------------------------------------------------------------------------------
>   Limit
>     ->  Merge Left Join
>           Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
>           ->  Append
>                 ->  Index Only Scan using fract_t0_id1_id2_idx on
>                                           fract_t0 x_1
>                 ->  Incremental Sort
>                       Sort Key: x_2.id1, x_2.id2
>                       Presorted Key: x_2.id1
>                       ->  Index Scan using fract_t1_pkey on fract_t1 x_2
>           ->  Materialize
>                 ->  Append
>                       ->  Incremental Sort
>                             Sort Key: y_1.id1, y_1.id2
>                             Presorted Key: y_1.id1
>                             ->  Index Scan using fract_t0_pkey on
>                                                  fract_t0 y_1
>                                   Index Cond: (id1 = id1)
>                                   Filter: (id2 = id2)
>                       ->  Incremental Sort
>                             Sort Key: y_2.id1, y_2.id2
>                             Presorted Key: y_2.id1
>                             ->  Index Scan using fract_t1_pkey on
>                                                  fract_t1 y_2
>                                   Index Cond: (id1 = id1)
>                                   Filter: (id2 = id2)
> (23 rows)
> [...]
> So that seems reasonable

Maybe I'm just slow, but that doesn't seem reasonable to me. It doesn't look like a valid plan to me. Sure all the nodes are arranged like I'd like them to be. But what are the id1/id2 bound we have in the index and filter conditions?

> [...]but there's a couple issues too:
>
> 1) Planning works (hence we can run explain), but execution fails
> because of segfault in CheckVarSlotCompatibility - it gets NULL slot for
> some reason. I haven't looked into the details, but I'd guess we need to
> pass a different rel to create_incrementalsort_path, not childrel.

In case my above concern is valid, maybe the executor is just as confused as I am. Such conditions should generate VarSlots, no? If so, where should they come from?

Sadly I don't have time to debug that in depth today.

> 2) enable_partitionwisejoin=on seems to cause some confusion, because it
> results in picking a different plan with higher cost. But that's clearly
> not because of this patch.

Short version: I'd neither conceptually expect costs to be lower in any case, nor would that be desirable, because our estimates aren't perfect.

Long version: What do you mean by confusion. The plan I get with the patch doesn't seem to confusing to me. Generally something like that is to be expected. enable_partitionwisejoin changes the way this planing works by rewriting the entire query effectively rule based. So we end up with a completely different query. I'd in particular expect slightly different startup costs.
So if we activate this we consider completely different plans, I struggle to come up with a meaningful example where there is any overlap at all. Thus it doesn't surprise me conceptually.
From practical experience I'd say: If they are about the same plan, the costs estimates work somewhat okish.
If we change the way we compose the nodes together, we sometimes end up with radical different costs for doing the same. While I suspect there are a lot of corner cases causing this, I've seen quite a few where this was due to the fact, that our planer often has insignificant statistics to know something and takes a guess. This has gotten way better of recent years, but it's in particular for non-trivial joins still a problem in practice.

> 3) I'm still a bit skeptical about the cost of this implementation - it
> builds the incrementalsort path, just to throw most of the paths away.
> It'd be much better to just calculate the cost, which should be much
> cheaper, and only do the heavy work for the one "best" path.

Maybe we should profile this to get a rough estimate, how much time we spend building these incremental paths. From a code perspective it's non trivial to me where the time is lost.

> 4) One thing I did not realize before is what pathkeys we even consider
> here. Imagine you have two tables:
>
>     CREATE TABLE t1 (a int, b int, primary key (a));
>     CREATE TABLE t2 (a int, b int, primary key (a));
>
> and query
>
>     SELECT * FROM t1 JOIN t2 USING (a,b);
>
> It seems reasonable to also consider pathkeys (a,b) because that's make
> e.g. mergejoin much cheaper, right? But no, we'll not do that - we only
> consider pathkeys that at least one child relation has, so in this case
> it's just (a). Which means we'll never consider incremental sort for
> this particular example.
>
> It's a bit strange, because it's enough to create index on (a,b) for
> just one of the relations, and it'll suddenly consider building
> incremental sort on both sides.

I don't find that surprising, because the single index *might* make the incremental sort cheaper for the join *without* considering any external sort order.
So we would be switching up the incremental sort and the mergejoin, in case we need to sort anyways. That would mean considering also the sort order, that might be relevant on the outside. Sounds like an interesting idea for a later patch.

> I don't plan to pursue this further at this point, so I pushed the first
> part because that's a fairly simple improvement over what we have now.
> But it seems like a fairly promising area for improvements.

I think 1) is pretty important, so we should sort that out sooner than later. Apart form that: :+1:
Thank you!

Regards
Arne

Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
On 1/13/22 21:12, Arne Roland wrote:
>  Hi!
> 
>> From: Tomas Vondra <tomas.vondra@enterprisedb.com>
>> Subject: Re: PATCH: generate fractional cheapest paths in
> generate_orderedappend_path
>>  
>> test-# SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id1, id2)
>> ORDER BY id1 ASC, id2 ASC LIMIT 10;
>>                                    QUERY PLAN
>>
>>
> ------------------------------------------------------------------------------
>>   Limit
>>     ->  Merge Left Join
>>           Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
>>           ->  Append
>>                 ->  Index Only Scan using fract_t0_id1_id2_idx on
>>                                           fract_t0 x_1
>>                 ->  Incremental Sort
>>                       Sort Key: x_2.id1, x_2.id2
>>                       Presorted Key: x_2.id1
>>                       ->  Index Scan using fract_t1_pkey on fract_t1 x_2
>>           ->  Materialize
>>                 ->  Append
>>                       ->  Incremental Sort
>>                             Sort Key: y_1.id1, y_1.id2
>>                             Presorted Key: y_1.id1
>>                             ->  Index Scan using fract_t0_pkey on
>>                                                  fract_t0 y_1
>>                                   Index Cond: (id1 = id1)
>>                                   Filter: (id2 = id2)
>>                       ->  Incremental Sort
>>                             Sort Key: y_2.id1, y_2.id2
>>                             Presorted Key: y_2.id1
>>                             ->  Index Scan using fract_t1_pkey on
>>                                                  fract_t1 y_2
>>                                   Index Cond: (id1 = id1)
>>                                   Filter: (id2 = id2)
>> (23 rows)
>> [...]
>> So that seems reasonable
> 
> Maybe I'm just slow, but that doesn't seem reasonable to me. It doesn't
> look like a valid plan to me. Sure all the nodes are arranged like I'd
> like them to be. But what are the id1/id2 bound we have in the index and
> filter conditions?
> 

I'm not claiming the plan is 100% correct, and you may have a point
about the index condition / filter in the materialize branch.

But the overall plan shape (with incremental sort nodes on both sides)
seems reasonable to me. The materialize node is expected (incremental
sort does not support rescans cheaply).

Maybe that's not any cheaper than just doing merge join on the first
column, and filter on the second. But we should be able to decide that
based on cost, I think.

>> [...]but there's a couple issues too:
>>
>> 1) Planning works (hence we can run explain), but execution fails
>> because of segfault in CheckVarSlotCompatibility - it gets NULL slot for
>> some reason. I haven't looked into the details, but I'd guess we need to
>> pass a different rel to create_incrementalsort_path, not childrel.
> 
> In case my above concern is valid, maybe the executor is just as
> confused as I am. Such conditions should generate VarSlots, no? If so,
> where should they come from?
> 

Yeah, something like that.

> Sadly I don't have time to debug that in depth today.
> 
>> 2) enable_partitionwisejoin=on seems to cause some confusion, because it
>> results in picking a different plan with higher cost. But that's clearly
>> not because of this patch.
> 
> Short version: I'd neither conceptually expect costs to be lower in any
> case, nor would that be desirable, because our estimates aren't perfect.
> 
> Long version: What do you mean by confusion. The plan I get with the
> patch doesn't seem to confusing to me. Generally something like that is
> to be expected. enable_partitionwisejoin changes the way this planing
> works by rewriting the entire query effectively rule based. So we end up
> with a completely different query. I'd in particular expect slightly
> different startup costs.
> So if we activate this we consider completely different plans, I
> struggle to come up with a meaningful example where there is any overlap
> at all. Thus it doesn't surprise me conceptually.
> From practical experience I'd say: If they are about the same plan, the
> costs estimates work somewhat okish.
> If we change the way we compose the nodes together, we sometimes end up
> with radical different costs for doing the same. While I suspect there
> are a lot of corner cases causing this, I've seen quite a few where this
> was due to the fact, that our planer often has insignificant statistics
> to know something and takes a guess. This has gotten way better of
> recent years, but it's in particular for non-trivial joins still a
> problem in practice.
> 

By confusion I meant that if you plan the query with partitionwise join
enabled, you get a plan with cost X, and if you disable it you get a
different plan with cost Y, where (Y < X). Which is somewhat unexpected,
because that seems to simply reduce the set of plans we explore.

But as you say, enable_partitionwise_join kinda "switches" between two
planning modes. Not sure why we don't try building both paths and decide
based on costs.

>> 3) I'm still a bit skeptical about the cost of this implementation - it
>> builds the incrementalsort path, just to throw most of the paths away.
>> It'd be much better to just calculate the cost, which should be much
>> cheaper, and only do the heavy work for the one "best" path.
> 
> Maybe we should profile this to get a rough estimate, how much time we
> spend building these incremental paths. From a code perspective it's non
> trivial to me where the time is lost.
> 

TBH I haven't really done any profiling, but I wouldn't be surprised if
it got somewhat expensive for large number of child relations,
especially if there are a couple indexes on each. We do something
similar for nestloop (see initial_cost_nestloop).

>> 4) One thing I did not realize before is what pathkeys we even consider
>> here. Imagine you have two tables:
>>
>>     CREATE TABLE t1 (a int, b int, primary key (a));
>>     CREATE TABLE t2 (a int, b int, primary key (a));
>>
>> and query
>>
>>     SELECT * FROM t1 JOIN t2 USING (a,b);
>>
>> It seems reasonable to also consider pathkeys (a,b) because that's make
>> e.g. mergejoin much cheaper, right? But no, we'll not do that - we only
>> consider pathkeys that at least one child relation has, so in this case
>> it's just (a). Which means we'll never consider incremental sort for
>> this particular example.
>>
>> It's a bit strange, because it's enough to create index on (a,b) for
>> just one of the relations, and it'll suddenly consider building
>> incremental sort on both sides.
> 
> I don't find that surprising, because the single index *might* make the
> incremental sort cheaper for the join *without* considering any external
> sort order.
> So we would be switching up the incremental sort and the mergejoin, in
> case we need to sort anyways. That would mean considering also the sort
> order, that might be relevant on the outside. Sounds like an interesting
> idea for a later patch.
> 

I'm not sure it depends on the incremental sort. I suspect in some cases
it might be faster to fully sort the merge join inputs, even if none of
the input paths has suitable pathkeys.

For example, if you do

  ... FROM t1 JOIN t2 USING (a,b) ...

but there are only indexes on (a), maybe sorting on (a,b) would win e.g.
if there's a lot of duplicate values in (a)?

I was thinking about this variation on example from the committed patch:

  CREATE TABLE fract_t (id1 BIGINT, id2 BIGINT)
                       PARTITION BY RANGE (id1);

  CREATE TABLE fract_t0 PARTITION OF fract_t
                        FOR VALUES FROM ('0') TO ('10');

  CREATE TABLE fract_t1 PARTITION OF fract_t
                        FOR VALUES FROM ('10') TO ('20');

  CREATE INDEX ON fract_t(id1);

  INSERT INTO fract_t (id1, id2)
  SELECT i/100000, i FROM generate_series(0, 1999999) s(i);

  ANALYZE fract_t;

  -- not interested in nestloop/hashjoin paths for now
  set enable_hashjoin = off;
  set enable_nestloop = off;
  set max_parallel_workers_per_gather = 0;

  EXPLAIN (COSTS OFF)
  SELECT * FROM fract_t x JOIN fract_t y USING (id1, id2) order by id1;

which is now planned like this:

                     QUERY PLAN
-----------------------------------------------------
 Merge Join
   Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
   ->  Sort
         Sort Key: x.id1, x.id2
         ->  Append
               ->  Seq Scan on fract_t0 x_1
               ->  Seq Scan on fract_t1 x_2
   ->  Materialize
         ->  Sort
               Sort Key: y.id1, y.id2
               ->  Append
                     ->  Seq Scan on fract_t0 y_1
                     ->  Seq Scan on fract_t1 y_2
(13 rows)

But maybe a plan like this might be better:

                     QUERY PLAN
-----------------------------------------------------
 Merge Join
   Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
   ->  Incremental Sort
         Sort Key: x.id1, x.id2
         Presorted Key: x.id1
         ->  Append
               ->  Index Scan on fract_t0 x_1
               ->  Index Scan on fract_t1 x_2
   ->  Materialize
         ->  Incremental Sort
               Sort Key: y.id1, y.id2
               Presorted Key: y.id1
               ->  Append
                     ->  Index Scan on fract_t0 y_1
                     ->  Index Scan on fract_t1 y_2

or maybe even Incremental Sort + (Merge) Append on top. Which is what I
was trying to achieve with the experimental patch.


FWIW I did briefly look at the Merge Join + Incremental Sort plan too,
and it seems we don't consider incremental sorts there either. AFAICS
add_paths_to_joinrel justs call sort_inner_and_outer, which determines
interesting pathkeys in select_outer_pathkeys_for_merge, and I guess
that only considers pathkeys usable for full sort. In any case, we don't
actually add sort paths - we construct sort plans by calling make_sort
in create_mergejoin_plan. So there's not much chance for incremental
sort at all. That's kinda unfortunate, I guess. It's one of the
non-pathified places that we ignored while adding incremental sort.

>> I don't plan to pursue this further at this point, so I pushed the first
>> part because that's a fairly simple improvement over what we have now.
>> But it seems like a fairly promising area for improvements.
> 
> I think 1) is pretty important, so we should sort that out sooner than
> later. Apart form that: :+1:
> Thank you!
> 

I agree it's worth investigating and experimenting with. We may end up
realizing those plans are not worth it, but we won't know until we try.

It may require replacing some of the hard-coded logic in createplan.c
with constructing regular alternative paths. IIRC we even did some of
this work in the incremental sort patch at some point, but then ripped
that out to keep the patch smaller / simpler ... need to look at it again.

Are you interested / willing to do some of this work?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Andres Freund
Дата:
Hi,

On 2022-01-14 01:39:30 +0100, Tomas Vondra wrote:
> Are you interested / willing to do some of this work?

This patch hasn't moved in the last two months. I think it may be time to
mark it as returned with feedback for now?

It's also failing tests, and has done so for months:

https://cirrus-ci.com/task/5308087077699584
https://api.cirrus-ci.com/v1/artifact/task/5308087077699584/log/src/test/regress/regression.diffs

Greetings,

Andres Freund



Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

От
Tomas Vondra
Дата:
On 3/22/22 01:18, Andres Freund wrote:
> Hi,
> 
> On 2022-01-14 01:39:30 +0100, Tomas Vondra wrote:
>> Are you interested / willing to do some of this work?
> 
> This patch hasn't moved in the last two months. I think it may be time to
> mark it as returned with feedback for now?
> 
> It's also failing tests, and has done so for months:
> 
> https://cirrus-ci.com/task/5308087077699584
> https://api.cirrus-ci.com/v1/artifact/task/5308087077699584/log/src/test/regress/regression.diffs
> 
> Greetings,
> 

Yeah. I think it's a useful improvement, but it needs much more work
than is doable by the end of this CF. RwF seems about right.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company