Обсуждение: Reordering DISTINCT keys to match input path's pathkeys

Поиск
Список
Период
Сортировка

Reordering DISTINCT keys to match input path's pathkeys

От
Richard Guo
Дата:
Similar to what we did to GROUP BY keys in 0452b461bc, I think we can do
the same to DISTINCT keys, i.e. reordering DISTINCT keys to match input
path's pathkeys, which can help reduce cost by avoiding unnecessary
re-sort, or allowing us to use incremental-sort to save efforts.  For
instance,

create table t (a int, b int);
create index on t (a, b);

explain (costs off) select distinct b, a from t limit 10;
                    QUERY PLAN
--------------------------------------------------
 Limit
   ->  Unique
         ->  Index Only Scan using t_a_b_idx on t
(3 rows)


Please note that the parser has ensured that the DISTINCT pathkeys
matches the order of ORDER BY clauses.  So there is no need to do this
part again.

In principle, we can perform such reordering for DISTINCT ON too, but we
need to make sure that the resulting pathkeys matches initial ORDER BY
keys, which seems not trivial.  So it doesn't seem worth the effort.

Attached is a patch for this optimization.  Any thoughts?

Thanks
Richard
Вложения

Re: Reordering DISTINCT keys to match input path's pathkeys

От
Laurenz Albe
Дата:
On Tue, 2024-01-23 at 13:55 +0800, Richard Guo wrote:
> Similar to what we did to GROUP BY keys in 0452b461bc, I think we can do
> the same to DISTINCT keys, i.e. reordering DISTINCT keys to match input
> path's pathkeys, which can help reduce cost by avoiding unnecessary
> re-sort, or allowing us to use incremental-sort to save efforts.
>
> Attached is a patch for this optimization.  Any thoughts?

I didn't scrutinize the code, but that sounds like a fine optimization.

Yours,
Laurenz Albe



Re: Reordering DISTINCT keys to match input path's pathkeys

От
David Rowley
Дата:
On Tue, 23 Jan 2024 at 18:56, Richard Guo <guofenglinux@gmail.com> wrote:
>
> Similar to what we did to GROUP BY keys in 0452b461bc, I think we can do
> the same to DISTINCT keys, i.e. reordering DISTINCT keys to match input
> path's pathkeys, which can help reduce cost by avoiding unnecessary
> re-sort, or allowing us to use incremental-sort to save efforts.  For
> instance,

I've not caught up on the specifics of 0452b461b, but I just wanted to
highlight that there was some work done in [1] in this area.  It seems
Ankit didn't ever add that to a CF, so that might explain why it's
been lost.

Anyway, just pointing it out as there may be useful code or discussion
in the corresponding threads.

David

[1] https://postgr.es/m/da9425ae-8ff7-33d9-23b3-2a3eb605e106%40gmail.com



Re: Reordering DISTINCT keys to match input path's pathkeys

От
Richard Guo
Дата:

On Tue, Jan 23, 2024 at 5:03 PM David Rowley <dgrowleyml@gmail.com> wrote:
I've not caught up on the specifics of 0452b461b, but I just wanted to
highlight that there was some work done in [1] in this area.  It seems
Ankit didn't ever add that to a CF, so that might explain why it's
been lost.

Anyway, just pointing it out as there may be useful code or discussion
in the corresponding threads.

Thanks for pointing it out.  I looked at the patch there and noticed
several problems with it.

* That patch is incomplete and does not work as expected.  It at least
needs to modify truncate_useless_pathkeys() to account for DISTINCT
clause (I think this has been mentioned in that thread).

* That patch would not consider the origin DISTINCT pathkeys if it could
do some reordering, which is not great and can generate inefficient
plans.  For instance (after fixing the first problem)

create table t (a int, b int);
create index on t(a);

set enable_hashagg to off;
set enable_incremental_sort to off;
set enable_seqscan to off;

explain (costs off) select distinct b, a from t order by b, a;
                   QUERY PLAN
-------------------------------------------------
 Sort
   Sort Key: b, a
   ->  Unique
         ->  Sort
               Sort Key: a, b
               ->  Index Scan using t_a_idx on t
(6 rows)

Using DISTINCT pathkeys {b, a} is more efficient for this plan, because
only one Sort would be required.  But that patch is not able to do that,
because it does not consider the origin DISTINCT pathkeys after
reordering.

Thanks
Richard

Re: Reordering DISTINCT keys to match input path's pathkeys

От
Richard Guo
Дата:
cfbot reminds that this patch does not apply any more.  So I've rebased
it on master, and also adjusted the test cases a bit.

Thanks
Richard
Вложения

Re: Reordering DISTINCT keys to match input path's pathkeys

От
Richard Guo
Дата:
On Mon, Feb 5, 2024 at 11:18 AM Richard Guo <guofenglinux@gmail.com> wrote:
> cfbot reminds that this patch does not apply any more.  So I've rebased
> it on master, and also adjusted the test cases a bit.

This patch does not apply any more, so here is a new rebase, with some
tweaks to the comments.

Thanks
Richard

Вложения