Re: Use virtual tuple slot for Unique node

Поиск
Список
Период
Сортировка
От Denis Smirnov
Тема Re: Use virtual tuple slot for Unique node
Дата
Msg-id 527984DD-4215-4A0D-ABB0-B34A9C39214E@gmail.com
обсуждение исходный текст
Ответ на Re: Use virtual tuple slot for Unique node  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Use virtual tuple slot for Unique node  (Denis Smirnov <darthunix@gmail.com>)
Список pgsql-hackers
Here is the example (checked on the current master branch with release build + I've made about 10 runs for each explain analyze to get repeatable results)

Before the patch:

adb=# create table t(a int, primary key(a));


adb=# insert into t select random() * 5000000

from generate_series(1, 5000000)

on conflict do nothing;


adb=# explain analyze select a from t group by a;

                                                              QUERY PLAN                                                               

---------------------------------------------------------------------------------------------------------------------------------------

 Group  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.085..1225.139 rows=3160493 loops=1)

   Group Key: a

   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.081..641.567 rows=3160493 loops=1)

         Heap Fetches: 0

 Planning Time: 0.188 ms

 Execution Time: 1370.027 ms

(6 rows)




adb=# explain analyze select distinct a from t;

                                                              QUERY PLAN                                                               

---------------------------------------------------------------------------------------------------------------------------------------

 Unique  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.135..1525.704 rows=3160493 loops=1)

   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.130..635.742 rows=3160493 loops=1)

         Heap Fetches: 0

 Planning Time: 0.273 ms

 Execution Time: 1660.857 ms

(5 rows)



We can see that ExecCopySlot occupies 24% of the CPU inside ExecUnique function (thanks to palloc in Unique’s minimal tuples). On the other hand ExecCopySlot is only 6% of the ExecGroup function (we use virtual tuples in Group node).


After the patch Unique node works a little bit faster then the Group node:


adb=# explain analyze select distinct a from t;

                                                              QUERY PLAN                                                               

---------------------------------------------------------------------------------------------------------------------------------------

 Unique  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.094..1072.007 rows=3160493 loops=1)

   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.092..592.619 rows=3160493 loops=1)

         Heap Fetches: 0

 Planning Time: 0.203 ms

 Execution Time: 1209.940 ms

(5 rows)


adb=# explain analyze select a from t group by a;

                                                              QUERY PLAN                                                               

---------------------------------------------------------------------------------------------------------------------------------------

 Group  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.074..1140.644 rows=3160493 loops=1)

   Group Key: a

   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.070..591.930 rows=3160493 loops=1)

         Heap Fetches: 0

 Planning Time: 0.193 ms

 Execution Time: 1276.026 ms

(6 rows)


I have added current patch to the commitfest.



31 авг. 2023 г., в 04:59, David Rowley <dgrowleyml@gmail.com> написал(а):

On Thu, 31 Aug 2023 at 05:37, Денис Смирнов <darthunix@gmail.com> wrote:
I have inspected the performance of the GROUP BY and DISTINCT queries for the sorted data streams and found out, that Group node (produced by GROUP BY) works faster then the Unique node (produced by DISTINCT).  The flame graph should out the reason - Unique palloc`s tuples for the result slot while the Group node doesn’t.

I wonder, why do we use minimal tuples for the Unique node instead of the virtual ones? It looks like there is no actual reason for that as Unique doesn’t make any materialization.

It would be good to see example queries and a demonstration of the
performance increase. I'm not disputing your claims, but showing some
performance numbers might catch the eye of a reviewer more quickly.

You should also add this to the September commitfest at
https://commitfest.postgresql.org/44/

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Replace some cstring_to_text to cstring_to_text_with_len