We can see that ExecCopySlot accounts for 24% of the CPU time spent inside ExecUnique, because of the palloc needed for the minimal tuples that the Unique node uses. By contrast, ExecCopySlot is only 6% of the time spent in ExecGroup, since the Group node uses virtual tuples.
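To illustrate why the slot type matters, here is a toy model in plain C. It is only a sketch and not PostgreSQL code: ToyMinimalSlot, ToyVirtualSlot, toy_copy_minimal, toy_copy_virtual and toy_alloc_count are all made-up names, and malloc stands in for palloc. It assumes by-value columns only (like the int4 column in the test queries), where a virtual-style copy needs no allocation at all.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only, NOT PostgreSQL's real structures. */
typedef uintptr_t ToyDatum;

/* "Minimal tuple" style: the row lives in a separately allocated blob. */
typedef struct ToyMinimalSlot
{
    size_t len;   /* size of the blob in bytes */
    void  *blob;  /* heap-allocated copy of the row, or NULL if empty */
} ToyMinimalSlot;

/* "Virtual" style: by-value columns sit in an array inside the slot. */
#define TOY_MAX_ATTS 4
typedef struct ToyVirtualSlot
{
    int      natts;                 /* number of columns in use */
    ToyDatum values[TOY_MAX_ATTS];  /* column values, stored by value */
} ToyVirtualSlot;

/* Counts allocations so the cost difference is visible. */
static long toy_alloc_count = 0;

/*
 * Copy src into dst by allocating a fresh blob on every call.
 * Assumes src holds a non-empty row. Returns 0 on success, -1 on
 * allocation failure.
 */
static int
toy_copy_minimal(ToyMinimalSlot *dst, const ToyMinimalSlot *src)
{
    void *fresh;

    assert(src->blob != NULL && src->len > 0);

    fresh = malloc(src->len);
    if (fresh == NULL)
        return -1;
    toy_alloc_count++;

    memcpy(fresh, src->blob, src->len);
    free(dst->blob);            /* release the previously stored row */
    dst->blob = fresh;
    dst->len = src->len;
    return 0;
}

/* Copy src into dst by copying the value array; no allocation needed. */
static void
toy_copy_virtual(ToyVirtualSlot *dst, const ToyVirtualSlot *src)
{
    assert(src->natts >= 0 && src->natts <= TOY_MAX_ATTS);

    dst->natts = src->natts;
    memcpy(dst->values, src->values, sizeof(ToyDatum) * (size_t) src->natts);
}
```

Calling toy_copy_minimal once per input row performs one allocation and one free per row, while toy_copy_virtual only copies a few machine words. That is roughly the kind of per-row overhead difference the profile numbers above point at.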
After the patch, the Unique node is slightly faster than the Group node (about 5% lower execution time in the runs below, 1209.9 ms vs. 1276.0 ms):
adb=# explain analyze select distinct a from t;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.094..1072.007 rows=3160493 loops=1)
   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.092..592.619 rows=3160493 loops=1)
         Heap Fetches: 0
 Planning Time: 0.203 ms
 Execution Time: 1209.940 ms
(5 rows)
adb=# explain analyze select a from t group by a;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Group  (cost=0.43..98761.06 rows=3160493 width=4) (actual time=0.074..1140.644 rows=3160493 loops=1)
   Group Key: a
   ->  Index Only Scan using t_pkey on t  (cost=0.43..90859.82 rows=3160493 width=4) (actual time=0.070..591.930 rows=3160493 loops=1)
         Heap Fetches: 0
 Planning Time: 0.193 ms
 Execution Time: 1276.026 ms
(6 rows)
I have added the current patch to the commitfest.