Обсуждение: Support loser tree for k-way merge

Поиск
Список
Период
Сортировка

Support loser tree for k-way merge

От
"cca5507"
Дата:
Hi hackers,

Background
==========
Now we use 'heap' during the k-way merge, it's O(n log k). The 'loser tree' is also O(n log k), but
it's usually has fewer comparisons than the 'heap'. When the tuple comparator is complex, the
'loser tree' can significantly speed up the k-way merge.

Test
====
With the WIP patch(v1-0001), I got a 3% ~ 13%(different work_mem) speed up in the following test case:

SET max_parallel_workers_per_gather = 0;
CREATE UNLOGGED TABLE t AS SELECT generate_series(1, 20000000) AS a, md5(random()::text) AS b;
create extension if not exists pg_prewarm;
select pg_prewarm('t');

SET enable_loser_tree = OFF;
# SET work_mem = '4MB'; ('8MB' '16MB' '32MB' '64MB' ...)
explain analyze select * from t order by b;

SET enable_loser_tree = ON;
# SET work_mem = '4MB'; ('8MB' '16MB' '32MB' '64MB' ...)
explain analyze select * from t order by b;

Open questions
==============
1) Now I add a GUC 'enable_loser_tree' to control the use of loser tree, maybe we should
decide whether to use the 'loser tree' based on the value of 'k', the complexity of tuple
comparators or just always use the 'loser tree'?

Looking forward to your reply and comment.

--
Regards,
ChangAo Chen

Вложения

Re: Support loser tree for k-way merge

От
Heikki Linnakangas
Дата:
On 03/12/2025 13:48, cca5507 wrote:
> With the WIP patch(v1-0001), I got a 3% ~ 13%(different work_mem) speed up in the following test case:

Nice speedup!

> 1) Now I add a GUC 'enable_loser_tree' to control the use of loser tree, maybe we should
> decide whether to use the 'loser tree' based on the value of 'k', the complexity of tuple
> comparators or just always use the 'loser tree'?

What is the worst case scenario for the loser tree, where the heap is 
faster? How big is the difference?

- Heikki




Re: Support loser tree for k-way merge

От
"cca5507"
Дата:
Hi Heikki,

> What is the worst case scenario for the loser tree, where the heap is 
> faster? How big is the difference?

In tuplesort_heap_replace_top(), it has 2 comparisons each level, but it can early return
if the parent less than both child.

In tuplesort_loser_tree_adjust(), it has 1 comparison each level, but it can't early return.

So on specific data, the heap may be better than the loser tree. But I think the possibility
is very small.

--
Regards,
ChangAo Chen

Re: Support loser tree for k-way merge

От
"cca5507"
Дата:
> So on specific data, the heap may be better than the loser tree. But I think the possibility
> is very small.

For example, a column contains all the same values, so the heap can always early return
when replacing top.

--
Regards,
ChangAo Chen

Re: Support loser tree for k-way merge

От
Sami Imseih
Дата:
Hi,

Thanks for raising this.

> Now we use 'heap' during the k-way merge, it's O(n log k). The 'loser tree' is also O(n log k), but
> it's usually has fewer comparisons than the 'heap'. When the tuple comparator is complex, the
> 'loser tree' can significantly speed up the k-way merge.

I was playing around with this and I do also see the performance
benefits that you
mention with loser tree.

```
DROP TABLE IF EXISTS t;
CREATE TABLE t (
    col1 INT,
    col2 INT
);


INSERT INTO t (col1, col2)
SELECT
    (i % 10000000),
    (i % 100)
FROM generate_series(1, 10000000) AS s(i);
```

Using something like the above, testing an "external merge" on the
high cardinality
column there is a bit of an improvement with loser tree, but there is
a bit of a loss on
low-cardinality.

In general though, I see a bit of an issue with allowing a GUC to
change strategies. What happens if there are multiple "external merge"
nodes and they can each benefit from different strategies? Maybe that's
not an issue since all the other enable_ GUCs have that problem, but
it is something to consider.

Can we drive the decision for what to do based on optimizer
stats, i.e. n_distinct and row counts? Not sure what the calculation would
be specifically, but something else to consider.

We can still provide the GUC to  override the optimizer decisions,
but at least the optimizer, given up-to-date stats, may get it right most
of the time.

--
Sami Imseih
Amazon Web Services (AWS)