Sort optimizations: Making in-memory sort cache-aware

Поиск

Список

Период

Сортировка

От	Ankit Kumar Pandey
Тема	Sort optimizations: Making in-memory sort cache-aware
Дата	11 февраля 2023 г. 12:19:02
Msg-id	444ccfcc-06d6-0160-f662-9fa2075229ad@gmail.com обсуждение исходный текст
Ответы	Re: Sort optimizations: Making in-memory sort cache-aware
Список	pgsql-hackers

Дерево обсуждения

Hi all,

While working on sort optimization for window function, it was seen that 
performance of sort where

all tuples are in memory was bad when number of tuples were very large [1]

Eg: work_mem = 4 GB, sort on 4 int columns on table having 10 million 
tuples.


Issues we saw were as follows:

1. The comparetup function re-compares the first key again in case of 
tie-break.

2. Frequent cache misses


Issue #1 is being looked in separate patch. I am currently looking at #2.

Possible solution was to batch tuples into groups (which can fit into L3 
cache) before pushing them to sort function.

After looking at different papers on this (multi-Quicksort, memory-tuned 
quicksort, Samplesort and various distributed sorts),

although they look promising (especially samplesort), I would like to 
get more inputs as changes look bit too steep and

may or may not be in of scope of solving actual problem in hand.


Please let me know your opinions, do we really need to re-look at 
quicksort for this use-case or we can

perform optimization without major change in core sorting algorithm? Are 
we are open for trying new algorithms for sort?

Any suggestions to narrow down search space for this problem are welcomed.


[1] 
https://www.postgresql.org/message-id/CAApHDvqh+qOHk4sbvvy=Qr2NjPqAAVYf82oXY0g=Z2hRpC2Vmg@mail.gmail.com


Thanks,

Ankit

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Sort optimizations: Making in-memory sort cache-aware