Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

From Jeremy Harris
Subject Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT
Date
Msg-id 52E03607.2020501@wizmail.org
In reply to PoC: Duplicate Tuple Elidation during External Sort for DISTINCT  (Jon Nelson <jnelson+pgsql@jamponi.net>)
List pgsql-hackers
On 22/01/14 03:16, Jon Nelson wrote:
> Greetings -hackers:
>
> I have worked up a patch to PostgreSQL which elides tuples during an
> external sort. The primary use case is when sorted input is being used
> to feed a DISTINCT operation. The idea is to throw out tuples that
> compare as identical whenever it's convenient, predicated on the
> assumption that even a single I/O is more expensive than some number
> of (potentially extra) comparisons.  Obviously, this is where a cost
> model comes in, which has not been implemented. This patch is a
> work-in-progress.


Dedup-in-sort is also done by my WIP internal merge sort, and
extended (in much the same ways as Jon's) to the external merge.
https://github.com/j47996/pgsql_sorb
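
As a rough illustration of dedup during a merge (a minimal sketch only, not code from either patch), the following merges two already-sorted runs and drops any value equal to the one most recently emitted. It assumes plain int keys in place of tuples and in-memory arrays in place of tapes; the name merge_dedup is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Merge two ascending runs a[0..na) and b[0..nb) into out, eliding any
 * value that compares equal to the value most recently emitted.
 * out must have room for na + nb values.
 * Returns the number of values actually written.
 */
size_t
merge_dedup(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(a != NULL && b != NULL && out != NULL);

    while (i < na || j < nb)
    {
        int next;

        /* Take the smaller head; prefer run a on ties or when b is empty. */
        if (j >= nb || (i < na && a[i] <= b[j]))
            next = a[i++];
        else
            next = b[j++];

        /* One extra comparison here avoids writing a duplicate. */
        if (n == 0 || out[n - 1] != next)
            out[n++] = next;
    }
    return n;
}
```

Because both inputs are sorted, equal values are always adjacent in the merged stream, so comparing against the last emitted value is enough to remove every duplicate.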


I've not done a cost model either, but the dedup capability is
exposed from tuplesort.c to the executor, and downstream Unique
nodes are removed.

I've not yet worked out how to eliminate upstream hashagg nodes,
which test results suggest would be worthwhile.

-- 
Cheers,   Jeremy


