Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

From: Jon Nelson
Subject: Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT
Message-ID: CAKuK5J2R44SwGPyKJtrDZfGbWZ44CM1HHvhfzJP_ngz3MGdNWg@mail.gmail.com
In reply to: Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Tue, Jan 21, 2014 at 9:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> A rough summary of the patch follows:
>
>> - a GUC variable enables or disables this capability
>> - in nodeAgg.c, eliding duplicate tuples is enabled if the number of
>>   distinct columns is equal to the number of sort columns (and both are
>>   greater than zero).
>> - in createplan.c, eliding duplicate tuples is enabled if we are
>>   creating a unique plan which involves sorting first
>> - ditto planner.c
>> - all of the remaining changes are in tuplesort.c, which consist of:
>>   + a new macro, DISCARDTUP and a new structure member, discardtup, are
>>     both defined and operate similar to COMPARETUP, COPYTUP, etc...
>>   + in puttuple_common, when state is TSS_BUILDRUNS, we *may* simply
>>     throw out the new tuple if it compares as identical to the tuple at
>>     the top of the heap. Since we're already performing this comparison,
>>     this is essentially free.
>>   + in mergeonerun, we may discard a tuple if it compares as identical
>>     to the *last written tuple*. This is a comparison that did not take
>>     place before, so it's not free, but it saves a write I/O.
>>   + We perform the same logic in dumptuples
>
> [ raised eyebrow ... ]  And what happens if the planner drops the
> unique step and then the sort doesn't actually go to disk?

I'm not familiar enough with the code to be able to answer your
question with any sort of authority, but I believe that if the state
TSS_BUILDRUNS is never hit, then basically nothing new happens.
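
To make that concrete, here is a rough toy model of the behaviour summarized above. It is a sketch in Python, not the tuplesort.c code: the names dump_run, merge_runs, toy_sort and work_mem are made up for illustration, runs are built by sorting fixed-size chunks rather than with the real heap-based run building, and plain integers stand in for tuples. It only shows where duplicates would be dropped, and that nothing is dropped when the input never spills into runs.

```python
import heapq
from typing import Iterable, Iterator, List


def dump_run(chunk: List[int]) -> List[int]:
    """Sort one in-memory chunk and write it out as a run, skipping any
    value equal to the one just written (loosely analogous to dumptuples)."""
    run: List[int] = []
    for value in sorted(chunk):
        if run and run[-1] == value:
            continue  # duplicate of the last written value: elide it
        run.append(value)
    return run


def merge_runs(runs: List[List[int]]) -> Iterator[int]:
    """Merge sorted runs, comparing each value with the last one emitted
    (loosely analogous to the extra comparison in mergeonerun)."""
    have_last = False
    last = None
    for value in heapq.merge(*runs):
        if have_last and value == last:
            continue  # saves a write at the cost of one comparison
        yield value
        last, have_last = value, True


def toy_sort(values: Iterable[int], work_mem: int) -> List[int]:
    """Hypothetical stand-in for a sort that may or may not go external."""
    data = list(values)
    if len(data) <= work_mem:
        # Everything fits in memory: no runs are built, so no elision
        # happens and duplicates are still present in the output.
        return sorted(data)
    runs = [dump_run(data[i:i + work_mem])
            for i in range(0, len(data), work_mem)]
    return list(merge_runs(runs))
```

In this model, toy_sort([3, 1, 3], work_mem=10) stays in memory and keeps both 3s, while toy_sort([3, 1, 3, 2, 1, 2, 3], work_mem=3) spills into runs and comes back fully deduplicated. So in the toy at least, a caller cannot rely on the sort alone to remove duplicates unless it knows the sort went external.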

-- 
Jon
