Re: Using quicksort for every external sort run

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Using quicksort for every external sort run
Дата
Msg-id CAM3SWZQHJS=Bg9esOZo7V3nXTjEiAMyjVmjJyk-feLz13m6b8g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Using quicksort for every external sort run  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-hackers
On Sun, Dec 6, 2015 at 3:59 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> My column has the format of ABC-123-456-789-0
>
> The name-space identifier ("ABC-")  is the same in 99.99% of the
> cases.  And to date, as well as in my extrapolation, the first two
> digits of the numeric part are leading zeros and the third one is
> mostly 0,1,2.  So the first 8 bytes really have less than 2 bits worth
> of information.  So yeah, not surprising abbreviation was not useful.

I think that given you're using the "C" collation, abbreviation should
still go ahead. I posted a patch to do that, which I need to further
justify per Robert's request (currently, we do nothing special based
on collation). Abbreviation should help in surprisingly marginal
cases, since far fewer memory accesses will be required in the early
stages of the sort with only (say) 5 distinct abbreviated keys. Once
abbreviated comparisons start to not help at all (with quicksort, at
some partition), there's a good chance that the full keys can be
reused to some extent, before being evicted from CPU caches.

>> BTW, roughly what does this CREATE INDEX look like? Is it a composite
>> index, for example?
>
> Nope, just a single column index.  In the extrapolated data set, each
> distinct value shows up a couple hundred times on average.  I'm
> thinking of converting it to a btree_gin index once I've tested them a
> bit more, as the compression benefits are substantial.

Unfortunately, that cannot use tuplesort.c at all.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Using quicksort for every external sort run
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: Using quicksort for every external sort run