Re: Using quicksort for every external sort run

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Using quicksort for every external sort run
Дата
Msg-id CAM3SWZQfU3Y-Zs29u4x5wVK39pb3b2WbcPmTPLrLeC4zen=e7w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Using quicksort for every external sort run  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
On Wed, Mar 16, 2016 at 6:42 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> - I think that batchmemtuples() is somewhat weird.  Normally,
>> grow_memtuples() doubles the size of the array each time it's called.
>> So if you somehow called this function when you still had lots of
>> memory available, it would just double the size of the array.
>> However, I think the expectation is that it's only going to be called
>> when availMem is less than half of allowedMem, in which case we're
>> going to get the special "last increment of memtupsize" behavior,
>> where we expand the memtuples array by some multiple between 1.0 and
>> 2.0 based on allowedMem/memNowUsed.
>
> That's right. It might be possible for the simple doubling behavior to
> happen under artificial conditions instead, for example when we have
> enormous individual tuples, but if that does happen it's still
> correct. I just didn't think it was worth worrying about giving back
> more memory in such extreme edge-cases.

Come to think of it, maybe the pass-by-value datum sort case should
also call batchmemtuples() too (or something similar). If you look at
how beginmerge() is called, you'll see that that doesn't happen.

Obviously this case is not entitiled to a "memtupsize *
STANDARDCHUNKHEADERSIZE" refund, since of course there never was any
overhead like that at any point. And, obviously this case has no need
for batch memory at all. However, it is entitled to get a refund for
non-used tapes (accounted for, but, it turns out, never allocated
tapes). It should then get the benefit of that refund by way of
growing memtuples through a similar "final, honestly, I really mean it
this time" call to grow_memtuples().

So, while the "memtupsize * STANDARDCHUNKHEADERSIZE refund" part
should still be batch-specific (i.e. used for the complement of
tuplesort cases, never the datum pass-by-val case), the new
grow_memtuples() thing should always happen with external sorts.

The more I think about it, the more I wonder if we should commit
something like the debugging patch 0004-* (enabled only when
trace_sort = on, of course). Close scrutiny of what tuplesort.c is
doing with memory is important.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Gendered language in source
Следующее
От: "Joshua D. Drake"
Дата:
Сообщение: Make primnodes.h gender neutral