Re: cost_sort() may need to be updated

From: Peter Geoghegan
Subject: Re: cost_sort() may need to be updated
Msg-id: CAM3SWZTD6mXXf+zcLA_CsLnV2yRCVrntE0p0mTMeNVgkGvURNw@mail.gmail.com
In reply to: Re: cost_sort() may need to be updated  (Tom Lane <tgl@sss.pgh.pa.us>)
Replies: Re: cost_sort() may need to be updated  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Sun, Sep 11, 2016 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I think that we *can* refine this guess, and should, because random
>> I/O is really quite unlikely to be a large cost these days (I/O in
>> general often isn't a large cost, actually). More fundamentally, I
>> think it's a problem that cost_sort() thinks that external sorts are
>> far more expensive than internal sorts in general. There is good
>> reason to think that that does not reflect the reality. I think we can
>> expect external sorts to be *faster* than internal sorts with
>> increasing regularity in Postgres 10.
>
> TBH, if that's true, haven't you broken something?

It's possible for external sorts to be faster some of the time because
the memory access patterns can be more cache efficient: smaller runs
are better when tuples that are scattered across memory have to be
accessed in sorted order. More importantly, the sort can start
returning tuples earlier in the common case where a final on-the-fly
merge can be performed. In principle, you could adapt internal sorts
to have the same advantages, but that hasn't happened and probably
won't. Finally, external sort I/O costs grow linearly, whereas CPU
costs grow in a linearithmic fashion, so the CPU costs will eventually
come to dominate. We can hide the latency of the I/O pretty well, too,
with asynchronous I/O.
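
To make the linear-versus-linearithmic point concrete, here is a toy
sketch in Python. The constants and function names are made up for
illustration and are not what cost_sort() uses. It also assumes a
single merge pass, so that I/O stays linear in the input size.

```python
import math

# Hypothetical per-unit costs, chosen only for illustration; these are
# not PostgreSQL's planner constants.
IO_COST_PER_TUPLE = 1.0          # write each tuple to a run once, read it back once
CPU_COST_PER_COMPARISON = 0.05


def io_cost(ntuples):
    """Linear in the input: assumes a single merge pass over all runs."""
    return IO_COST_PER_TUPLE * ntuples


def cpu_cost(ntuples):
    """Linearithmic: roughly n * log2(n) comparisons."""
    if ntuples < 2:
        return 0.0
    return CPU_COST_PER_COMPARISON * ntuples * math.log2(ntuples)


def cpu_share(ntuples):
    """Fraction of the total toy cost that is CPU rather than I/O."""
    cpu = cpu_cost(ntuples)
    return cpu / (cpu + io_cost(ntuples))


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9, 10**12):
        print(f"{n:>17,d} tuples: CPU share of total {cpu_share(n):.2f}")
```

In this model the ratio of CPU cost to I/O cost is proportional to
log2(n), so the CPU share keeps rising as the input grows. With these
made-up constants it passes one half once n exceeds 2^20. Where the
crossover really sits depends entirely on the constants, which is part
of why this is hard to model.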

I'm not arguing that cost_sort() should think that external sorts are
cheaper under any circumstances, since all of this is very hard to
model. I only mention this because it illustrates nicely that
cost_sort() has the wrong idea.

-- 
Peter Geoghegan


