Re: [HACKERS] maintenance_work_mem and CREATE INDEX time

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: [HACKERS] maintenance_work_mem and CREATE INDEX time
Дата
Msg-id CA+HiwqHM4G8meKARFYMATnXKPKz1DVR856Ot7y8kzeZZrsuLcQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] maintenance_work_mem and CREATE INDEX time  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-general
Hi Jeff,

On Tue, Jul 30, 2013 at 3:25 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Jul 23, 2013 at 10:56 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>> On Wed, Jul 24, 2013 at 3:20 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>> The heap structure used in external sorts is cache-unfriendly.  The
>>> bigger the heap used, the more this unfriendliness becomes apparent.
>>> And the bigger maintenance_work_mem, the bigger the heap used.
>>>
>>> The bigger heap also means you have fewer "runs" to merge in the
>>> external sort.  However, as long as the number of runs still fits in
>>> the same number of merge passes, this is generally not a meaningful
>>> difference.
>>
>> Does fewer runs mean more time (in whichever phase of external sort)?
>
> That's complicated.  In general fewer runs are faster, as the heap
> used at that stage is smaller.  But this difference is small.  If you
> can get the number of runs down to a level that needs fewer passes
> over the data, that will make things faster.  But this is rare.  If
> the sort isn't already being done in a single pass, then your sort
> must be huge or your working memory setting is pathologically tiny.
>
> There is a rough conservation of total heap layers between the two
> phases: the initial tuple heap, and the merge stage heap-of-tapes.
> Say for example that by increasing work_mem, you can increase the
> initial heap from 25 layers to 27 layers, while decreasing the merge
> phase heap from 5 layers to 3 layers.  The total number of comparisons
> for the entire sort will be about the same, but the comparisons across
> the 27 layer heap are much more likely to need to go to main RAM,
> rather than come from L3 cache (or whatever the cache level is).
>

If I my assumption that fewer runs mean longer runs is plausible, may
it be correct to think that performsort step (performsort_done -
performsort_starting) time increases when such longer runs are created
due to larger workMem?


--
Amit Langote


В списке pgsql-general по дате отправления:

Предыдущее
От: Klaus Ita
Дата:
Сообщение: Fwd: corrupted files
Следующее
От: Sergey Konoplev
Дата:
Сообщение: Re: Fastest Index/Algorithm to find similar sentences