Re: Parallel tuplesort (for parallel B-Tree index creation)

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Parallel tuplesort (for parallel B-Tree index creation)
Дата
Msg-id CAM3SWZTQHvmPTL-mZ-2gOfPQu3f=MLT0c4HXQPC5=+ESS42hjw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel tuplesort (for parallel B-Tree index creation)  (Claudio Freire <klaussfreire@gmail.com>)
Ответы Re: Parallel tuplesort (for parallel B-Tree index creation)  (Claudio Freire <klaussfreire@gmail.com>)
Список pgsql-hackers
On Thu, Sep 8, 2016 at 8:53 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> setup:
>
> create table lotsofitext(i text, j text, w text, z integer, z2 bigint);
> insert into lotsofitext select cast(random() * 1000000000.0 as text)
> || 'blablablawiiiiblabla', cast(random() * 1000000000.0 as text) ||
> 'blablablawjjjblabla', cast(random() * 1000000000.0 as text) ||
> 'blablabl
> awwwabla', random() * 1000000000.0, random() * 1000000000000.0 from
> generate_series(1, 10000000);
>
> timed:
>
> select count(*) FROM (select * from lotsofitext order by i, j, w, z, z2) t;
>
> Unpatched Time: 100351.251 ms
> Patched Time: 75180.787 ms
>
> That's like a 25% speedup on random input. As we say over here, rather
> badly translated, not a turkey's boogers (meaning "nice!")

Cool! What work_mem setting were you using here?

>> More than using "n" or "memtupcount" what I'm saying is to assert that
>> memtuples[imin] is inside the heap, which would catch the same errors
>> the original assert would, and more.
>>
>> Assert(imin < state->memtupcount)
>>
>> If you prefer.
>>
>> The original asserts allows any value of imin for memtupcount>1, and
>> that's my main concern. It shouldn't.
>
> So, for the assertions to properly avoid clobbering/reading out of
> bounds memory, you need both the above assert:
>
>  +                */
>  +               memtuples[i] = memtuples[imin];
>  +               i = imin;
>  +       }
>  +
>>+       Assert(imin < state->memtupcount);
>  +       memtuples[imin] = *newtup;
>  +}
>
> And another one at the beginning, asserting:
>
>  +       SortTuple  *memtuples = state->memtuples;
>  +       int             n,
>  +                               imin,
>  +                               i;
>  +
>>+       Assert(state->memtupcount > 0 && memtuples[0].tupindex == newtup->tupindex);
>  +
>  +       CHECK_FOR_INTERRUPTS();
>
> It's worth making that change, IMHO, unless I'm missing something.

You're supposed to just not call it with an empty heap, so the
assertions trust that much. I'll look into that.

Currently, producing a new revision of this entire patchset. Improving
the cost model (used when the parallel_workers storage parameter is
not specified within CREATE INDEX) is taking a bit of time, but hope
to have it out in the next couple of days.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Aggregate Push Down - Performing aggregation on foreign server
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_dump with tables created in schemas created by extensions