Re: Inefficiency in parallel pg_restore with many tables

From: Andres Freund
Subject: Re: Inefficiency in parallel pg_restore with many tables
Msg-id: 20230715181916.qluqwiuh5ckg4f75@awork3.anarazel.de
In reply to: Inefficiency in parallel pg_restore with many tables (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Hi,

On 2023-07-15 13:47:12 -0400, Tom Lane wrote:
> I wonder if we could replace the sorted ready-list with a priority heap,
> although that might be complicated by the fact that pop_next_work_item
> has to be capable of popping something that's not necessarily the
> largest remaining job.  Another idea could be to be a little less eager
> to sort the list every time; I think in practice scheduling wouldn't
> get much worse if we only re-sorted every so often.

Perhaps we could keep track of where the newly inserted items are, and use
insertion sort or such when the number of new elements is much smaller than
the number of already-sorted elements?
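For illustration, a minimal C sketch of that idea, assuming the new items have been appended to the tail of an otherwise-sorted array; the function name and the use of plain ints in place of TocEntry priorities are made up for the example, and the real ready-list sorts descending by cost, which the comparison below follows:

```c
#include <stddef.h>

/*
 * Hypothetical sketch: arr[0..nsorted-1] is already sorted descending;
 * arr[nsorted..nsorted+nnew-1] holds newly appended, unsorted items.
 * When nnew is small relative to nsorted, insertion sort is cheap:
 * each new item only shifts the elements smaller than it.
 */
static void
insert_new_items(int *arr, size_t nsorted, size_t nnew)
{
    for (size_t i = nsorted; i < nsorted + nnew; i++)
    {
        int     key = arr[i];
        size_t  j = i;

        /* shift smaller elements right to keep descending order */
        while (j > 0 && arr[j - 1] < key)
        {
            arr[j] = arr[j - 1];
            j--;
        }
        arr[j] = key;
    }
}
```

This is O(nnew * nsorted) in the worst case, so it only wins over a full re-sort when nnew stays genuinely small between pops.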

As you say, a straight priority heap might not be easy. But we could just open
code it using two sorted arrays: one large, and one for recent additions that
still needs sorting. Occasionally the small array would be merged into the big
array, once it has grown large enough that sorting it becomes expensive.  We
could go for a heap of N>2 such arrays, but I doubt it would be worth much.
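A rough sketch of that two-array scheme in C, with invented names (TwoArrayHeap, MERGE_THRESHOLD) and plain ints standing in for work items; note it deliberately ignores the complication Tom raised, namely that pop_next_work_item must sometimes skip the largest item because of lock conflicts:

```c
#include <stdlib.h>

#define MERGE_THRESHOLD 64      /* made-up tuning constant */

/* Both arrays are kept sorted ascending; the maximum of the whole
 * structure is therefore at the tail of one of the two arrays. */
typedef struct TwoArrayHeap
{
    int    *big;                        /* large sorted array */
    int     nbig;
    int     small[MERGE_THRESHOLD];     /* recent additions, sorted */
    int     nsmall;
} TwoArrayHeap;

/* Two-way merge of the small array into a freshly allocated big array. */
static void
merge_small_into_big(TwoArrayHeap *h)
{
    int    *merged = malloc((h->nbig + h->nsmall) * sizeof(int));
    int     i = 0, j = 0, k = 0;

    while (i < h->nbig && j < h->nsmall)
        merged[k++] = (h->big[i] <= h->small[j]) ? h->big[i++] : h->small[j++];
    while (i < h->nbig)
        merged[k++] = h->big[i++];
    while (j < h->nsmall)
        merged[k++] = h->small[j++];

    free(h->big);
    h->big = merged;
    h->nbig = k;
    h->nsmall = 0;
}

/* Insert into the small array (cheap insertion sort); merge when full. */
static void
push(TwoArrayHeap *h, int v)
{
    int     i = h->nsmall++;

    while (i > 0 && h->small[i - 1] > v)
    {
        h->small[i] = h->small[i - 1];
        i--;
    }
    h->small[i] = v;

    if (h->nsmall == MERGE_THRESHOLD)
        merge_small_into_big(h);
}

/* Pop the overall maximum: the larger of the two array tails. */
static int
pop_max(TwoArrayHeap *h)
{
    if (h->nsmall > 0 &&
        (h->nbig == 0 || h->small[h->nsmall - 1] >= h->big[h->nbig - 1]))
        return h->small[--h->nsmall];
    return h->big[--h->nbig];
}
```

Pushes cost O(MERGE_THRESHOLD) at worst, pops are O(1), and the occasional merge is linear, which is the point: the big array is never fully re-sorted, only merged with a small sorted run.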

Greetings,

Andres Freund


