Re: [PERFORM] Slow BLOBs restoring

From Robert Haas
Subject Re: [PERFORM] Slow BLOBs restoring
Date
Msg-id AANLkTin2uz5rn5a6dVEXCbkFyj-87=c1Q41KiCWcbUWP@mail.gmail.com
In response to Re: [PERFORM] Slow BLOBs restoring (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Dec 9, 2010 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> One fairly simple, if ugly, thing we could do about this is skip calling
>> reduce_dependencies during the first loop if the TOC object is a blob;
>> effectively assuming that nothing could depend on a blob.  But that does
>> nothing about the point that we're failing to parallelize blob
>> restoration.  Right offhand it seems hard to do much about that without
>> some changes to the archive representation of blobs.  Some things that
>> might be worth looking at for 9.1:
>
>> * Add a flag to TOC objects saying "this object has no dependencies",
>> to provide a generalized and principled way to skip the
>> reduce_dependencies loop.  This is only a good idea if pg_dump knows
>> that or can cheaply determine it at dump time, but I think it can.
>
> I had further ideas about this part of the problem.  First, there's no
> need for a file format change to fix this: parallel restore is already
> groveling over all the dependencies in its fix_dependencies step, so it
> could count them for itself easily enough.  Second, the real problem
> here is that reduce_dependencies processing is O(N^2) in the number of
> TOC objects.  Skipping it for blobs, or even for all dependency-free
> objects, doesn't make that very much better: the kind of people who
> really need parallel restore are still likely to bump into unreasonable
> processing time.  I think what we need to do is make fix_dependencies
> build a reverse lookup list of all the objects dependent on each TOC
> object, so that the searching behavior in reduce_dependencies can be
> eliminated outright.  That will take O(N) time and O(N) extra space,
> which is a good tradeoff because you won't care if N is small, while if
> N is large you have got to have it anyway.
>
> Barring objections, I will do this and back-patch into 9.0.  There is
> maybe some case for trying to fix 8.4 as well, but since 8.4 didn't
> make a separate TOC entry for each blob, it isn't as exposed to the
> problem.  We didn't back-patch the last round of efficiency hacks in
> this area, so I'm thinking it's not necessary here either.  Comments?

Ah, that sounds like a much cleaner solution.
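
For illustration only, the reverse-lookup idea quoted above might be sketched roughly like this. It is a simplified Python model, not the actual pg_restore code: the names build_reverse_deps and restore_order, and the dict-of-lists representation of TOC dependencies, are hypothetical. It assumes every dependency refers to an object that is present in the archive.

```python
from collections import defaultdict, deque


def build_reverse_deps(deps):
    """Build the reverse lookup list: object id -> ids that depend on it.

    `deps` maps each (hypothetical) TOC object id to the list of ids it
    depends on. One pass over all dependency entries is enough.
    """
    rev = defaultdict(list)
    for obj, needed in deps.items():
        for dep in needed:
            rev[dep].append(obj)
    return rev


def restore_order(deps):
    """Return an order in which objects could be restored.

    When an object finishes, only its recorded dependents are visited,
    instead of rescanning every pending object for a matching dependency.
    """
    # Number of not-yet-restored dependencies per object.
    remaining = {obj: len(needed) for obj, needed in deps.items()}
    rev = build_reverse_deps(deps)

    # Objects with no dependencies are ready immediately.
    ready = deque(obj for obj, count in remaining.items() if count == 0)
    order = []
    while ready:
        done = ready.popleft()
        order.append(done)
        # Direct lookup of dependents replaces the search over all objects.
        for dependent in rev.get(done, ()):
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)
    return order


if __name__ == "__main__":
    # Objects 1 and 4 have no dependencies; 3 depends on both 1 and 2.
    toc = {1: [], 2: [1], 3: [1, 2], 4: []}
    print(restore_order(toc))
```

In this sketch the total work is proportional to the number of objects plus the number of dependency entries, at the cost of the extra reverse list, which is the tradeoff described in the quoted message.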

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

