Re: 8.3 / 8.2.6 restore comparison

Поиск

Список

Период

Сортировка

От	Joshua D. Drake
Тема	Re: 8.3 / 8.2.6 restore comparison
Дата	24 февраля 2008 г. 16:47:42
Msg-id	47C1ADA3.60104@commandprompt.com обсуждение исходный текст
Ответ на	Re: 8.3 / 8.2.6 restore comparison (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: 8.3 / 8.2.6 restore comparison (Jeff Davis <pgsql@j-davis.com>)
Список	pgsql-hackers

Дерево обсуждения

Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> At some point, I think we have to bite the bullet and find a way to use 
>> multiple CPUs for a single load. I don't have any good ideas or plans 
>> for that, but hopefully someone does.
> 
> As already mentioned upthread, we could do that today, with zero backend
> changes, by making pg_restore drive multiple sessions.  Now there are
> scenarios where this wouldn't help too much --- eg, a database with only
> one enormous table.  We couldn't parallelize the loading of that table,
> although we could parallelize creation of its indexes.  But for an
> example such as JD is complaining about, we ought to have no trouble
> thrashing his disks into the ground ;-)

Bring it on ! but I would note that with the current pg_restore I was 
not able to trash my disks. I only used four backends (I could have used 
8) but even with that, I was only doing ~ 45M a second. So if I double, 
I still have bandwidth. It would still be an huge improvement though.

> 
> What you would need for this is an -Fc or -Ft dump, because a plain
> script output from pg_dump doesn't carry any dependency information,

I would focus on -Fc. With the limitations of -Ft this would be a good 
way to start phasing -Ft out.

> much less any index of where in the file different bits of data are.
> Just armwaving, I envision a multiprocess pg_restore like this:
> 
>     * one controller process that tracks the not-yet-loaded
>       TOC items in the dump
>     * N child processes that just transmit a selected TOC item
>       to a connected backend, and then ask the controller
>       what to do next

> Most likely, the bottleneck with this sort of thing would be multiple
> parallel reads from the pg_dump archive file.  Possibly the controller
> process could be taught to schedule COPY and CREATE INDEX operations
> so that not too many processes are trying to read lots of archive
> data at the same time.

A less hacker and more DBA bottleneck will be to limit the number of 
backends being created for restore. We don't really want to have more 
than one backend per CPU, otherwise we just start switching.

Sincerely,

Joshua D. Drake

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Hannes Dorbath
Дата: 24 февраля 2008 г., 16:41:20
Сообщение: Re: 8.3 / 8.2.6 restore comparison

Следующее

От: Tom Lane
Дата: 24 февраля 2008 г., 16:57:37
Сообщение: Re: Behaviour of rows containg not-null domains in plpgsql

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: 8.3 / 8.2.6 restore comparison

Предыдущее

Следующее