Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>
>> On Tue, Feb 26, 2008 at 12:39:29AM -0500, Tom Lane wrote:
>>
>>> BTW, what exactly was the use-case for this?
>>>
>> One use-case would be when you have to make some small change to the schema
>> while reloading it, one that's still compatible with the data format. Then
>> you'd dump the schema without indexes and the like, *edit* that file, and
>> then reload things. It's a lot easier to edit the file if it's not hundreds
>> of gigabytes.
>>
>
> This is a use-case for having switches that *extract* convenient subsets
> of a dump archive. It does not mandate having pg_dump emit multiple
> files. You could extract, say, the pre-data schema into a text SQL
> script, edit it, load it, then extract the data and the remaining script
> directly into the database from the dump file.
>
> In short, what I think we need here is just some more conveniently
> defined extraction filter switches than --schema-only and --data-only.
> There's no need for any fundamental change to pg_dump's architecture.
>
> Yes, I've read the subsequent discussion about a "directory" output
> format. I think it's a pointless complication --- or at least, that it's
> a performance hack rather than a functionality one, with no chance of
> any actual performance gain until we've parallelized pg_restore, and
> with zero existing evidence that any gain would be had even then.
>
> BTW, if we avoid fooling with the definition of the archive format,
> that also means that the extraction-switch patch should be relatively
> independent of parallelization work, so the work could proceed
> concurrently.
>
I agree that the two pieces of work are really independent. There are enough
reasons for splitting the schema output into pre-data and post-data sections
that we should do that forthwith.
cheers
andrew
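
For reference, here is a sketch of the workflow Tom describes, using the
section-extraction switches that later releases added to pg_dump/pg_restore
as --section=pre-data/data/post-data (an assumption relative to this thread;
the database and file names are placeholders):

    # Take a single custom-format archive as usual.
    pg_dump -Fc -f mydb.dump mydb

    # Extract only the pre-data schema (tables, types, functions) as a
    # text SQL script, edit it, and load it into the target database.
    pg_restore --section=pre-data -f pre-data.sql mydb.dump
    $EDITOR pre-data.sql
    psql -d newdb -f pre-data.sql

    # Restore the data and the remaining post-data objects (indexes,
    # constraints, triggers) directly from the same archive.
    pg_restore --section=data --section=post-data -d newdb mydb.dump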