Re: pg_dump additional options for performance

From: Simon Riggs
Subject: Re: pg_dump additional options for performance
Msg-id: 1204031111.4252.255.camel@ebony.site
In reply to: Re: pg_dump additional options for performance  ("Tom Dunstan" <pgsql@tomd.cc>)
Responses: Re: pg_dump additional options for performance
List: pgsql-hackers
On Tue, 2008-02-26 at 18:19 +0530, Tom Dunstan wrote:
> On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> >  > As a user I'd really prefer all of this to be much more transparent, and could
> >  > well imagine the -Fc format to be some kind of TOC + zip of table data + post
> >  > load instructions (organized per table), or something like this.
> >  > In fact just what you described, all embedded in a single file.
> >
> >  If it's in a single file then it won't perform as well as if it's in
> >  separate files. We can put separate files on separate drives. We can
> >  begin reloading one table while another is still unloading. The OS will
> >  perform readahead for us on separate files, whereas on one file it will
> >  look like random I/O, etc.
> 
> Yeah, writing multiple unknown-length streams to a single file in
> parallel is going to be all kinds of painful, and this use case seems
> to be the biggest complaint against a zip file kind of approach. I
> didn't know about the custom file format when I suggested the zip file
> one yesterday*, but a zip or equivalent has the major benefit of
> allowing the user to do manual inspection / tweaking of the dump
> because the file format is one that can be manipulated by standard
> tools. And zip wins over tar because it's indexed - if you want to
> extract just the schema and hack on it you don't need to touch your
> multi-GBs of data.
> 
> Perhaps a compromise: we specify a file system layout for table data
> files, pre/post scripts and other metadata that we want to be made
> available to pg_restore. By default, it gets dumped into a zip file /
> whatever, but a user who wants to get parallel unloads can pass a flag
> that tells pg_dump to stick it into a directory instead, with exactly
> the same file layout. Or how about this: if the filename given to
> pg_dump is a directory, spit out files in there, otherwise
> create/overwrite a single file.
> 
> While it's a bit fiddly, putting data on separate drives would then
> involve something like symlinking the tablename inside the dump dir
> off to an appropriate mount point, but that's probably not much worse
> than running n different pg_dump commands specifying different files.
> Heck, if you've got lots of data and want very particular behavior,
> you've got to specify it somehow. :)
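
The compromise proposed above (one file layout, written either into a directory or into a single zip archive depending on what the target path is) could be sketched roughly as follows. This is only an illustration under assumptions: the entry names such as `schema/pre.sql` and `data/orders.dat` and the helpers `write_dump` and `read_entry` are hypothetical, not part of pg_dump or pg_restore.

```python
import os
import zipfile


def write_dump(target, entries):
    """Write entries (relative path -> bytes) using one shared layout.

    If `target` is an existing directory, each entry becomes a file in it;
    otherwise a single zip archive is created (or overwritten) at `target`.
    """
    if os.path.isdir(target):
        for rel, data in entries.items():
            path = os.path.join(target, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
    else:
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
            for rel, data in entries.items():
                zf.writestr(rel, data)


def read_entry(target, rel):
    """Read one entry without touching the others.

    For a zip archive this relies on the archive's central directory,
    so pulling out the schema does not require reading the table data.
    """
    if os.path.isdir(target):
        with open(os.path.join(target, rel), "rb") as f:
            return f.read()
    with zipfile.ZipFile(target) as zf:
        return zf.read(rel)
```

Because both branches share the same relative paths, a reader only needs to know whether it was handed a directory or an archive. In the directory case, a per-table file could also be a symlink to another mount point, as suggested above.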

Separate files seem much simpler...
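
For comparison, the "n different pg_dump commands" approach mentioned in the quoted text might look something like the sketch below, with one data-only dump per table written to its own file so the files can sit on different drives. The database name, table names and paths are placeholders, and `dump_commands` and `run_parallel` are hypothetical helpers. Note also that each pg_dump run takes its own snapshot, so the resulting files are only mutually consistent if nothing is writing to the database in the meantime.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def dump_commands(dbname, table_to_path):
    """Build one pg_dump command line per table (data only, one file each)."""
    return [
        ["pg_dump", "--data-only", "-t", table, "-f", path, dbname]
        for table, path in table_to_path.items()
    ]


def run_parallel(commands, workers=4):
    """Run the commands concurrently and return their exit codes in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

A call such as `run_parallel(dump_commands("mydb", {"orders": "/mnt/disk1/orders.sql", "items": "/mnt/disk2/items.sql"}))` would then unload both tables at once, each to its own drive.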

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com 


