Re: Proposal: More flexible backup/restore via pg_dump
From | Philip Warner
---|---
Subject | Re: Proposal: More flexible backup/restore via pg_dump
Date | |
Msg-id | 3.0.5.32.20000627003030.02573640@mail.rhyme.com.au
Responses | Re: Proposal: More flexible backup/restore via pg_dump (Giles Lean <giles@nemeton.com.au>)
| Re: Proposal: More flexible backup/restore via pg_dump (Bruce Momjian <pgman@candle.pha.pa.us>)
List | pgsql-hackers
At 10:17 26/06/00 +0200, Zeugswetter Andreas SB wrote:
>A problem I see with an index at file end is that you will need to read the
>file twice, and that may be very undesirable if e.g. the backup is on tape
>or a compressed file.

The proposal has actually come a fairly long way after extensive discussions with Tom Lane, and I have added the current plans at the end of this message.

The TOC-at-end problem is an issue I am trying to deal with; I am planning a 'custom' format in which only the large parts (data dumps) are compressed, avoiding the need to compress the entire file. This means that you would not need to uncompress the entire file to get to the TOC, or to restore just the schema. It also allows good random access to defns and data.

I'm also considering putting the dumped data at the end of the file, but this has issues when you want to restore table data before defining indexes, for example.

I must admit that I've been working on the assumption that people using PostgreSQL don't have multi-GB (compressed) database dumps, so that (in theory) a restore can be loaded onto disk from tape before being used. I know this is pretty evil, but it will cover 95% of users. Those people with huge backups will have to suffer tapes that go backwards and forwards a bit. From the details below, you will see that this is unavoidable.

Sanity check: does fseek work on tapes? If not, what is the correct way to read a particular block/byte from a file on a tape?

-----------------------------------------------------------

Updated Proposal:
-----------------

For the sake of argument, call the new utilities pg_backup and pg_restore.

pg_backup
---------

Dump schema [and data] in OID order (to try to make restores sequential, for when tar/tape storage is used).
Each dumped item has a TOC entry which includes the OID and description, and for those items for which we know some dependencies (functions for types & aggregates; types for tables; superclasses for classes - any more?), it will also dump the dependency OIDs.

Each object (table defn, table data, function defn, type defn, etc.) is dumped to a separate file/thing in the output file. The TOC entries go into a separate file/thing (probably only one file/thing for the whole TOC).

The output scheme will be encapsulated, and in the initial version will be a custom format (since I can't see an API for tar files), and a dump-to-a-directory format. Future use of tar, DB, PostgreSQL or even a Make file should not be excluded in the IO design. This last goal *may* not be achieved, but I don't see why it can't be at this stage. Hopefully someone with appropriate skills & motivation can do a tar archive 8-}.

The result of a pg_backup should be a single file with metadata and optional data, along with whatever dependency and extra data is available to pg_backup, or provided by the DBA.

pg_restore
----------

Reads a backup file and dumps SQL suitable for sending to psql. Options will include:

- No data (--no-data? -nd? -s?)
- No metadata (--no-schema? -ns? -d?)
- Specification of items to dump from an input file; this allows custom ordering AND custom selection of multiple items. Basically, I will allow the user to dump part of the TOC, edit it, and tell pg_restore to use the edited partial TOC. (--item-list=<file>? -l=<file>?)
- Dump TOC (--toc-only? -c?)

[Wish List]

- Data for a single table (--table=<name>? -t=<name>?)
- Defn/data for a single OID (--oid=<oid>? -o=<oid>?)
- User-defined dependencies. Allow the DB developer to specify once, for their DB, what the dependencies are, then use that file as a guide to the restore process. (--deps=<file>? -D=<file>?)

pg_restore will use the same custom IO routines to allow IO to tar/directory/custom files.
In the first pass, I will do custom file IO.

If a user selects to restore the entire metadata, then it will be dumped according to the default policy (OID order). If they select to specify the items from an input file, then the file ordering is used.

-------

Typical backup procedure:

    pg_backup mydb mydb.bkp

or *maybe*

    pg_backup mydb > mydb.bkp

BUT AFAIK, fseek does not work on STDOUT, and at the current time pg_backup will use fseek.

Typical restore procedure:

    pg_restore mydb mydb.bkp | psql

A user will be able to extract only the schema (-s), only the data (-d), a specific table (-t=name), or even edit the object order and selection via:

    pg_restore --dump-toc mydb.bkp > mytoc.txt
    vi mytoc.txt    {ie. reorder TOC elements as per known dependency problems}
    pg_restore --item-list=mytoc.txt mydb.bkp | psql

FWIW, I envisage the --dump-toc output to look like:

    ID; FUNCTION FRED(INT4)
    ID; TYPE MY_TYPE
    ID; TABLE MY_TABLE
    ID; DATA MY_TABLE
    ID; INDEX MY_TABLE_IX1
    ...etc.

so editing and reordering the dump plan should not be too onerous.

----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.C.N. 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371