Обсуждение: Creating a clean database
Hi All, I have finished a testing phase and I would like to create a production database. My thought is to use pg_dump to create a backup of the data dictionary, create a new database using pgadminIII or psql and then use something (pg_restore or psql or ?) to rebuild all of the database objects. I am hoping that a technique is available that will allow me to get all new oids and reset all sequences. Suggestions will be appreciated. -- Kind Regards, Keith
On Mon, Jan 03, 2005 at 09:13:56PM -0500, Keith Worthington wrote: > I have finished a testing phase and I would like to create a production > database. My thought is to use pg_dump to create a backup of the data > dictionary, create a new database using pgadminIII or psql and then use > something (pg_restore or psql or ?) to rebuild all of the database > objects. I am hoping that a technique is available that will allow me > to get all new oids and reset all sequences. Suggestions will be > appreciated. Have you looked at the documentation for pg_dump, in particular the --schema-only and --data-only options? If those won't work then please be more specific about what you're trying to do. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
Michael Fuhr wrote: >On Mon, Jan 03, 2005 at 09:13:56PM -0500, Keith Worthington wrote: > > > >>I have finished a testing phase and I would like to create a production >>database. My thought is to use pg_dump to create a backup of the data >>dictionary, create a new database using pgadminIII or psql and then use >>something (pg_restore or psql or ?) to rebuild all of the database >>objects. I am hoping that a technique is available that will allow me >>to get all new oids and reset all sequences. Suggestions will be >>appreciated. >> >> > >Have you looked at the documentation for pg_dump, in particular >the --schema-only and --data-only options? If those won't work >then please be more specific about what you're trying to do. > > Yes I have and I believe I understand how to use pg_dump for the most part. The last time I used pg_dump and pg_restore to create a new database. The procedure went something like this: 1) pg_dump 2) psql create database testdb 3) pg_restore This seems to have the affect of keeping OIDs. Are OIDs universal to an installation or specific to a database? I thought that when I created the new database the OIDs would start over at 0 but they didn't appear to do so. I will check later but I am not sure whether or not the sequences were reset to 1 or not either. My desire is to simply create a clean database. I have read about OID counter wraparound and how that can be bad but I do not yet understand the implications of this issue. I suspect that I need to use pg_dump to output the SQL commands necessary to create the data dictionary and then run these commands via psql. Do I just cat the file into psql? i.e. cat data_dictionary.sql > psql -database NEWDB -username postgres -- Kind Regards, Keith
On Tue, Jan 04, 2005 at 07:15:56AM -0500, Keith Worthington wrote: > > 1) pg_dump > 2) psql create database testdb > 3) pg_restore It might be useful to see the exact pg_dump and pg_restore commands you ran (all options and arguments). > This seems to have the affect of keeping OIDs. How did you determine this? I'd expect that to be the case if you used pg_dump's --oids option (or -o), but not otherwise. > Are OIDs universal to an installation or specific to a database? > I thought that when I created the new database the OIDs would start > over at 0 but they didn't appear to do so. See the "System Columns" section in the "Data Definition" chapter of the documentation: OIDs are 32-bit quantities and are assigned from a single cluster-wide counter. In a large or long-lived database, it is possible for the counter to wrap around. Hence, it is bad practice to assume that OIDs are unique, unless you take steps to ensure that this is the case. > I will check later but I am not sure whether or not the sequences > were reset to 1 or not either. If you did a restore, why would you want or expect the sequences to be reset to 1? That would contradict the purpose of a restore. > My desire is to simply create a clean database. What do you mean by a "clean" database? How would a "clean" restored database differ from the original database that you dumped, and why would you want those differences? What problem are you trying to solve? > I have read about OID counter wraparound and how that can be bad > but I do not yet understand the implications of this issue. OID wraparound means that you can't depend on OIDs being unique. Code that makes that assumption could break. Understand the difference between OID wraparound and transaction ID wraparound -- see "Preventing transaction ID wraparound failures" in the "Routine Database Maintenance Tasks" chapter for a discussion of the latter. > I suspect that I need to use pg_dump to output the SQL commands > necessary to create the data dictionary and then run these commands via > psql. Do I just cat the file into psql? i.e. > cat data_dictionary.sql > psql -database NEWDB -username postgres You can use "cat file | psql" but a few years ago that would have won you a "Useless Use of Cat Award": http://laku19.adsl.netsonic.fi/era/unix/award.html Other ways are "psql < file" and "psql -f file". See the psql documentation and read your shell's documentation for an explanation of input and output redirection. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
* Michael Fuhr <mike@fuhr.org> [040105, 10:19]: > On Tue, Jan 04, 2005 at 07:15:56AM -0500, Keith Worthington wrote: > > > > [...] > > > My desire is to simply create a clean database. > > What do you mean by a "clean" database? How would a "clean" restored > database differ from the original database that you dumped, and why > would you want those differences? What problem are you trying to > solve? > Sorry to be intrusive :-) May be he simply would like to have a brand new db, where record 1 has OID no. 1, rec. 2 OID no. 2 and so on, which I fear is not possible if he also wants to recover whatever data has already been registered in the db ... Regards, Ennio. -- [Perche' usare Win$ozz (dico io) se ..."anche uno sciocco sa farlo. \\?// Fa' qualche cosa di cui non sei capace!" (diceva Henry Miller) ] (°|°) [Why use Win$ozz (I say) if ... "even a fool can do that. )=( Do something you aren't good at!" (as Henry Miller used to say) ]
> On Tue, Jan 04, 2005 at 07:15:56AM -0500, Keith Worthington wrote: > > > > 1) pg_dump > > 2) psql create database testdb > > 3) pg_restore > > It might be useful to see the exact pg_dump and pg_restore commands > you ran (all options and arguments). > The dump command was: pg_dump IPADB -Ft -v -U postgres -f ${IPA_TMP_DIR}/TEST.tar Then I created a new database called IPA_TEST with pgadmin3 and finally I restored the database with: pg_restore --dbname=IPA_TEST --schema-only --superuser=postgres --verbose -U postgres TEST.tar > > > This seems to have the affect of keeping OIDs. > > How did you determine this? I'd expect that to be the case if > you used pg_dump's --oids option (or -o), but not otherwise. Well, I made what was probably a foolish assumption. When the database was restored the OIDs kept increasing rather than starting at some low value (0?). > > > Are OIDs universal to an installation or specific to a database? > > I thought that when I created the new database the OIDs would start > > over at 0 but they didn't appear to do so. > > See the "System Columns" section in the "Data Definition" chapter > of the documentation: > > OIDs are 32-bit quantities and are assigned from a single > cluster-wide counter. In a large or long-lived database, it > is possible for the counter to wrap around. Hence, it is bad > practice to assume that OIDs are unique, unless you take steps > to ensure that this is the case. > Okay I read that section over. So OIDs are universal to an installation and not specific to each database. That was something that I didn't understand. For lack of a contrary rason I created most tables with OIDs as I thought I read somewhere that they improved performance. Is there another reason for using them? Do they increase performance and if not then what good are they? The documentation implies that if you really want a row identifier use a serial type. Isn't a row identifier the primary key? Sighhhh. So much to learn, so little time. > > > I will check later but I am not sure whether or not the sequences > > were reset to 1 or not either. > > If you did a restore, why would you want or expect the sequences > to be reset to 1? That would contradict the purpose of a restore. > I wasn't trying to restore to an old condition but rather create a new database. I will expand further below. > > > > My desire is to simply create a clean database. > > What do you mean by a "clean" database? How would a "clean" restored > database differ from the original database that you dumped, and why > would you want those differences? What problem are you trying to > solve? > By clean database I mean freshly built with no data and all internal constructs in a like new condition. Specifically all counters set back to inital value and I had thought (prior to my education on OIDs) the OID counter back to its inital value. My objective is simple. We have been developing a new system for several months. The data dictionary is nearly in its final form. Most needed functions have been built and there is a bunch of really mutilated data laying around after testing. I simply wish to extract the data dictionary and build the production database. > > > > I have read about OID counter wraparound and how that can be bad > > but I do not yet understand the implications of this issue. > > OID wraparound means that you can't depend on OIDs being unique. > Code that makes that assumption could break. > I do not use OIDs at this time. As I implied above I do not yet see what use they are. If I had to build my database over from scratch (which I am about to do.) I would probably not even use them so I am eager to learn how they might benefit me. > > Understand the difference between OID wraparound and transaction > ID wraparound -- see "Preventing transaction ID wraparound failures" > in the "Routine Database Maintenance Tasks" chapter for a discussion > of the latter. > I didn't even know there was a xact identifier. Still more to learn... > > > I suspect that I need to use pg_dump to output the SQL commands > > necessary to create the data dictionary and then run these commands via > > psql. Do I just cat the file into psql? i.e. > > cat data_dictionary.sql > psql -database NEWDB -username postgres > > You can use "cat file | psql" but a few years ago that would have > won you a "Useless Use of Cat Award": > > http://laku19.adsl.netsonic.fi/era/unix/award.html > Ahh, the useless use of cat. I ran into that a number of times on the sed list but was never accused of it. Not yet understanding what I can pipe into and out of or redirect I have been lead like a lamb to the slaughter... :-) > > Other ways are "psql < file" and "psql -f file". See the psql > documentation and read your shell's documentation for an explanation > of input and output redirection. > > -- > Michael Fuhr > http://www.fuhr.org/~mfuhr/ Well, now that my education has progressed some more I believe that I will try the following. 1) Use pgadmin to create NEWDB 2) Run the following: pg_dump IPADB --format=p --schema-only --verbose -U postgres > psql --dbname NEWDB --username postgres Any further comments prior to my attempt tomorrow morning will be appreciated. Especially those that help me to avert disaster! ;-) Kind Regards, Keith ______________________________________________ 99main Internet Services http://www.99main.com
On Tue, Jan 04, 2005 at 05:18:30PM -0500, Keith Worthington wrote: > For lack of a contrary rason I created most tables with OIDs as I thought I > read somewhere that they improved performance. Is there another reason for > using them? Do they increase performance and if not then what good are they? Performance will probably be better without OIDs. See the WITHOUT OIDS section of the CREATE TABLE documentation: In addition, excluding OIDs from a table reduces the space required on disk to storage the table by 4 bytes per row, leading to increased performance. The "Object Identifier Types" section of the "Data Types" chapter says that "OIDs are best used only for references to system tables." The 8.0 Release Notes state that future versions of PostgreSQL will change the default behavior to create tables without OIDs. In 8.0 you can make WITHOUT OIDS the default by setting the default_with_oids configuration variable to "off". > The documentation implies that if you really want a row identifier use a > serial type. Isn't a row identifier the primary key? Sighhhh. So much to > learn, so little time. The primary key is whatever field(s) you specify with a PRIMARY KEY constraint. If you want automatically-generated primary keys, then using a sequence (typically via a SERIAL or BIGSERIAL type) makes wraparound less of a problem than if you use OIDs. > I do not use OIDs at this time. As I implied above I do not yet see what use > they are. If I had to build my database over from scratch (which I am about > to do.) I would probably not even use them so I am eager to learn how they > might benefit me. OIDs probably won't benefit you, so if you don't need them and if you're not using any third-party software that needs them, then consider creating your tables using WITHOUT OIDS. If you're using PostgreSQL 8.0, then you could make that the default behavior by setting default_with_oids to "off" in postgresql.conf. > Well, now that my education has progressed some more I believe that I will try > the following. > 1) Use pgadmin to create NEWDB > 2) Run the following: pg_dump IPADB --format=p --schema-only --verbose -U > postgres > psql --dbname NEWDB --username postgres You probably want to pipe (|) instead of redirect (>). If you don't understand the difference then read your shell's documentation. Consider storing the database schema in a file instead of extracting it from one database and piping it directly into another. That file, along with explanatory comments, becomes part of your application's source code and documentation. -- Michael Fuhr http://www.fuhr.org/~mfuhr/