Обсуждение: pg_dump and sequences - RFC
It recently came to my attention that pg_dump dumps 'CREATE SEQUENCE' and 'SELECT NEXTVAL' commands for both data-only and schema-only output. This results in problems for users who do the two in separate steps, and seems a little odd. Also, I'd be interested to know what the purpose of 'SELECT NEXTVAL' is? My inclinations is do do the following: - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order - Issue 'SELECT SETVAL...' at end of data load. This means that a schema-only restore will hgave all sequences set up with initial value = 1, and a data-only restore will have sequences set 'correctly'. Does this sound reasonable? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > My inclinations is do do the following: > - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order > - Issue 'SELECT SETVAL...' at end of data load. > This means that a schema-only restore will hgave all sequences set up with > initial value = 1, and a data-only restore will have sequences set > 'correctly'. Seems reasonable, except you should not necessarily use 1; that could be outside the defined range of the sequence object. Use its min_value instead. It's too bad the sequence object doesn't save the original starting value, which is what the schema-only restore REALLY should restore. The min_value is probably close enough for practical purposes ... not sure that it's worth adding an original_value column just for this. (It'd be a simple enough change in terms of the code, but I wonder if it might create compatibility problems for applications that look at the contents of sequences.) > Also, I'd be interested to know what the purpose of 'SELECT NEXTVAL' is? IIRC the point of the nextval() is to ensure that the internal state of the sequence is correct. There's a bool "is_called" in the sequence that means something like "I've been nextval()'d at least once", and the only clean way to make that become set is to issue a nextval. You can watch the behavior by doing "select * from sequenceobject" between sequence commands --- it looks like the first nextval() simply sets is_called without changing last_value, and then subsequent nextval()s increment last_value. (This peculiar arrangement makes it possible to have a starting value equal to MININT, should you want to do so.) So pg_dump needs to make sure it restores the correct setting of both fields. This is pretty grotty because it looks like there's no way to clear is_called again, short of dropping and recreating the sequence. So unless you want to do that always, a data-only restore couldn't guarantee to restore the state of a virgin sequence. regards, tom lane
At 10:36 28/09/00 -0400, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: >> My inclinations is do do the following: > >> - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order >> - Issue 'SELECT SETVAL...' at end of data load. > >Seems reasonable, except you should not necessarily use 1; that could >be outside the defined range of the sequence object. Use its min_value >instead. OK. Given the discussion of 'select nextval', do you know if 'select setval' will set the is_called flag? If not should I: Issue 'CREATE SEQUENCE...Initial Value <MINVAL>...' in OID order if (is_called was set AND we've loaded any data) then Issue 'SELECT NEXTVAL...' at end of data load, and *before* setval. Issue 'SELECT SETVAL...' at end of data load. endif ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > OK. Given the discussion of 'select nextval', do you know if 'select > setval' will set the is_called flag? Looks like it does, both by experiment and by reading the code. So if you issue a setval() you don't need a nextval() as well. However you still have the problem that you can't recreate the state of a virgin (never-nextval'd) sequence this way. The existing pg_dump code is correct, in that it will reproduce the state of a sequence whether virgin or not. A data-only reload would fail to make that guarantee unless you drop and recreate the sequence. regards, tom lane
At 11:01 28/09/00 -0400, Tom Lane wrote: >A data-only reload >would fail to make that guarantee unless you drop and recreate >the sequence. Will this cause problems in an existing database because the sequence OID changes? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > At 11:01 28/09/00 -0400, Tom Lane wrote: >> A data-only reload >> would fail to make that guarantee unless you drop and recreate >> the sequence. > Will this cause problems in an existing database because the sequence OID > changes? Hmm, good point. There isn't any real easy way to refer to a sequence by OID --- the sequence functions only accept names --- but I suppose someone out there might be doing something with sequence OIDs. Perhaps the real answer is to extend the set of sequence functions so that it's possible to set/clear is_called directly. Perhaps a variant setval() with an additional, boolean argument? regards, tom lane
At 11:17 28/09/00 -0400, Tom Lane wrote: > >Hmm, good point. There isn't any real easy way to refer to a sequence >by OID --- the sequence functions only accept names --- but I suppose >someone out there might be doing something with sequence OIDs. So long as the backend & metadata don't rely on the OID, then it's 99.9% safe, I'd guess. I'd be happy to go with this, and do a function later if/when necessary (see below). >Perhaps the real answer is to extend the set of sequence functions so >that it's possible to set/clear is_called directly. Perhaps a variant >setval() with an additional, boolean argument? This would be something I'd like to do as a learning exercise. However, aren't we 2 days from beta? Is this enough time to learn how to add a function to the backend? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes: > This would be something I'd like to do as a learning exercise. However, > aren't we 2 days from beta? Is this enough time to learn how to add a > function to the backend? In practice, you've probably got a week. I believe Marc is planning to be out of town for a week starting tomorrow, and he's not going to be pushing out a beta till he gets back. (Besides, I'm not quite done with subselect-in-FROM ;-)) I'd recommend going for the function. regards, tom lane