Обсуждение: pg_dump and sequences - RFC
It recently came to my attention that pg_dump dumps 'CREATE SEQUENCE' and 'SELECT NEXTVAL' commands for both data-only and schema-only output. This results in problems for users who do the two in separate steps, and seems a little odd. Also, I'd be interested to know what the purpose of 'SELECT NEXTVAL' is? My inclinations is do do the following: - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order - Issue 'SELECT SETVAL...' at end of data load. This means that a schema-only restore will hgave all sequences set up with initial value = 1, and a data-only restore will have sequences set 'correctly'. Does this sound reasonable? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
> My inclinations is do do the following:
> - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order
> - Issue 'SELECT SETVAL...' at end of data load.
> This means that a schema-only restore will hgave all sequences set up with
> initial value = 1, and a data-only restore will have sequences set
> 'correctly'.
Seems reasonable, except you should not necessarily use 1; that could
be outside the defined range of the sequence object. Use its min_value
instead.
It's too bad the sequence object doesn't save the original starting
value, which is what the schema-only restore REALLY should restore.
The min_value is probably close enough for practical purposes ... not
sure that it's worth adding an original_value column just for this.
(It'd be a simple enough change in terms of the code, but I wonder if
it might create compatibility problems for applications that look at
the contents of sequences.)
> Also, I'd be interested to know what the purpose of 'SELECT NEXTVAL' is?
IIRC the point of the nextval() is to ensure that the internal state of
the sequence is correct. There's a bool "is_called" in the sequence
that means something like "I've been nextval()'d at least once", and the
only clean way to make that become set is to issue a nextval. You can
watch the behavior by doing "select * from sequenceobject" between
sequence commands --- it looks like the first nextval() simply sets
is_called without changing last_value, and then subsequent nextval()s
increment last_value. (This peculiar arrangement makes it possible
to have a starting value equal to MININT, should you want to do so.)
So pg_dump needs to make sure it restores the correct setting of both
fields.
This is pretty grotty because it looks like there's no way to clear
is_called again, short of dropping and recreating the sequence.
So unless you want to do that always, a data-only restore couldn't
guarantee to restore the state of a virgin sequence.
regards, tom lane
At 10:36 28/09/00 -0400, Tom Lane wrote: >Philip Warner <pjw@rhyme.com.au> writes: >> My inclinations is do do the following: > >> - Issue 'CREATE SEQUENCE...Initial Value 1...' in OID order >> - Issue 'SELECT SETVAL...' at end of data load. > >Seems reasonable, except you should not necessarily use 1; that could >be outside the defined range of the sequence object. Use its min_value >instead. OK. Given the discussion of 'select nextval', do you know if 'select setval' will set the is_called flag? If not should I: Issue 'CREATE SEQUENCE...Initial Value <MINVAL>...' in OID order if (is_called was set AND we've loaded any data) then Issue 'SELECT NEXTVAL...' at end of data load, and *before* setval. Issue 'SELECT SETVAL...' at end of data load. endif ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
> OK. Given the discussion of 'select nextval', do you know if 'select
> setval' will set the is_called flag?
Looks like it does, both by experiment and by reading the code.
So if you issue a setval() you don't need a nextval() as well.
However you still have the problem that you can't recreate the
state of a virgin (never-nextval'd) sequence this way. The
existing pg_dump code is correct, in that it will reproduce the
state of a sequence whether virgin or not. A data-only reload
would fail to make that guarantee unless you drop and recreate
the sequence.
regards, tom lane
At 11:01 28/09/00 -0400, Tom Lane wrote: >A data-only reload >would fail to make that guarantee unless you drop and recreate >the sequence. Will this cause problems in an existing database because the sequence OID changes? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
> At 11:01 28/09/00 -0400, Tom Lane wrote:
>> A data-only reload
>> would fail to make that guarantee unless you drop and recreate
>> the sequence.
> Will this cause problems in an existing database because the sequence OID
> changes?
Hmm, good point. There isn't any real easy way to refer to a sequence
by OID --- the sequence functions only accept names --- but I suppose
someone out there might be doing something with sequence OIDs.
Perhaps the real answer is to extend the set of sequence functions so
that it's possible to set/clear is_called directly. Perhaps a variant
setval() with an additional, boolean argument?
regards, tom lane
At 11:17 28/09/00 -0400, Tom Lane wrote: > >Hmm, good point. There isn't any real easy way to refer to a sequence >by OID --- the sequence functions only accept names --- but I suppose >someone out there might be doing something with sequence OIDs. So long as the backend & metadata don't rely on the OID, then it's 99.9% safe, I'd guess. I'd be happy to go with this, and do a function later if/when necessary (see below). >Perhaps the real answer is to extend the set of sequence functions so >that it's possible to set/clear is_called directly. Perhaps a variant >setval() with an additional, boolean argument? This would be something I'd like to do as a learning exercise. However, aren't we 2 days from beta? Is this enough time to learn how to add a function to the backend? ---------------------------------------------------------------- Philip Warner | __---_____ Albatross Consulting Pty. Ltd. |----/ - \ (A.B.N. 75 008 659 498) | /(@) ______---_ Tel: (+61) 0500 83 82 81 | _________ \ Fax: (+61) 0500 83 82 82 | ___________ | Http://www.rhyme.com.au | / \| | --________-- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/
Philip Warner <pjw@rhyme.com.au> writes:
> This would be something I'd like to do as a learning exercise. However,
> aren't we 2 days from beta? Is this enough time to learn how to add a
> function to the backend?
In practice, you've probably got a week. I believe Marc is planning to
be out of town for a week starting tomorrow, and he's not going to be
pushing out a beta till he gets back.
(Besides, I'm not quite done with subselect-in-FROM ;-))
I'd recommend going for the function.
regards, tom lane