Обсуждение: Sequences/defaults and pg_dump
Maybe it was discussed already, but I think it's very strange behavior and things should be changed (please correct me if I'm wrong) Suppose we have database containing only one simple table: *** template1=# CREATE DATABASE testseq; template1=# \c testseq testseq=# CREATE TABLE test(id SERIAL, data TEXT); NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for serial column "test.id" CREATE TABLE *** Look at the table 'test': *** testseq=# \d test Table "public.test" Column | Type | Modifiers --------+---------+---------------------------------------------------------- id | integer | not null default (nextval('test_id_seq'::regclass)) data | text | *** So, if we don't know the history we cannot understand that id is of type SERIAL. We think as it's INTEGER with DEFAULT 'nextval('test_id_seq'::regclass)' [as expression] This is the question #1 - how I can distinguish pure SERIAL and INTEGER with corresponding DEFAULT setting? Then... Imagine that we should use sequence in some other manner. For example: *** ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; *** All is OK: *** testseq=# \d test Table "public.test" Column | Type | Modifiers --------+---------+---------------------------------------------------------- id | integer | not null default (nextval('test_id_seq'::regclass) * 10) data | text | *** ... and it works as is supposed to do. But after simple dump&restore procedure (I use 'pg_dump -U ns testseq > ~/testseq.dump' and then in psql - '\i /home/ns/testseq.dump') we have: *** testseq=# \d test; Table "public.test" Column | Type | Modifiers --------+---------+--------------------------------------------------- id | integer | not null default nextval('test_id_seq'::regclass) data | text | *** In dump file I see SERIAL as the type for test.id ... So, the question #2 (the main Q): why pg_dump didn't dump my expression? For me as end-user this is very-very strange and I consider it as 'gotcha' of PosgreSQL. -- Best regards, Nikolay
On Tue, Feb 07, 2006 at 02:33:56PM +0300, Nikolay Samokhvalov wrote: > Maybe it was discussed already, but I think it's very strange behavior > and things should be changed (please correct me if I'm wrong) > > Suppose we have database containing only one simple table: <snip> > So, if we don't know the history we cannot understand that id is of > type SERIAL. We think as it's INTEGER with DEFAULT > 'nextval('test_id_seq'::regclass)' [as expression] > > This is the question #1 - how I can distinguish pure SERIAL and > INTEGER with corresponding DEFAULT setting? You can't because there is no difference. SERIAL is just a shortcut. If there is no difference, why would you want to distinguish them? > Then... Imagine that we should use sequence in some other manner. For example: > *** > ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; <snip> > In dump file I see SERIAL as the type for test.id ... > So, the question #2 (the main Q): why pg_dump didn't dump my expression? Well, it's a very contrived example (I can't think of a reason why one would do that) but I agree it is a bug. You could acheive the same effect by setting the step of the sequence to 10. > For me as end-user this is very-very strange and I consider it as > 'gotcha' of PosgreSQL. Well, I would hardly call the latter a gotcha, given you're probably the first person to notice it. As for the first question, I'm not sure what you expect. SERIAL has always just been a sort of macro, so I don't see how this could be changed. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Вложения
On 2/7/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > On Tue, Feb 07, 2006 at 02:33:56PM +0300, Nikolay Samokhvalov wrote: <snip> > Well, it's a very contrived example (I can't think of a reason why one > would do that) but I agree it is a bug. You could acheive the same > effect by setting the step of the sequence to 10. '* 10' is just an example. The real situation would be as the following. I want to use some algorithm to hide real number of registered users in my table user. So, I don't want to use simple sequence, when every new registered user in my system can guess what is the number of registered users simply observing his ID. So, I use following algorithm: (nextval('...name of the sequnence...') * N) mod % M, where N and M are quite big numbers that have no common multiples besides 1 (sorry, do not remember the English term for those numbers ;-) ). > > For me as end-user this is very-very strange and I consider it as > > 'gotcha' of PosgreSQL. > > Well, I would hardly call the latter a gotcha, given you're probably > the first person to notice it. As for the first question, I'm not sure > what you expect. SERIAL has always just been a sort of macro, so I > don't see how this could be changed. I wonder why people didn't notice this bug earlier, but I'm sure that there are many situations when it could be revealed. For example, what if I want to use nextval('seq1') * nextval('seq2') ?.. I'm sure that if you think you'll discover new examples that would be used in real world. Anyway, this is a bug, and I'll write the bug report to bugs mailing list. > > Have a nice day, > -- > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > > tool for doing 5% of the work and then sitting around waiting for someone > > else to do the other 95% so you can sue them. > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.1 (GNU/Linux) > > iD8DBQFD6IsBIB7bNG8LQkwRAi2bAJ9H4Sl3u2YF9rt7mzyeu7Ixk1RCawCfddBN > S7SicAM05rTpTipucoEN/yw= > =ygVy > -----END PGP SIGNATURE----- > > > -- Best regards, Nikolay
On Tue, Feb 07, 2006 at 03:28:31PM +0300, Nikolay Samokhvalov wrote: > The real situation would be as the following. > I want to use some algorithm to hide real number of registered users > in my table user. So, I don't want to use simple sequence, when every > new registered user in my system can guess what is the number of > registered users simply observing his ID. So, I use following > algorithm: > (nextval('...name of the sequnence...') * N) mod % M, > where N and M are quite big numbers that have no common multiples > besides 1 (sorry, do not remember the English term for those numbers > ;-) ). Even then you could do it by saying: ALTER SEQUENCE x MAXVALUE M INCREMENT N CYCLE; > I wonder why people didn't notice this bug earlier, but I'm sure that > there are many situations when it could be revealed. For example, what > if I want to use nextval('seq1') * nextval('seq2') ?.. I'm sure that > if you think you'll discover new examples that would be used in real > world. > > Anyway, this is a bug, and I'll write the bug report to bugs mailing list. Please do. It wouldn't be noticed much due to most people regarding sequences as opaque ie the numbers themselves don't mean anything. But if you alter the sequence or the default, pg_dump should dump it correctly. (This may be a know bug btw, I don't know). -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Вложения
Nikolay Samokhvalov <samokhvalov@gmail.com> writes: > testseq=# CREATE TABLE test(id SERIAL, data TEXT); > NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for > serial column "test.id" > CREATE TABLE > *** > ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on a serial column, but we haven't gotten around to enforcing that yet. regards, tom lane
Tom Lane wrote: > Nikolay Samokhvalov <samokhvalov@gmail.com> writes: > > testseq=# CREATE TABLE test(id SERIAL, data TEXT); > > NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for > > serial column "test.id" > > CREATE TABLE > > *** > > ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > a serial column, but we haven't gotten around to enforcing that yet. TODO has: * %Disallow changing default expression of a SERIAL column -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Nikolay Samokhvalov <samokhvalov@gmail.com> writes: > > testseq=# CREATE TABLE test(id SERIAL, data TEXT); > > NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for > > serial column "test.id" > > CREATE TABLE > > *** > > ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > a serial column, but we haven't gotten around to enforcing that yet. That's wrong! Forget about SERIAL. I have INTEGER column with some expression as DEFAULT in it. I use sequence in that expression and want this to be dumped correctly. The bug doesn't concerns SERIALs, in concerns general usage of sequences. > > regards, tom lane > -- Best regards, Nikolay
Nikolay Samokhvalov wrote: > On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Nikolay Samokhvalov <samokhvalov@gmail.com> writes: > > > testseq=# CREATE TABLE test(id SERIAL, data TEXT); > > > NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for > > > serial column "test.id" > > > CREATE TABLE > > > *** > > > ALTER TABLE test ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; > > > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > a serial column, but we haven't gotten around to enforcing that yet. > That's wrong! > Forget about SERIAL. I have INTEGER column with some expression as > DEFAULT in it. I use sequence in that expression and want this to be > dumped correctly. > The bug doesn't concerns SERIALs, in concerns general usage of sequences. Uh, I can't reproduce the failure: test=> CREATE SEQUENCE xx; CREATE SEQUENCE test=> CREATE TABLE test5(id integer DEFAULT nextval('xx'), data TEXT); CREATE TABLE test=> ALTER TABLE test5 ALTER COLUMN id SET DEFAULT nextval('test_id_seq') * 10; ALTER TABLE pg_dump has: CREATE TABLE test2 ( id integer DEFAULT (nextval('test_id_seq'::regclass) * 10), data text ); -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Nikolay Samokhvalov <samokhvalov@gmail.com> writes: > Forget about SERIAL. I have INTEGER column with some expression as > DEFAULT in it. No, you have a SERIAL column that you've improperly mucked with the implementation of. If you'd declared it as INTEGER to start with, you could do whatever you wanted to its default expression. regards, tom lane
Tom Lane wrote: > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > a serial column, but we haven't gotten around to enforcing that yet. Is this per the Standard? If so, then the oft-repeated mantra that SERIAL is simply a macro for an INTEGER column with a particular DEFAULT seems a bit misleading ... - John D. Burger MITRE
"John D. Burger" <john@mitre.org> writes: > Tom Lane wrote: >> The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on >> a serial column, but we haven't gotten around to enforcing that yet. > Is this per the Standard? SERIAL isn't in the standard. > If so, then the oft-repeated mantra that > SERIAL is simply a macro for an INTEGER column with a particular > DEFAULT seems a bit misleading ... It started out that way, but we've been gradually moving in the direction of making it a more integrated thing. It's already true that you're not allowed to drop the serial column's sequence, and pg_dump has special behavior for it (this is the bit that the OP thinks is a bug). This is all in the name of preventing people from shooting themselves in the foot, so forbidding changing the default expression seems like a logical next step. Now, the other direction we could go in is to back off all that and try to make SERIAL just a table-creation macro again, as it was in the beginning. It occurs to me that 8.1 has better solutions for the key problems that the sequence-to-column binding was intended to prevent. Even without SERIAL, you can't drop a sequence that some default is depending on: regression=# create sequence s1; CREATE SEQUENCE regression=# create table foo (f1 int default nextval('s1')); CREATE TABLE regression=# \d foo Table "public.foo" Column | Type | Modifiers --------+---------+--------------------------------- f1 | integer | default nextval('s1'::regclass) regression=# drop sequence s1; NOTICE: default for table foo column f1 depends on sequence s1 ERROR: cannot drop sequence s1 because other objects depend on it HINT: Use DROP ... CASCADE to drop the dependent objects too. regression=# and the former bugaboo of renaming the sequence is gone too: regression=# alter table s1 rename to foobar; ALTER TABLE regression=# \d foo Table "public.foo" Column | Type | Modifiers --------+---------+------------------------------------- f1 | integer | default nextval('foobar'::regclass) regression=# So the only extra thing that a SERIAL column still does for you is to auto-drop the sequence if you drop the table or column. Maybe that bit of hand-holding isn't worth the inconsistency of having SERIAL be a little bit more than a macro. regards, tom lane
There is no SERIAL type in the standard at all. Moreover, standard defines following expression for SEQUENCE GENERATORs: <next value expression> ::= NEXT VALUE FOR <sequence generator name> Postgres has non-standard equivalent - nextval(<sequence generator name>)... So, sequences implementation in PostgreSQL isn't standard-compliant. On 2/7/06, John D. Burger <john@mitre.org> wrote: > Tom Lane wrote: > > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > a serial column, but we haven't gotten around to enforcing that yet. > > Is this per the Standard? If so, then the oft-repeated mantra that > SERIAL is simply a macro for an INTEGER column with a particular > DEFAULT seems a bit misleading ... > > - John D. Burger > MITRE > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- Best regards, Nikolay
On Tue, Feb 07, 2006 at 09:49:02AM -0500, Bruce Momjian wrote: > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > a serial column, but we haven't gotten around to enforcing that yet. > TODO has: > * %Disallow changing default expression of a SERIAL column This should go along with another command like ... ALTER COLUMN DROP SERIAL which drops the serial and removes the dependencies to get rid of it in a clean way, right? Joachim -- Joachim Wieland joe@mcknight.de C/ Usandizaga 12 1°B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available
Joachim Wieland wrote: > On Tue, Feb 07, 2006 at 09:49:02AM -0500, Bruce Momjian wrote: > > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > > a serial column, but we haven't gotten around to enforcing that yet. > > > TODO has: > > > * %Disallow changing default expression of a SERIAL column > > This should go along with another command like > > ... ALTER COLUMN DROP SERIAL > > which drops the serial and removes the dependencies to get rid of it in a > clean way, right? Uh, you can drop a column or change its data type, but not drop the data type of a column. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, Feb 08, 2006 at 09:03:54AM -0500, Bruce Momjian wrote: > Joachim Wieland wrote: > > On Tue, Feb 07, 2006 at 09:49:02AM -0500, Bruce Momjian wrote: > > > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > > > a serial column, but we haven't gotten around to enforcing that yet. > > > TODO has: > > > * %Disallow changing default expression of a SERIAL column > > This should go along with another command like > > ... ALTER COLUMN DROP SERIAL > > which drops the serial and removes the dependencies to get rid of it in a > > clean way, right? > Uh, you can drop a column or change its data type, but not drop the data > type of a column. Sure, the "DROP SERIAL" I proposed would rather "change" the data type to int by dropping the default and would delete referring pg_depend entries. Read it more as a kind of "drop autoincrement functionality for this column". The problem I see (but you might see it differently) is that you can't drop this autoincrement stuff without also dropping the column once you forbid to change the default (yeah I know, changing the default is even worse and leaves you with incorrect dependencies). Joachim -- Joachim Wieland joe@mcknight.de C/ Usandizaga 12 1°B ICQ: 37225940 20002 Donostia / San Sebastian (Spain) GPG key available
Joachim Wieland wrote: > On Wed, Feb 08, 2006 at 09:03:54AM -0500, Bruce Momjian wrote: > > Joachim Wieland wrote: > > > On Tue, Feb 07, 2006 at 09:49:02AM -0500, Bruce Momjian wrote: > > > > > The correct solution to this is to forbid ALTER COLUMN SET DEFAULT on > > > > > a serial column, but we haven't gotten around to enforcing that yet. > > > > > TODO has: > > > > > * %Disallow changing default expression of a SERIAL column > > > > This should go along with another command like > > > > ... ALTER COLUMN DROP SERIAL > > > > which drops the serial and removes the dependencies to get rid of it in a > > > clean way, right? > > > Uh, you can drop a column or change its data type, but not drop the data > > type of a column. > > Sure, the "DROP SERIAL" I proposed would rather "change" the data type > to int by dropping the default and would delete referring pg_depend entries. > Read it more as a kind of "drop autoincrement functionality for this > column". > > The problem I see (but you might see it differently) is that you can't drop > this autoincrement stuff without also dropping the column once you forbid to > change the default (yeah I know, changing the default is even worse and > leaves you with incorrect dependencies). I assume an ALTER COLUMN ... TYPE INTEGER would drop the SERIAL part. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tue, Feb 07, 2006 at 15:28:31 +0300, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote: > The real situation would be as the following. > I want to use some algorithm to hide real number of registered users > in my table user. So, I don't want to use simple sequence, when every > new registered user in my system can guess what is the number of > registered users simply observing his ID. So, I use following > algorithm: > (nextval('...name of the sequnence...') * N) mod % M, > where N and M are quite big numbers that have no common multiples > besides 1 (sorry, do not remember the English term for those numbers > ;-) ). (N and M are said to be "relatively prime".) The above method isn't very secure. You might be better off using a block cipher in counter mode, depending on how badly you want to keep the number of users secret. Even that won't be foolproof as the users might cooperate with each other to estimate how many of them there are.
Bruno Wolff III <bruno@wolff.to> writes: > On Tue, Feb 07, 2006 at 15:28:31 +0300, > Nikolay Samokhvalov <samokhvalov@gmail.com> wrote: >> The real situation would be as the following. >> I want to use some algorithm to hide real number of registered users >> in my table user. So, I don't want to use simple sequence, when every >> new registered user in my system can guess what is the number of >> registered users simply observing his ID. So, I use following > (N and M are said to be "relatively prime".) > > The above method isn't very secure. You might be better off using a block > cipher in counter mode, depending on how badly you want to keep the number > of users secret. Even that won't be foolproof as the users might cooperate > with each other to estimate how many of them there are. Or, just start your sequence counting at 1000000. Or use bigint and start it at a billion. -Doug
On Fri, Feb 10, 2006 at 07:34:35 -0500, Doug McNaught <doug@mcnaught.org> wrote: > Bruno Wolff III <bruno@wolff.to> writes: > > Or, just start your sequence counting at 1000000. Or use bigint and > start it at a billion. That may work if you only have access to one id number, but you don't need too many id numbers before you can start making good statistical estimates of the number of users. In one of my math classes, I was told a story about how statisticians estimnated the number of tanks produced by germany during world war II based on serial numbers found on captured tanks. This was supposedly despite the Germans doing things (leaving gaps and the like) to make this harder. And supposedly after the war, the statistical results were supposed to have been more accurate than estimates obtain via other means (such as spies).
On 2/7/06, Martijn van Oosterhout <kleptog@svana.org> wrote: > On Tue, Feb 07, 2006 at 03:28:31PM +0300, Nikolay Samokhvalov wrote: > > The real situation would be as the following. > > I want to use some algorithm to hide real number of registered users > > in my table user. So, I don't want to use simple sequence, when every > > new registered user in my system can guess what is the number of > > registered users simply observing his ID. So, I use following > > algorithm: > > (nextval('...name of the sequnence...') * N) mod % M, > > where N and M are quite big numbers that have no common multiples > > besides 1 (sorry, do not remember the English term for those numbers > > ;-) ). > > Even then you could do it by saying: > > ALTER SEQUENCE x MAXVALUE M INCREMENT N CYCLE; > it's a pity, but no, I can't :-( after reaching MAXVALUE sequence starts with MINVALUE (1 by default)... for example with following sequence: test=# CREATE SEQUENCE testseq INCREMENT BY 3 MAXVALUE 10 CYCLE; ...I always obtain only 1, 4, 7 and 10... ;-( -- Best regards, Nikolay