Обсуждение: BUG #15361: Add column if not exists create duplicate constraint
The following bug has been logged on the website: Bug reference: 15361 Logged by: Olivier Lepretre Email address: postgresql@noetika.com PostgreSQL version: 9.6.2 Operating system: Windows 10 Description: I have a patching script that is supposed to add column if not existing : ALTER TABLE myschem.table1 ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES myschem.table2(col2) When col1 already exists, I expected that nothing would happen. But, when applying the previous query and then querying : select constraint_name from information_schema.key_column_usage where constraint_schema='myschem' I notice that a new constraint "table1_col2_fkeyxxx" is created each time the previous ALTER TABLE ADD COLUMN is called (with xxx being a new number each time) It seems strange to have second part of statement executed (references) when first part (add column) was not. Would it be possible that this sort of query executes "references" first ? Following are comments from pgsql-general list Problem occurs on 9.6, 10.5 and 11beta1 On 09/01/2018 09:27 AM, Andreas Kretschmer wrote: > > > Am 01.09.2018 um 17:50 schrieb Olivier Leprêtre: >> I notice that a new constraint "table1_col2_fkeyxxx" is created each >> time the previous ALTER TABLE ADD COLUMN is called > > smells like a bug. Yeah, a quick test on a database where I have an event trigger: create table fk_parent(col2 varchar primary key); NOTICE: Table public.fk_parent created NOTICE: caught CREATE TABLE event on 'public.fk_parent' NOTICE: caught CREATE INDEX event on 'public.fk_parent_pkey' create table fk_child(col1 varchar references fk_parent(col2)); NOTICE: Table public.fk_child created NOTICE: caught CREATE TABLE event on 'public.fk_child' NOTICE: caught ALTER TABLE event on 'public.fk_child' \d fk_child Table "public.fk_child" Column | Type | Collation | Nullable | Default --------+-------------------+-----------+----------+--------- col1 | character varying | | | Foreign-key constraints: "fk_child_col1_fkey" FOREIGN KEY (col1) REFERENCES fk_parent(col2) alter table fk_child add column if not exists col1 varchar references fk_parent(col2); NOTICE: column "col1" of relation "fk_child" already exists, skipping NOTICE: caught ALTER TABLE event on 'public.fk_child' ALTER TABLE \d fk_child Table "public.fk_child" Column | Type | Collation | Nullable | Default --------+-------------------+-----------+----------+--------- col1 | character varying | | | Foreign-key constraints: "fk_child_col1_fkey" FOREIGN KEY (col1) REFERENCES fk_parent(col2) "fk_child_col1_fkey1" FOREIGN KEY (col1) REFERENCES fk_parent(col2) > > Regards, Andreas > Adrian Klaver adrian.klaver@aklaver.com
=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes: > I have a patching script that is supposed to add column if not existing : > ALTER TABLE myschem.table1 > ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES > myschem.table2(col2) > When col1 already exists, I expected that nothing would happen. > [ but actually, it creates an FK constraint anyway ] As I've said many times before, I hate CREATE IF NOT EXISTS with a passion, because its semantics are so squishy. This is a perfect example of that: it's impossible to make a principled decision whether this is a bug or not, or what the correct behavior is if you think it's a bug. Should the command do nothing at all if col1 exists, regardless of whether there's an FK constraint or not? Should it avoid creating a duplicate constraint, and if so how picky are we to be about what "duplicate" means? What happens if myschem.table2(col2) doesn't exist? Not to mention whether we should change the behavior for other secondary objects that might be shown in the command, such as UNIQUE or CHECK constraints. Right now all of those get added, possibly redundantly, just like FK constraints. I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS was added, and the camp that wanted it thought this behavior was fine. Even if we were now to conclude that this is a bug and agree on what'd be better semantics, there would be a pretty strong backwards-compatibility argument against changing it; some people's scripts might expect the constraint(s) to get added. The short answer is that IF NOT EXISTS gives you no guarantees whatsoever about the subsequent properties of the object, only that something by that name will exist. If you don't like that, don't use IF NOT EXISTS. regards, tom lane
I wrote: > I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS > was added, and the camp that wanted it thought this behavior was fine. After a bit of digging in the archives, I failed to find any evidence that the point was considered when the ADD COLUMN IF NOT EXISTS patch went in, though there have certainly been related threads in the dim past, which basically led to the conclusion that you don't get to have much certainty about what state IF NOT EXISTS leaves behind, and those who think it's a good feature think that's OK. (I am not among them.) The thing I was half-remembering was probably this bug report about the unique-constraint aspect of the same problem: https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org wherein it was argued that if the IF NOT EXISTS fires, it should prevent all side-effects of the ADD COLUMN clause it's attached to, whether or not any of the subsidiary objects exist (in some form) already. If you buy that theory then there's a sketch for a fix there. Nobody's done anything about it AFAIK. regards, tom lane
Hi Tom, Thanks for your point of view. I'm far from being a skilled postgres user so I could not discuss it much. From a "lambda" point of you, I found interesting to use those IF EXISTS because it saved me a few lines of code to check the information_table.columns and add the test accordingly. I appreciate it was doing that for me, code contains already so much lines just to check if everything is ok/not ok and rather few for the algorithm by itself). I expected that if column was not added, it was obvious that references should not, sort of transaction rollback. Thinking about backward compatibility, I can't figure out who could have used ADD COLUMN IF NOT EXISTS ...REFERENCES ... , expecting that the REFERENCES will be created anyway, but I do not have your experience. If this behaviour is not changed, perhaps would it be enough to modify docs which states "IF NOT EXISTS is specified and a column already exists with this name, no error is thrown" adding something like "but others statements like REFERENCES will be executed anyway." BR Olivier -----Message d'origine----- De : Tom Lane [mailto:tgl@sss.pgh.pa.us] Envoyé : samedi 1 septembre 2018 20:25 À : postgresql@noetika.com Cc : pgsql-bugs@lists.postgresql.org Objet : Re: BUG #15361: Add column if not exists create duplicate constraint =?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes: > I have a patching script that is supposed to add column if not existing : > ALTER TABLE myschem.table1 > ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES > myschem.table2(col2) > When col1 already exists, I expected that nothing would happen. > [ but actually, it creates an FK constraint anyway ] As I've said many times before, I hate CREATE IF NOT EXISTS with a passion, because its semantics are so squishy. This is a perfect example of that: it's impossible to make a principled decision whether this is a bug or not, or what the correct behavior is if you think it's a bug. Should the command do nothing at all if col1 exists, regardless of whether there's an FK constraint or not? Should it avoid creating a duplicate constraint, and if so how picky are we to be about what "duplicate" means? What happens if myschem.table2(col2) doesn't exist? Not to mention whether we should change the behavior for other secondary objects that might be shown in the command, such as UNIQUE or CHECK constraints. Right now all of those get added, possibly redundantly, just like FK constraints. I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS was added, and the camp that wanted it thought this behavior was fine. Even if we were now to conclude that this is a bug and agree on what'd be better semantics, there would be a pretty strong backwards-compatibility argument against changing it; some people's scripts might expect the constraint(s) to get added. The short answer is that IF NOT EXISTS gives you no guarantees whatsoever about the subsequent properties of the object, only that something by that name will exist. If you don't like that, don't use IF NOT EXISTS. regards, tom lane
Greetings, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > I wrote: > > I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS > > was added, and the camp that wanted it thought this behavior was fine. > > After a bit of digging in the archives, I failed to find any evidence > that the point was considered when the ADD COLUMN IF NOT EXISTS patch > went in, though there have certainly been related threads in the dim > past, which basically led to the conclusion that you don't get to have > much certainty about what state IF NOT EXISTS leaves behind, and those > who think it's a good feature think that's OK. (I am not among them.) > > The thing I was half-remembering was probably this bug report about > the unique-constraint aspect of the same problem: > > https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org > > wherein it was argued that if the IF NOT EXISTS fires, it should prevent > all side-effects of the ADD COLUMN clause it's attached to, whether or not > any of the subsidiary objects exist (in some form) already. If you buy > that theory then there's a sketch for a fix there. Nobody's done anything > about it AFAIK. For my 2c, at least, I continue to be of the opinion (as it seems the OP is also..) that IF NOT EXISTS means "ONLY DO THIS IF THE OBJECT REFERENCED DOESN'T EXIST". There certainly seemed to be a lack of dissenting opinion by the end of that thread, so maybe it's time to actually make that change. I still consider the current behavior to be a bug, but I could see an argument for not back-patching it out of concern about breaking scripts in a back-patch and because the code looks like it'd need to be whacked around some. Thanks! Stephen
Вложения
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> The thing I was half-remembering was probably this bug report about >> the unique-constraint aspect of the same problem: >> https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org >> wherein it was argued that if the IF NOT EXISTS fires, it should prevent >> all side-effects of the ADD COLUMN clause it's attached to, whether or not >> any of the subsidiary objects exist (in some form) already. If you buy >> that theory then there's a sketch for a fix there. Nobody's done anything >> about it AFAIK. > For my 2c, at least, I continue to be of the opinion (as it seems the OP > is also..) that IF NOT EXISTS means "ONLY DO THIS IF THE OBJECT > REFERENCED DOESN'T EXIST". The problem is precisely that it's very fuzzy exactly what THIS is. The case at hand, with an FK constraint, is maybe even a better demonstration of that than the cases we considered previously. Currently, if the column exists but lacks an FK constraint, the code will make sure one gets added; with what you propose, it won't. You can argue all day about which semantics are more useful, but I don't see that there's a bright line dividing right from wrong here --- especially since, AFAICS, there's not currently any way for a user to get the other behavior if she doesn't like the one we choose. The latter point could be addressed if there were also an IF NOT EXISTS syntax for constraints. Then the useful aspect of the current behavior could be duplicated with ALTER TABLE t ADD COLUMN IF NOT EXISTS c1 int, ADD CONSTRAINT IF NOT EXISTS c1_fk FOREIGN KEY (c1) REFERENCES othert(c2); I'd imagine that we'd address the "what is it that doesn't exist" issue by saying that you have to attach IF NOT EXISTS to named-constraint syntax, and what's checked is just the existence of a constraint by that name, not whether its properties are the same. So this isn't an exact replacement for what happens now, but it's probably close enough to cover the real use-cases. > I still consider the current behavior to be > a bug, but I could see an argument for not back-patching it out of > concern about breaking scripts in a back-patch and because the code > looks like it'd need to be whacked around some. It's definitely not going to be a back-patchable change. But I could get behind a change that addresses the confusion without taking away arguably-useful functionality. regards, tom lane
I would like to suggest that to find the bright line, we look at the "IF NOT EXISTS" of the CREATE TABLE command:The problem is precisely that it's very fuzzy exactly what THIS is. The case at hand, with an FK constraint, is maybe even a better demonstration of that than the cases we considered previously. Currently, if the column exists but lacks an FK constraint, the code will make sure one gets added; with what you propose, it won't. You can argue all day about which semantics are more useful, but I don't see that there's a bright line dividing right from wrong here --- especially since, AFAICS, there's not currently any way for a user to get the other behavior if she doesn't like the one we choose.
IF NOT EXISTS
Do not throw an error if a relation with the same name already exists. A notice is issued in this case. Note that there is no guarantee that the existing relation is anything like the one that would have been created.
So, that version doesn't automatically fix the table to match the new specification.
Also, the IF NOT EXISTS of CREATE SEQUENCE:
IF NOT EXISTS
Do not throw an error if a relation with the same name already exists. A notice is issued in this case. Note that there is no guarantee that the existing relation is anything like the sequence that would have been created - it might not even be a sequence.
This command doesn't even guarantee the result is a sequence!
My (admittedly, unimportant) opinion is that this is a bug. For CREATE TABLE and CREATE SEQUENCE, there are two possible states that the database can be in. Either with the old table/sequence definition, or with the new definition.
With the current behaviour of ADD COLUMN IF NOT EXISTS with CONSTRAINT(s), you cannot know what the resulting state is, other than that there will be one more constraint added to the column.
Furthermore, I submit the following interaction:
jstrachan=# create table test (foo integer);If the ADD COLUMN command should automatically fix the constraints of the possibly-added column, then it should also automatically change the datatype!
CREATE TABLE
jstrachan=# alter table test add column if not exists foo boolean;
ALTER TABLE
jstrachan=# \d test
Table "public.test"
Column | Type | Modifiers
--------+---------+-----------
foo | integer |
Thanks very much for all your work, and PostgreSQL is awesome.
Jamie Strachan