Обсуждение: BUG #15361: Add column if not exists create duplicate constraint

Поиск
Список
Период
Сортировка

BUG #15361: Add column if not exists create duplicate constraint

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15361
Logged by:          Olivier Lepretre
Email address:      postgresql@noetika.com
PostgreSQL version: 9.6.2
Operating system:   Windows 10
Description:

I have a patching script that is supposed to add column if not existing :

ALTER TABLE myschem.table1
          ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES
myschem.table2(col2)

When col1 already exists, I expected that nothing would happen. But, when
applying the previous query and then querying :

select constraint_name from information_schema.key_column_usage where
constraint_schema='myschem'

I notice that a new constraint "table1_col2_fkeyxxx" is created each time
the previous ALTER TABLE ADD COLUMN is called (with xxx being a new number
each time)

It seems strange to have second part of statement executed (references) when
first part (add column) was not. Would it be possible that this sort of
query executes "references" first ?

Following are comments from pgsql-general list 

Problem occurs on 9.6, 10.5 and 11beta1

On 09/01/2018 09:27 AM, Andreas Kretschmer wrote:
> 
> 
> Am 01.09.2018 um 17:50 schrieb Olivier Leprêtre:
>> I notice that a new constraint "table1_col2_fkeyxxx" is created each 
>> time the previous ALTER TABLE ADD COLUMN is called
> 
> smells like a bug.

Yeah, a quick test on a database where I have an event trigger:

create table fk_parent(col2 varchar primary key);
NOTICE:  Table public.fk_parent created
NOTICE:  caught CREATE TABLE event on 'public.fk_parent'
NOTICE:  caught CREATE INDEX event on 'public.fk_parent_pkey'
create table fk_child(col1 varchar references fk_parent(col2));
NOTICE:  Table public.fk_child created
NOTICE:  caught CREATE TABLE event on 'public.fk_child'
NOTICE:  caught ALTER TABLE event on 'public.fk_child'
\d fk_child
                    Table "public.fk_child"
  Column |       Type        | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
  col1   | character varying |           |          |
Foreign-key constraints:
     "fk_child_col1_fkey" FOREIGN KEY (col1) REFERENCES fk_parent(col2)

alter table fk_child add column if not exists col1 varchar references
fk_parent(col2); 

NOTICE:  column "col1" of relation "fk_child" already exists, skipping 

NOTICE:  caught ALTER TABLE event on 'public.fk_child' 

ALTER TABLE

  \d fk_child
                    Table "public.fk_child" 

  Column |       Type        | Collation | Nullable | Default 

--------+-------------------+-----------+----------+---------
 

  col1   | character varying |           |          | 
 

Foreign-key constraints: 
 

     "fk_child_col1_fkey" FOREIGN KEY (col1) REFERENCES fk_parent(col2) 
 

     "fk_child_col1_fkey1" FOREIGN KEY (col1) REFERENCES fk_parent(col2) 



> 
> Regards, Andreas
> 
Adrian Klaver
adrian.klaver@aklaver.com


Re: BUG #15361: Add column if not exists create duplicate constraint

От
Tom Lane
Дата:
=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
> I have a patching script that is supposed to add column if not existing :

> ALTER TABLE myschem.table1
>           ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES
> myschem.table2(col2)

> When col1 already exists, I expected that nothing would happen.
> [ but actually, it creates an FK constraint anyway ]

As I've said many times before, I hate CREATE IF NOT EXISTS with a
passion, because its semantics are so squishy.  This is a perfect example
of that: it's impossible to make a principled decision whether this is a
bug or not, or what the correct behavior is if you think it's a bug.
Should the command do nothing at all if col1 exists, regardless of
whether there's an FK constraint or not?  Should it avoid creating
a duplicate constraint, and if so how picky are we to be about what
"duplicate" means?  What happens if myschem.table2(col2) doesn't exist?

Not to mention whether we should change the behavior for other secondary
objects that might be shown in the command, such as UNIQUE or CHECK
constraints.  Right now all of those get added, possibly redundantly,
just like FK constraints.

I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS
was added, and the camp that wanted it thought this behavior was fine.
Even if we were now to conclude that this is a bug and agree on what'd be
better semantics, there would be a pretty strong backwards-compatibility
argument against changing it; some people's scripts might expect the
constraint(s) to get added.

The short answer is that IF NOT EXISTS gives you no guarantees whatsoever
about the subsequent properties of the object, only that something by
that name will exist.  If you don't like that, don't use IF NOT EXISTS.

            regards, tom lane


Re: BUG #15361: Add column if not exists create duplicate constraint

От
Tom Lane
Дата:
I wrote:
> I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS
> was added, and the camp that wanted it thought this behavior was fine.

After a bit of digging in the archives, I failed to find any evidence
that the point was considered when the ADD COLUMN IF NOT EXISTS patch
went in, though there have certainly been related threads in the dim
past, which basically led to the conclusion that you don't get to have
much certainty about what state IF NOT EXISTS leaves behind, and those
who think it's a good feature think that's OK.  (I am not among them.)

The thing I was half-remembering was probably this bug report about
the unique-constraint aspect of the same problem:

https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org

wherein it was argued that if the IF NOT EXISTS fires, it should prevent
all side-effects of the ADD COLUMN clause it's attached to, whether or not
any of the subsidiary objects exist (in some form) already.  If you buy
that theory then there's a sketch for a fix there.  Nobody's done anything
about it AFAIK.

            regards, tom lane


RE: BUG #15361: Add column if not exists create duplicate constraint

От
Olivier Leprêtre
Дата:
Hi Tom,

Thanks for your point of view. I'm far from being a skilled postgres user so
I could not discuss it much. From a "lambda" point of you, I found
interesting to use those IF EXISTS because it saved me a few lines of code
to check the information_table.columns and add the test accordingly. I
appreciate it was doing that for me, code contains already so much lines
just to check if everything is ok/not ok and rather few for the algorithm by
itself).

I expected that if column was not added, it was obvious that references
should not, sort of transaction rollback. Thinking about backward
compatibility, I can't figure out who could have used ADD COLUMN IF NOT
EXISTS ...REFERENCES ... , expecting that the REFERENCES will be created
anyway, but I do not have your experience.

If this behaviour is not changed, perhaps would it be enough to modify docs
which states "IF NOT EXISTS is specified and a column already exists with
this name, no error is thrown" adding something like "but others statements
like REFERENCES will be executed anyway."

BR

Olivier
-----Message d'origine-----
De : Tom Lane [mailto:tgl@sss.pgh.pa.us]
Envoyé : samedi 1 septembre 2018 20:25
À : postgresql@noetika.com
Cc : pgsql-bugs@lists.postgresql.org
Objet : Re: BUG #15361: Add column if not exists create duplicate constraint

=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
> I have a patching script that is supposed to add column if not existing :

> ALTER TABLE myschem.table1
>           ADD COLUMN IF NOT EXISTS col1 VARCHAR(254) REFERENCES
> myschem.table2(col2)

> When col1 already exists, I expected that nothing would happen.
> [ but actually, it creates an FK constraint anyway ]

As I've said many times before, I hate CREATE IF NOT EXISTS with a passion,
because its semantics are so squishy.  This is a perfect example of that:
it's impossible to make a principled decision whether this is a bug or not,
or what the correct behavior is if you think it's a bug.
Should the command do nothing at all if col1 exists, regardless of whether
there's an FK constraint or not?  Should it avoid creating a duplicate
constraint, and if so how picky are we to be about what "duplicate" means?
What happens if myschem.table2(col2) doesn't exist?

Not to mention whether we should change the behavior for other secondary
objects that might be shown in the command, such as UNIQUE or CHECK
constraints.  Right now all of those get added, possibly redundantly, just
like FK constraints.

I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS was
added, and the camp that wanted it thought this behavior was fine.
Even if we were now to conclude that this is a bug and agree on what'd be
better semantics, there would be a pretty strong backwards-compatibility
argument against changing it; some people's scripts might expect the
constraint(s) to get added.

The short answer is that IF NOT EXISTS gives you no guarantees whatsoever
about the subsequent properties of the object, only that something by that
name will exist.  If you don't like that, don't use IF NOT EXISTS.

            regards, tom lane



Re: BUG #15361: Add column if not exists create duplicate constraint

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I wrote:
> > I believe this exact issue was debated when ADD COLUMN IF NOT EXISTS
> > was added, and the camp that wanted it thought this behavior was fine.
>
> After a bit of digging in the archives, I failed to find any evidence
> that the point was considered when the ADD COLUMN IF NOT EXISTS patch
> went in, though there have certainly been related threads in the dim
> past, which basically led to the conclusion that you don't get to have
> much certainty about what state IF NOT EXISTS leaves behind, and those
> who think it's a good feature think that's OK.  (I am not among them.)
>
> The thing I was half-remembering was probably this bug report about
> the unique-constraint aspect of the same problem:
>
> https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org
>
> wherein it was argued that if the IF NOT EXISTS fires, it should prevent
> all side-effects of the ADD COLUMN clause it's attached to, whether or not
> any of the subsidiary objects exist (in some form) already.  If you buy
> that theory then there's a sketch for a fix there.  Nobody's done anything
> about it AFAIK.

For my 2c, at least, I continue to be of the opinion (as it seems the OP
is also..) that IF NOT EXISTS means "ONLY DO THIS IF THE OBJECT
REFERENCED DOESN'T EXIST".  There certainly seemed to be a lack of
dissenting opinion by the end of that thread, so maybe it's time to
actually make that change.  I still consider the current behavior to be
a bug, but I could see an argument for not back-patching it out of
concern about breaking scripts in a back-patch and because the code
looks like it'd need to be whacked around some.

Thanks!

Stephen

Вложения

Re: BUG #15361: Add column if not exists create duplicate constraint

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> The thing I was half-remembering was probably this bug report about
>> the unique-constraint aspect of the same problem:
>> https://www.postgresql.org/message-id/flat/152509815280.19803.16118194452213577808%40wrigleys.postgresql.org
>> wherein it was argued that if the IF NOT EXISTS fires, it should prevent
>> all side-effects of the ADD COLUMN clause it's attached to, whether or not
>> any of the subsidiary objects exist (in some form) already.  If you buy
>> that theory then there's a sketch for a fix there.  Nobody's done anything
>> about it AFAIK.

> For my 2c, at least, I continue to be of the opinion (as it seems the OP
> is also..) that IF NOT EXISTS means "ONLY DO THIS IF THE OBJECT
> REFERENCED DOESN'T EXIST".

The problem is precisely that it's very fuzzy exactly what THIS is.
The case at hand, with an FK constraint, is maybe even a better
demonstration of that than the cases we considered previously.
Currently, if the column exists but lacks an FK constraint, the code
will make sure one gets added; with what you propose, it won't.
You can argue all day about which semantics are more useful, but I don't
see that there's a bright line dividing right from wrong here ---
especially since, AFAICS, there's not currently any way for a user to
get the other behavior if she doesn't like the one we choose.

The latter point could be addressed if there were also an IF NOT EXISTS
syntax for constraints.  Then the useful aspect of the current behavior
could be duplicated with

ALTER TABLE t
  ADD COLUMN IF NOT EXISTS c1 int,
  ADD CONSTRAINT IF NOT EXISTS c1_fk FOREIGN KEY (c1) REFERENCES othert(c2);

I'd imagine that we'd address the "what is it that doesn't exist" issue
by saying that you have to attach IF NOT EXISTS to named-constraint
syntax, and what's checked is just the existence of a constraint by that
name, not whether its properties are the same.  So this isn't an exact
replacement for what happens now, but it's probably close enough to
cover the real use-cases.

> I still consider the current behavior to be
> a bug, but I could see an argument for not back-patching it out of
> concern about breaking scripts in a back-patch and because the code
> looks like it'd need to be whacked around some.

It's definitely not going to be a back-patchable change.  But I could
get behind a change that addresses the confusion without taking away
arguably-useful functionality.

            regards, tom lane


Re: Re: BUG #15361: Add column if not exists create duplicateconstraint

От
Jamie Strachan
Дата:
On 2018-09-02 12:38 PM, Tom Lane wrote:
The problem is precisely that it's very fuzzy exactly what THIS is.
The case at hand, with an FK constraint, is maybe even a better
demonstration of that than the cases we considered previously.
Currently, if the column exists but lacks an FK constraint, the code
will make sure one gets added; with what you propose, it won't.
You can argue all day about which semantics are more useful, but I don't
see that there's a bright line dividing right from wrong here ---
especially since, AFAICS, there's not currently any way for a user to
get the other behavior if she doesn't like the one we choose.
I would like to suggest that to find the bright line, we look at the "IF NOT EXISTS" of the CREATE TABLE command:
IF NOT EXISTS

Do not throw an error if a relation with the same name already exists. A notice is issued in this case. Note that there is no guarantee that the existing relation is anything like the one that would have been created.

So, that version doesn't automatically fix the table to match the new specification.

Also, the IF NOT EXISTS of CREATE SEQUENCE:

IF NOT EXISTS

Do not throw an error if a relation with the same name already exists. A notice is issued in this case. Note that there is no guarantee that the existing relation is anything like the sequence that would have been created - it might not even be a sequence.

This command doesn't even guarantee the result is a sequence!

My (admittedly, unimportant) opinion is that this is a bug.  For CREATE TABLE and CREATE SEQUENCE, there are two possible states that the database can be in.  Either with the old table/sequence definition, or with the new definition.

With the current behaviour of ADD COLUMN IF NOT EXISTS with CONSTRAINT(s), you cannot know what the resulting state is, other than that there will be one more constraint added to the column.
Furthermore, I submit the following interaction:

jstrachan=# create table test (foo integer);
CREATE TABLE
jstrachan=# alter table test add column if not exists foo boolean;
ALTER TABLE
jstrachan=# \d test
     Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 foo    | integer |

If the ADD COLUMN command should automatically fix the constraints of the possibly-added column, then it should also automatically change the datatype!


Thanks very much for all your work, and PostgreSQL is awesome.

Jamie Strachan