Обсуждение: BUG #15180: Alter table add column if not exists with uniqueconstraint will add extra duplicate

Поиск
Список
Период
Сортировка

BUG #15180: Alter table add column if not exists with uniqueconstraint will add extra duplicate

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      15180
Logged by:          Olav Gjerde
Email address:      olav@backupbay.com
PostgreSQL version: 10.1
Operating system:   x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 6.3.0
Description:

Alter table add column if not exists with unique constraint will add extra
duplicate of the unique constraint when the column exists.

Example:
ALTER TABLE api_values ADD COLUMN IF NOT EXISTS master_key bigint NULL
UNIQUE;

If you run this several times you will get more unique constraints:
"api_values_master_key_key" UNIQUE CONSTRAINT, btree (master_key)
"api_values_master_key_key1" UNIQUE CONSTRAINT, btree (master_key)
"api_values_master_key_key2" UNIQUE CONSTRAINT, btree (master_key)

Workaround is just dropping the constraint if exits before the alter table
add column statement. But I am afraid a lot of developers will enter this
trap as it is kinda unexpected behavior. I read there is a similar problem
with serial/sequences.


On Mon, Apr 30, 2018 at 02:22:32PM +0000, PG Bug reporting form wrote:
> Alter table add column if not exists with unique constraint will add extra
> duplicate of the unique constraint when the column exists.
>
> Example:
> ALTER TABLE api_values ADD COLUMN IF NOT EXISTS master_key bigint NULL
> UNIQUE;
>
> If you run this several times you will get more unique constraints:
> "api_values_master_key_key" UNIQUE CONSTRAINT, btree (master_key)
> "api_values_master_key_key1" UNIQUE CONSTRAINT, btree (master_key)
> "api_values_master_key_key2" UNIQUE CONSTRAINT, btree (master_key)
>
> Workaround is just dropping the constraint if exits before the alter table
> add column statement. But I am afraid a lot of developers will enter this
> trap as it is kinda unexpected behavior. I read there is a similar problem
> with serial/sequences.

I am sure you mean that:
https://www.postgresql.org/message-id/12400.1506455842@sss.pgh.pa.us
Similarly, the index is created before deciding to make use of it in the
new column...  I don't recall all the details in tablecmds.c, but I am
pretty sure that it would be quite messy to create the column before
creating the index itself.
--
Michael

Вложения
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, Apr 30, 2018 at 02:22:32PM +0000, PG Bug reporting form wrote:
>> Alter table add column if not exists with unique constraint will add extra
>> duplicate of the unique constraint when the column exists.
>> Example:
>> ALTER TABLE api_values ADD COLUMN IF NOT EXISTS master_key bigint NULL
>> UNIQUE;

> ...  I don't recall all the details in tablecmds.c, but I am
> pretty sure that it would be quite messy to create the column before
> creating the index itself.

After thinking about it for awhile, I'm not exactly convinced that this
is a bug at all.  "UNIQUE" underspecifies the desired index, so that it's
impossible to say that a given existing index does or does not match the
command.  The code errs in favor of deciding that it doesn't, but the
opposite decision could also be "wrong" in some use-cases.

I'll spare you my usual rant about how CREATE IF NOT EXISTS sucks because
the subsequent state of the object isn't well-defined ... oops, too late.
But this seems like just another case of that problem.

            regards, tom lane


Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Michael Paquier <michael@paquier.xyz> writes:
> > On Mon, Apr 30, 2018 at 02:22:32PM +0000, PG Bug reporting form wrote:
> >> Alter table add column if not exists with unique constraint will add extra
> >> duplicate of the unique constraint when the column exists.
> >> Example:
> >> ALTER TABLE api_values ADD COLUMN IF NOT EXISTS master_key bigint NULL
> >> UNIQUE;
>
> > ...  I don't recall all the details in tablecmds.c, but I am
> > pretty sure that it would be quite messy to create the column before
> > creating the index itself.
>
> After thinking about it for awhile, I'm not exactly convinced that this
> is a bug at all.  "UNIQUE" underspecifies the desired index, so that it's
> impossible to say that a given existing index does or does not match the
> command.  The code errs in favor of deciding that it doesn't, but the
> opposite decision could also be "wrong" in some use-cases.

How is any of that relevant?  The column either exists, or it doesn't.
If the column exists, then the entire add-column operation (and anything
associated with it) should become a no-op, not only half of it.

> I'll spare you my usual rant about how CREATE IF NOT EXISTS sucks because
> the subsequent state of the object isn't well-defined ... oops, too late.
> But this seems like just another case of that problem.

My understanding, and, it seems, that of the original poster and likely
anyone who works with PG, is that an IF NOT EXISTS means "only perform
this operation if the thing doesn't already exist."  Even the NOTICE
reported from the above command says that we're skipping things:

=> create table t4 (c1 int);
CREATE TABLE
=*> alter table t4 add column if not exists c2 bigint null unique;
ALTER TABLE
=*> alter table t4 add column if not exists c2 bigint null unique;
NOTICE:  column "c2" of relation "t4" already exists, skipping
ALTER TABLE

It doesn't say anything about "well, we didn't add the column, but we
did add the constraint that you asked for to some column that happened
to already exist with that name."  That hardly seemed like we "skipped"
doing the operation.

This clearly looks like a bug to me, and one we should figure out how to
fix.  We don't make CREATE TABLE IF NOT EXIST go monkey with the
existing table definition when the table already exists, and I certainly
don't see any argument here that it makes sense for the behavior of
ALTER TABLE ... ADD COLUMN IF NOT EXISTS to be different from that.

Thanks!

Stephen

Вложения
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I'll spare you my usual rant about how CREATE IF NOT EXISTS sucks because
>> the subsequent state of the object isn't well-defined ... oops, too late.
>> But this seems like just another case of that problem.

> My understanding, and, it seems, that of the original poster and likely
> anyone who works with PG, is that an IF NOT EXISTS means "only perform
> this operation if the thing doesn't already exist.

Not clear.  If the column exists but there's no unique index on it,
should this command cause the index to spring into existence?
C.I.N.E. is by its nature too fuzzy to allow any principled answer to
that.  The facts on the ground are that right now, this does result in
creation of an index.  If we take that away, I guarantee you somebody
else will file a bug report complaining that it used to work and we
broke their code.

            regards, tom lane


Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> I'll spare you my usual rant about how CREATE IF NOT EXISTS sucks because
> >> the subsequent state of the object isn't well-defined ... oops, too late.
> >> But this seems like just another case of that problem.
>
> > My understanding, and, it seems, that of the original poster and likely
> > anyone who works with PG, is that an IF NOT EXISTS means "only perform
> > this operation if the thing doesn't already exist.
>
> Not clear.  If the column exists but there's no unique index on it,
> should this command cause the index to spring into existence?

No, it shouldn't, and I view that as quite clear.

The operation is "ADD COLUMN IF NOT EXISTS ... column definition."

If the column exists then the operation should be a noop.

ALTER TABLE has a very explicit way to segregate independent operations
using the ','.  In this case, there's only one operation being requested
and it's the ADD COLUMN, and that's what the IF NOT EXISTS applies to.

> C.I.N.E. is by its nature too fuzzy to allow any principled answer to
> that.  The facts on the ground are that right now, this does result in
> creation of an index.  If we take that away, I guarantee you somebody
> else will file a bug report complaining that it used to work and we
> broke their code.

I disagree entirely.  If someone was depending on that misbehavior then
that's their bug and issue to deal with.

Thanks!

Stephen

Вложения
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Not clear.  If the column exists but there's no unique index on it,
>> should this command cause the index to spring into existence?

> No, it shouldn't, and I view that as quite clear.  

I'm not exactly convinced; but if we wanted to deal with this case
as well as the SERIAL-column case that was complained of earlier,
I think we'd have to proceed more or less like this:

* Extend the execution state of ALTER TABLE to include an array of
flags.  (In the general case, we need a separate flag for each IF
NOT EXISTS clause in the command.)

* Invent new AT sub-command(s) that test the existence of columns,
or any other sub-objects we have IF NOT EXISTS for, and set a
specified flag based on the result.

* Extend AlterTableCmd with a field that marks a particular subcommand
as to be executed or not conditionally on the state of flag N.  Remove
the existing special-case conditional behavior in ATExecAddColumn.

* Extend transformAlterTableStmt so that, whenever there is an IF NOT
EXISTS flag on a given subcommand, it assigns a flag number for that
condition, prepends a suitable test subcommand to the list of
AlterTableCmds it's building, and marks all the AlterTableCmds generated
from that subcommand as conditional on that flag.

This is kind of complicated, but in order to handle the SERIAL-column
case, we have to do the sequence creation conditionally before the
column is created, so I don't think we can make it any simpler.

A possible objection to this design is that right now, you can do

    alter table foo
      add column if not exists f2 text,
      add column if not exists f2 int;

and it will skip the second ADD subcommand because by that point the
column exists.  With this design, both test subcommands would find that
f2 doesn't exist so we'd try to do both ADD subcommands, and the second
one would fail.  That doesn't particularly bother me, because this
command is silly.  Conceivably, transformAlterTableStmt could be taught
to check for identical test subcommands and turn the later ones into
constant-false results (or throw away the later subcommands entirely);
but I'm doubtful that it's worth the trouble.

            regards, tom lane


Re: BUG #15180: Alter table add column if not exists with uniqueconstraint will add extra duplicate

От
"David G. Johnston"
Дата:
On Tue, May 1, 2018 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
A possible objection to this design is that right now, you can do

        alter table foo
          add column if not exists f2 text,
          add column if not exists f2 int;

and it will skip the second ADD subcommand because by that point the
column exists.  With this design, both test subcommands would find that
f2 doesn't exist so we'd try to do both ADD subcommands, and the second
one would fail.  That doesn't particularly bother me, because this
command is silly.

​I'd argue its probably not that silly when you consider copy-paste errors - in which case actually failing instead of silently ignoring the second instance of the same name would be looked upon favorably by the user.  And since pg_dump isn't going to be affected by this the new error seems more positive than negative on-the-whole.

As with the serial thread I'm definitely +1 for making this behave in a less surprising manner - and Tom's proposed flow seems to me to match the expected behavior (I agree with Stephen here, I think I got bogged down in the sequence aspects of the previous thread when having it be this simple is preferable).

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Tue, May 1, 2018 at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A possible objection to this design is that right now, you can do
>> 
>>     alter table foo
>>       add column if not exists f2 text,
>>       add column if not exists f2 int;
>> 
>> and it will skip the second ADD subcommand because by that point the
>> column exists.  With this design, both test subcommands would find that
>> f2 doesn't exist so we'd try to do both ADD subcommands, and the second
>> one would fail.  That doesn't particularly bother me, because this
>> command is silly.

> ​I'd argue its probably not that silly when you consider copy-paste errors
> - in which case actually failing instead of silently ignoring the second
> instance of the same name would be looked upon favorably by the user.

Yeah, good point.

I also noticed while looking at the code that AT_AddOids is a rather
klugy implementation of, effectively, ADD COLUMN IF NOT EXISTS.
Perhaps it'd save code to fold it into this mechanism.  Or maybe not,
since the OID column is pretty special-casey anyway.

            regards, tom lane