Обсуждение: Identity column behavior discrepancies when inserting one or many rows

Поиск
Список
Период
Сортировка

Identity column behavior discrepancies when inserting one or many rows

От
Jean Prulière
Дата:
Hi there !

I am teaching PostgreSQL at an online school and we recently stumbled upon a rather strange behavior while experimenting with the identity column feature. I know it is not something any skilled person would do but it is still, imho, different from what the documentation states. Our students love testing and it is quite frustrating if no one can explain what they found.

Here is a very simple script to reproduce what we encountered :

CREATE TABLE test (
   id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   sometext text NOT NULL
);

-- that one works : INSERT 0 1
-- I think it should not and throw a 428C9
INSERT INTO test (id, sometext)
VALUES (DEFAULT, 'test');

-- this one does not : SQL state : 428C9
INSERT INTO test (id, sometext)
VALUES (DEFAULT, 'test2'),
(DEFAULT, 'test3');
 
Though GENERATED ALWAYS implies the absence of the identity column in the column names list of any INSERT statement, listing it there is OK, without the OVERRIDING SYSTEM VALUE flag, as long as only one row is inserted (and DEFAULT is used as value, of course). But starting at 2 rows (and I can only guess it never stops), the proper error is thrown, advising to use the aforementioned flag.

Is there any reason I missed such behavior would be expected ?

I upgraded my cluster (was 11.9, is now 13) to the latest version to be sure it is still effective : it is.

Here is the version and platform info returned by calling version() : PostgreSQL 13.0 (Ubuntu 13.0-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit

It is my first bug report, sorry if I failed to report some piece of info :-|

Peace !

Jean Pruliere



Re: Identity column behavior discrepancies when inserting one or many rows

От
"David G. Johnston"
Дата:
On Mon, Oct 12, 2020 at 4:04 AM Jean Prulière <jean@oclock.io> wrote:
Here is a very simple script to reproduce what we encountered :

CREATE TABLE test (
   id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   sometext text NOT NULL
);

-- that one works : INSERT 0 1
-- I think it should not and throw a 428C9
INSERT INTO test (id, sometext)
VALUES (DEFAULT, 'test');

-- this one does not : SQL state : 428C9
INSERT INTO test (id, sometext)
VALUES (DEFAULT, 'test2'),
(DEFAULT, 'test3');
 
Though GENERATED ALWAYS implies the absence of the identity column in the column names list of any INSERT statement, listing it there is OK, without the OVERRIDING SYSTEM VALUE flag, as long as only one row is inserted (and DEFAULT is used as value, of course). But starting at 2 rows (and I can only guess it never stops), the proper error is thrown, advising to use the aforementioned flag.

Is there any reason I missed such behavior would be expected ?

I agree the inconsistency seems to be undocumented but I wonder why the second case doesn't work, not why the first one does.  INSERT says: "For a generated column, specifying this is permitted but merely specifies the normal behavior of computing the column from its generation expression."  That says the first case works, default means the same as omitting the column altogether.

David J.

Re: Identity column behavior discrepancies when inserting one or many rows

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Oct 12, 2020 at 4:04 AM Jean Prulière <jean@oclock.io> wrote:
>> Here is a very simple script to reproduce what we encountered :
>> 
>> CREATE TABLE test (
>> id int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
>> sometext text NOT NULL
>> );
>> 
>> -- that one works : INSERT 0 1
>> -- I think it should not and throw a 428C9
>> INSERT INTO test (id, sometext)
>> 
>> VALUES (DEFAULT, 'test');
>> 
>> -- this one does not : SQL state : 428C9
>> INSERT INTO test (id, sometext)
>> 
>> VALUES (DEFAULT, 'test2'),
>> (DEFAULT, 'test3');

> I agree the inconsistency seems to be undocumented but I wonder why the
> second case doesn't work, not why the first one does.  INSERT says: "For a
> generated column, specifying this is permitted but merely specifies the
> normal behavior of computing the column from its generation expression."
> That says the first case works, default means the same as omitting the
> column altogether.

Yeah.  Ideally both examples should work.  The fact that the second one
does not is an implementation deficiency: the code that recognizes no-op
DEFAULTs for this purpose doesn't descend into multi-row VALUES lists.
Somebody submitted a patch to improve that awhile ago, but I didn't like
the first version and there hasn't been an update yet:

https://commitfest.postgresql.org/30/2681/

Perhaps you could get your students interested in rewriting that patch?

            regards, tom lane