Обсуждение: ON CONFLICT and WHERE

Поиск
Список
Период
Сортировка

ON CONFLICT and WHERE

От
Adrian Klaver
Дата:
In process of answering an SO question I ran across the below.

The original question example:

CREATE TABLE books (
    id int4 NOT NULL,
    version int8 NOT NULL,
    updated timestamp NULL,
    CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;

With select results as:

  id | version |          updated           |       current_timestamp 
     | ?column?
----+---------+----------------------------+--------------------------------+----------
  12 |       0 | 11/13/2022 12:21:38.032578 | 11/13/2022 12:21:38.057545 
PST | f


  id | version |          updated           |       current_timestamp 
     | ?column?
----+---------+----------------------------+--------------------------------+----------
  12 |       1 | 11/13/2022 12:21:38.058673 | 11/13/2022 12:21:40.686231 
PST | f


I have not used WHERE with ON CONFLICT myself so it took longer then I 
care to admit to correct the above to:

DROP TABLE IF EXISTS books;

CREATE TABLE books (
     id int4 NOT NULL,
     version int8 NOT NULL,
     updated timestamp NULL,
     CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12


With select results as:

  id | version |          updated           |       current_timestamp 
     | ?column?
----+---------+----------------------------+--------------------------------+----------
  12 |       0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.463705 
PST | f


id | version |          updated           |       current_timestamp 
   | ?column?
----+---------+----------------------------+--------------------------------+----------
  12 |       0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.476484 
PST | f



I ran this on both version 14 and 15 with same results.

The question is why did the first case just ignore the WHERE instead of 
throwing a syntax error?


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: ON CONFLICT and WHERE

От
Tom Lane
Дата:
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> ON CONFLICT (id)
> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
> DO UPDATE
> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

> I have not used WHERE with ON CONFLICT myself so it took longer then I 
> care to admit to correct the above to:

> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> ON CONFLICT (id)
> DO UPDATE
> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
> CURRENT_TIMESTAMP;

> The question is why did the first case just ignore the WHERE instead of 
> throwing a syntax error?

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause.  It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes.  TFM explains

    index_predicate

        Used to allow inference of partial unique indexes. Any indexes
        that satisfy the predicate (which need not actually be partial
        indexes) can be inferred.

This strikes me as a bit of a foot-gun.  I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here.  (This documentation text is about as clear as
mud, too.  What does "inferred" mean here?  I think it means "chosen as
arbiter index", but maybe I misunderstand.)

            regards, tom lane



Re: ON CONFLICT and WHERE

От
Adrian Klaver
Дата:
On 11/13/22 13:07, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
>> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
>> ON CONFLICT (id)
>> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
>> DO UPDATE
>> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;
> 
>> I have not used WHERE with ON CONFLICT myself so it took longer then I
>> care to admit to correct the above to:
> 
>> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
>> ON CONFLICT (id)
>> DO UPDATE
>> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
>> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
>> CURRENT_TIMESTAMP;
> 
>> The question is why did the first case just ignore the WHERE instead of
>> throwing a syntax error?
> 
> A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> clause.  It doesn't have any run-time effect other than to allow partial
> indexes to be chosen as arbiter indexes.  TFM explains
> 
>      index_predicate
> 
>          Used to allow inference of partial unique indexes. Any indexes
>          that satisfy the predicate (which need not actually be partial
>          indexes) can be inferred.
> 
> This strikes me as a bit of a foot-gun.  I wonder if we should make
> it safer by insisting that the resolved index be partial when there's
> a WHERE clause here.  (This documentation text is about as clear as
> mud, too.  What does "inferred" mean here?  I think it means "chosen as
> arbiter index", but maybe I misunderstand.)

Alright I see how another use of WHERE comes into play.

I do agree with the clarity of the description, especially after looking 
at the example:

"
Insert new distributor if possible; otherwise DO NOTHING. Example 
assumes a unique index has been defined that constrains values appearing 
in the did column on a subset of rows where the is_active Boolean column 
evaluates to true:

-- This statement could infer a partial unique index on "did"
-- with a predicate of "WHERE is_active", but it could also
-- just use a regular unique constraint on "did"
INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
     ON CONFLICT (did) WHERE is_active DO NOTHING;
"

I honestly cannot figure out what that is saying.

> 
>             regards, tom lane

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: ON CONFLICT and WHERE

От
Peter Geoghegan
Дата:
On Sun, Nov 13, 2022 at 1:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> clause.  It doesn't have any run-time effect other than to allow partial
> indexes to be chosen as arbiter indexes.  TFM explains
>
>     index_predicate
>
>         Used to allow inference of partial unique indexes. Any indexes
>         that satisfy the predicate (which need not actually be partial
>         indexes) can be inferred.
>
> This strikes me as a bit of a foot-gun.  I wonder if we should make
> it safer by insisting that the resolved index be partial when there's
> a WHERE clause here.

I don't think that it would be safer.

Adrian has asked why it's possible to attach an arbitrary
index_predicate type WHERE clause to an ON CONFLICT query, without
that really changing the behavior of the statement. That *is* a little
odd, so it's certainly a fair question (I can recall perhaps as many
as 5 similar questions over the years). But it's not the end of the
world, either -- there are far worse things.

I think that it would be a lot worse (just for example) to have your
ON CONFLICT query suddenly start throwing an ERROR in production, just
because you replaced a partial unique index with a unique constraint.
If we have a suitable unique index or constraint, why wouldn't we use
it in ON CONFLICT? Maybe it won't work out that way (maybe there won't
be any suitable unique index or constraint), but why not do our utmost
to insulate the user from what might be a serious production issue?
That was the guiding principle.

Overall I'm quite happy with the amount of foot-guns ON CONFLICT has,
especially compared to other comparable features in other DB systems
(which had plenty). There are one or two ostensibly odd things about
the syntax that are downstream consequences of trying to make the
constraint/unique index inference process maximally forgiving. I'm
pretty happy with that trade-off.

> (This documentation text is about as clear as
> mud, too.  What does "inferred" mean here?  I think it means "chosen as
> arbiter index", but maybe I misunderstand.)

Unique index/constraint inference is the process by which we choose an
arbiter index. See the second paragraph of the "ON CONFLICT Clause"
section of the INSERT docs.

-- 
Peter Geoghegan



Re: ON CONFLICT and WHERE

От
jian he
Дата:


On Mon, Nov 14, 2022 at 2:55 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 11/13/22 13:07, Tom Lane wrote:
> Adrian Klaver <adrian.klaver@aklaver.com> writes:
>> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
>> ON CONFLICT (id)
>> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
>> DO UPDATE
>> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;
>
>> I have not used WHERE with ON CONFLICT myself so it took longer then I
>> care to admit to correct the above to:
>
>> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
>> ON CONFLICT (id)
>> DO UPDATE
>> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
>> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
>> CURRENT_TIMESTAMP;
>
>> The question is why did the first case just ignore the WHERE instead of
>> throwing a syntax error?
>
> A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> clause.  It doesn't have any run-time effect other than to allow partial
> indexes to be chosen as arbiter indexes.  TFM explains
>
>      index_predicate
>
>          Used to allow inference of partial unique indexes. Any indexes
>          that satisfy the predicate (which need not actually be partial
>          indexes) can be inferred.
>
> This strikes me as a bit of a foot-gun.  I wonder if we should make
> it safer by insisting that the resolved index be partial when there's
> a WHERE clause here.  (This documentation text is about as clear as
> mud, too.  What does "inferred" mean here?  I think it means "chosen as
> arbiter index", but maybe I misunderstand.)

Alright I see how another use of WHERE comes into play.

I do agree with the clarity of the description, especially after looking
at the example:

"
Insert new distributor if possible; otherwise DO NOTHING. Example
assumes a unique index has been defined that constrains values appearing
in the did column on a subset of rows where the is_active Boolean column
evaluates to true:

-- This statement could infer a partial unique index on "did"
-- with a predicate of "WHERE is_active", but it could also
-- just use a regular unique constraint on "did"
INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
     ON CONFLICT (did) WHERE is_active DO NOTHING;
"

I honestly cannot figure out what that is saying.

>
>                       regards, tom lane

--
Adrian Klaver
adrian.klaver@aklaver.com



 
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

Since id is already the primary key, it skipped the WHERE part. it resolves to the DO UPDATE part.

from test code.
create table insertconflicttest(key int4, fruit text);
create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';

In this case, the  on conflict clause should be exactly like on conflict (key) where fruit like '%berry'

-- fails
insert into insertconflicttest values (23, 'Blackberry') on conflict (key) do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT specification
insert into insertconflicttest values (23, 'Blackberry') on conflict (key) where fruit like '%berry' or fruit = 'consequential' do nothing;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT specification
insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) where fruit like '%berry' do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT specification


--
 I recommend David Deutsch's <<The Beginning of Infinity>>

  Jian