Обсуждение: INSERT ... ON CONFLICT syntax issues

Поиск
Список
Период
Сортировка

INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
I'm separating this discussion out of the  thread because I think this
needs wider input.

On 2015-04-24 19:21:37 -0700, Peter Geoghegan wrote:
> I've *provisionally* pushed code that goes back to the old way,
> Andres: https://github.com/petergeoghegan/postgres/commit/2a5d80b27d2c5832ad26dde4651c64dd2004f401
>
> Perhaps this is the least worst way, after all.

I still think it's a bad idea. To recap, the old and current way is:

INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE

My problem with the WHERE being inside the parens in the above is that
it's
a) different from CREATE INDEX
b) unclear whether the WHERE belongs to colb or the whole index  expression. The equivalent for aggregates, which I bet
isgoing to be  used less often, caused a fair amount of confusing.
 

That's why I wanted the WHERE outside the (), which requires either
adding DO between the index inference clause, and the action, to avoid
ambiguities in the grammar.


But I'm generally having some doubts about the syntax.

Right now it's
INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.

A couple things:

a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What  if we, at some later point, also want to
handleother kind of  violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
 
b) For me there's a WITH before the index inference clause missing, to  have it read in 'SQL' style.
c) Right now the UPDATE can refer to pseudo relations 'TARGET' and  'EXCLUDED'. I think especially the latter doesn't
fitanymore at  all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
 

So I guess it boils down to that I think we should switch the syntax to
be:

INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE}

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote:
> My problem with the WHERE being inside the parens in the above is that
> it's
> a) different from CREATE INDEX

I don't think that that's an important goal.

> b) unclear whether the WHERE belongs to colb or the whole index
>    expression. The equivalent for aggregates, which I bet is going to be
>    used less often, caused a fair amount of confusing.

I don't see those two situations as being comparable. The inference
specification does not accept aggregates. It seems obvious to me that
the predicate only ever applies to the entire table. And it's obvious
that it's part of the inference specification because it appears in
parentheses with everything else - otherwise, *users* might find this
phantom WHERE clause ambiguous/confusing.

> But I'm generally having some doubts about the syntax.
>
> Right now it's
> INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
>
> A couple things:
>
> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
>    if we, at some later point, also want to handle other kind of
>    violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...

I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose). Heikki and I both feel that the CONFLICT
keyword captures the fact that this could be a dup violation, or an
exclusion violation. The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled. Note that the syntax is quite similar to the SQLite
syntax of the same feature, that has ON CONFLICT IGNORE (it also has
ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).

> b) For me there's a WITH before the index inference clause missing, to
>    have it read in 'SQL' style.

I'm not seeing it. BTW, Robert was the one who initially proposed that
the unique index inference clause follow this exact style (albeit
before it accepted a WHERE clause to infer partial indexes, which was
only added a couple of months ago).

> c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
>    'EXCLUDED'. I think especially the latter doesn't fit anymore at
>    all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?

NEW and OLD are terribly misleading, since surely the NEW tuple is the
one actually appended to the relation by the UPDATE, and the OLD one
is the existing one (not the excluded one). Plus they have all that
intellectual baggage from rules.

CONFLICTING, as Greg Stark pointed out many months ago, is something
that applies to both tuples that can be referenced, which is why I
*stopped* using it months ago. They conflict with *each other*. Any
conflict must pertain to both.

Dictionary.com defines "exclude" as:

"""
verb (used with object), excluded, excluding.
1.
to shut or keep out; prevent the entrance of.
2.
to shut out from consideration, privilege, etc.:
Employees and their relatives were excluded from participation in the contest.
3.
to expel and keep out; thrust out; eject:
He was excluded from the club for infractions of the rules.
"""

Seems pretty descriptive of the situation to me - I actually put a lot
of thought into this. Additionally, the word is widely understood by
non-native speakers. TARGET is also very descriptive, because it
situationally describes either the existing tuple actually present in
the table, or (from a RETURNING clause) the final tuple present in the
table post-UPDATE. We use the term "target" for that pervasively (in
the docs and in the code).

> So I guess it boils down to that I think we should switch the syntax to
> be:
>
> INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE}

Beauty is in the eye of the beholder and all, but that seems pretty
ugly to me. Honestly, I think we should just accept that the predicate
appears in the parentheses on the odd occasion that it appears at all
- partial unique indexes are not all that common.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-04-25 11:05:49 -0700, Peter Geoghegan wrote:
> On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote:
> > My problem with the WHERE being inside the parens in the above is that
> > it's
> > a) different from CREATE INDEX
> 
> I don't think that that's an important goal.

Given that it's used to 'match' to indexes, I can't agree.

> > b) unclear whether the WHERE belongs to colb or the whole index
> >    expression. The equivalent for aggregates, which I bet is going to be
> >    used less often, caused a fair amount of confusing.
> 
> I don't see those two situations as being comparable. The inference
> specification does not accept aggregates.

Huh? It's pretty much entirely besides the point that inference doesn't
accept aggregates. The point is that ORDER BY for aggregates has
confused users because it's inside the parens.

> > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
> >    if we, at some later point, also want to handle other kind of
> >    violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
> 
> I think that naming unique violations alone would be wrong (not to
> mention ludicrously verbose).

Why?

> The syntax has been like this for some time, and
> hasn't been a point of contention for a long time, so I thought this
> was settled.

I really don't care if it's been that for a long while. This is a not
yet commited feature.

> > b) For me there's a WITH before the index inference clause missing, to
> >    have it read in 'SQL' style.
> 
> I'm not seeing it. BTW, Robert was the one who initially proposed that
> the unique index inference clause follow this exact style (albeit
> before it accepted a WHERE clause to infer partial indexes, which was
> only added a couple of months ago).

So?

I guess I can live with that uglyness; but I'd like somebody else to
chime in.

> > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
> >    'EXCLUDED'. I think especially the latter doesn't fit anymore at
> >    all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
> 
> NEW and OLD are terribly misleading, since surely the NEW tuple is the
> one actually appended to the relation by the UPDATE, and the OLD one
> is the existing one (not the excluded one). Plus they have all that
> intellectual baggage from rules.

What 'intellectual baggage' would that be? That they're already known to
have been used in another place? I don't see the problem.

How about EXISTING and NEW?

> """
> verb (used with object), excluded, excluding.
> 1.
> to shut or keep out; prevent the entrance of.
> 2.
> to shut out from consideration, privilege, etc.:
> Employees and their relatives were excluded from participation in the contest.
> 3.
> to expel and keep out; thrust out; eject:
> He was excluded from the club for infractions of the rules.
> """
> 
> Seems pretty descriptive of the situation to me - I actually put a lot
> of thought into this. Additionally, the word is widely understood by
> non-native speakers. TARGET is also very descriptive, because it
> situationally describes either the existing tuple actually present in
> the table, or (from a RETURNING clause) the final tuple present in the
> table post-UPDATE. We use the term "target" for that pervasively (in
> the docs and in the code).

Sorry, I don't buy either argument. EXISTING and NEW would surely at
least as widely understood than EXCLUDE and TARGET. The latter does just
about no sense to me; especially from a user POV. I don't think the
existing usage of the term has much to do what it's used for here. That
it has 'morphing' characteristics imo just makes it worse, rather than
better. Besides being confusing that it has different meanings, it's far
from inconceivable that somebody wants to return values from the
preexisting, new, and merged rows.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
>> > b) unclear whether the WHERE belongs to colb or the whole index
>> >    expression. The equivalent for aggregates, which I bet is going to be
>> >    used less often, caused a fair amount of confusing.
>>
>> I don't see those two situations as being comparable. The inference
>> specification does not accept aggregates.
>
> Huh? It's pretty much entirely besides the point that inference doesn't
> accept aggregates. The point is that ORDER BY for aggregates has
> confused users because it's inside the parens.

Would any alternative cause less confusion? That's the real issue. And
I'm unconvinced that your alternative would.

>> > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
>> >    if we, at some later point, also want to handle other kind of
>> >    violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
>>
>> I think that naming unique violations alone would be wrong (not to
>> mention ludicrously verbose).
>
> Why?

Because, as I said, it might not be a unique violation at all. It
could be an exclusion violation.

>> > b) For me there's a WITH before the index inference clause missing, to
>> >    have it read in 'SQL' style.
>>
>> I'm not seeing it. BTW, Robert was the one who initially proposed that
>> the unique index inference clause follow this exact style (albeit
>> before it accepted a WHERE clause to infer partial indexes, which was
>> only added a couple of months ago).
>
> So?

So, his opinion matters if it comes down to a vote. The inference
specification syntax as implemented is exactly what he suggested (plus
I've added a predicate).

> I guess I can live with that uglyness; but I'd like somebody else to
> chime in.

Agreed.

>> > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
>> >    'EXCLUDED'. I think especially the latter doesn't fit anymore at
>> >    all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
>>
>> NEW and OLD are terribly misleading, since surely the NEW tuple is the
>> one actually appended to the relation by the UPDATE, and the OLD one
>> is the existing one (not the excluded one). Plus they have all that
>> intellectual baggage from rules.
>
> What 'intellectual baggage' would that be? That they're already known to
> have been used in another place? I don't see the problem.

The problem is that they make you think of rules, and they don't
describe what's going on at all.

>> Seems pretty descriptive of the situation to me - I actually put a lot
>> of thought into this. Additionally, the word is widely understood by
>> non-native speakers. TARGET is also very descriptive, because it
>> situationally describes either the existing tuple actually present in
>> the table, or (from a RETURNING clause) the final tuple present in the
>> table post-UPDATE. We use the term "target" for that pervasively (in
>> the docs and in the code).
>
> Sorry, I don't buy either argument. EXISTING and NEW would surely at
> least as widely understood than EXCLUDE and TARGET. The latter does just
> about no sense to me; especially from a user POV. I don't think the
> existing usage of the term has much to do what it's used for here.

Yes it does. The UPDATE docs refer to the target table in a way
intended to distinguish it from any joined-to table (FROM table). It's
clear as day. Maybe EXISTING is equally well understood as a word in
general, but it's way more ambiguous than EXCLUDED is here.

> That
> it has 'morphing' characteristics imo just makes it worse, rather than
> better. Besides being confusing that it has different meanings, it's far
> from inconceivable that somebody wants to return values from the
> preexisting, new, and merged rows.

This is how RETURNING works from UPDATEs in general. IOW, if you do an
UPDATE FROM (which is pretty similar to ON CONFLICT UPDATE,
syntax-wise), then you can only refer to the joined table's tuple and
the final post-update tuple from within RETURNING. You cannot refer to
the pre-UPDATE target tuple there either -- it's *exactly* the same
situation. Why should it be any different here? The
situational/morphing characteristic of the alias name TARGET is
therefore absolutely appropriate, in that it follows UPDATE.

To be fair, there is one unrelated slight difference with RETURNING
and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the
same way that you can reference the joined-FROM tuple within
conventional UPDATEs). This is because the pertinent information is
likely to be in the target tuple (after all, the DML statement names
the proposed-for-insertion tuples itself, directly), but more
importantly because projecting both would necessitate *always*
qualifying the RETURNING column names to resolve which tuple is
intended (UPDATE FROM will seldom be a self-join, but this will always
be like a self-join).

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sat, Apr 25, 2015 at 11:50 AM, Peter Geoghegan <pg@heroku.com> wrote:
> To be fair, there is one unrelated slight difference with RETURNING
> and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the
> same way that you can reference the joined-FROM tuple within
> conventional UPDATEs). This is because the pertinent information is
> likely to be in the target tuple (after all, the DML statement names
> the proposed-for-insertion tuples itself, directly), but more
> importantly because projecting both would necessitate *always*
> qualifying the RETURNING column names to resolve which tuple is
> intended (UPDATE FROM will seldom be a self-join, but this will always
> be like a self-join).


It also makes sense because this is the RETURNING clause of an INSERT,
not an UPDATE. So the general INSERT behavior is what is expected. It
ought to be irrelevant if tuples were projected by actually inserting
or updating. Otherwise, your UPSERT is probably misconceived.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-04-25 11:50:59 -0700, Peter Geoghegan wrote:
> On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote:
> >> > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and
> >> >    'EXCLUDED'. I think especially the latter doesn't fit anymore at
> >> >    all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD?
> >>
> >> NEW and OLD are terribly misleading, since surely the NEW tuple is the
> >> one actually appended to the relation by the UPDATE, and the OLD one
> >> is the existing one (not the excluded one). Plus they have all that
> >> intellectual baggage from rules.
> >
> > What 'intellectual baggage' would that be? That they're already known to
> > have been used in another place? I don't see the problem.
> 
> The problem is that they make you think of rules, and they don't
> describe what's going on at all.

95% of all users will know NEW/OLD from triggers, not rules. Where NEW
is used in a quite comparable way.

> >> Seems pretty descriptive of the situation to me - I actually put a lot
> >> of thought into this. Additionally, the word is widely understood by
> >> non-native speakers. TARGET is also very descriptive, because it
> >> situationally describes either the existing tuple actually present in
> >> the table, or (from a RETURNING clause) the final tuple present in the
> >> table post-UPDATE. We use the term "target" for that pervasively (in
> >> the docs and in the code).
> >
> > Sorry, I don't buy either argument. EXISTING and NEW would surely at
> > least as widely understood than EXCLUDE and TARGET. The latter does just
> > about no sense to me; especially from a user POV. I don't think the
> > existing usage of the term has much to do what it's used for here.
> 
> Yes it does. The UPDATE docs refer to the target table in a way
> intended to distinguish it from any joined-to table (FROM table). It's
> clear as day.

Which means the term is used in a different way for INSERTs and UPDATEs
already.  To me it sounds like it's a remnant of your earlier syntax
proposal for UPSERT.

> Maybe EXISTING is equally well understood as a word in general, but
> it's way more ambiguous than EXCLUDED is here.

What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make
absolutely no sense. My suggesting is to have NEW refer to the tuple
specified in the INSERT and EXISTING to the, well, pre existing tuple
that the conflict is with.

> > That
> > it has 'morphing' characteristics imo just makes it worse, rather than
> > better. Besides being confusing that it has different meanings, it's far
> > from inconceivable that somebody wants to return values from the
> > preexisting, new, and merged rows.
> 
> This is how RETURNING works from UPDATEs in general.

And there's been a patch (which unfortunately died because it's
implementation wasn't good), to allow referring to the other versions of
the tuple. It has been wished for numerous times.


> IOW, if you do an UPDATE FROM (which is pretty similar to ON CONFLICT
> UPDATE, syntax-wise), then you can only refer to the joined table's
> tuple and the final post-update tuple from within RETURNING.

> You cannot refer to the pre-UPDATE target tuple there either -- it's
> *exactly* the same situation. Why should it be any different here? The
> situational/morphing characteristic of the alias name TARGET is
> therefore absolutely appropriate, in that it follows UPDATE.

Contrasting
> TARGET is also very descriptive, because it
> situationally describes either the existing tuple actually present in
> the table, or (from a RETURNING clause) the final tuple present in the
> table post-UPDATE. We use the term "target" for that pervasively (in
> the docs and in the code).

the docs say:  Since  <literal>RETURNING</> is not part of the <command>UPDATE</>  auxiliary query, the special
<literal>ONCONFLICT UPDATE</> aliases  (<varname>TARGET</> and <varname>EXCLUDED</>) may not be  referenced;  only the
rowas it exists after updating (or  inserting) is returned.
 

So I don't understand that whole chain of argument. There's no such
morphing behaviour, unless I miss something?

2a5d80b27d2c5832ad26dde4651c64dd2004f401:
> The problem with this seems to be that it more or less
> necessitates making both IGNORE and UPDATE fully reserved keywords in
> order to avoid an ambiguity, which we prefer not to do

It does not. As mentioned in the thread DO UPDATE/NOTHING work without
anything like that.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sat, Apr 25, 2015 at 12:23 PM, Andres Freund <andres@anarazel.de> wrote:
> 95% of all users will know NEW/OLD from triggers, not rules. Where NEW
> is used in a quite comparable way.

I don't think it's comparable.

>> >> Seems pretty descriptive of the situation to me - I actually put a lot
>> >> of thought into this. Additionally, the word is widely understood by
>> >> non-native speakers. TARGET is also very descriptive, because it
>> >> situationally describes either the existing tuple actually present in
>> >> the table, or (from a RETURNING clause) the final tuple present in the
>> >> table post-UPDATE. We use the term "target" for that pervasively (in
>> >> the docs and in the code).
>> >
>> > Sorry, I don't buy either argument. EXISTING and NEW would surely at
>> > least as widely understood than EXCLUDE and TARGET. The latter does just
>> > about no sense to me; especially from a user POV. I don't think the
>> > existing usage of the term has much to do what it's used for here.
>>
>> Yes it does. The UPDATE docs refer to the target table in a way
>> intended to distinguish it from any joined-to table (FROM table). It's
>> clear as day.
>
> Which means the term is used in a different way for INSERTs and UPDATEs
> already.

No, it isn't. It both cases the table is the one involved in the parse
analysis setTargetTable() call.

>> Maybe EXISTING is equally well understood as a word in general, but
>> it's way more ambiguous than EXCLUDED is here.
>
> What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make
> absolutely no sense. My suggesting is to have NEW refer to the tuple
> specified in the INSERT and EXISTING to the, well, pre existing tuple
> that the conflict is with.

Okay, but that doesn't change my opinion.

>> > That
>> > it has 'morphing' characteristics imo just makes it worse, rather than
>> > better. Besides being confusing that it has different meanings, it's far
>> > from inconceivable that somebody wants to return values from the
>> > preexisting, new, and merged rows.
>>
>> This is how RETURNING works from UPDATEs in general.
>
> And there's been a patch (which unfortunately died because it's
> implementation wasn't good), to allow referring to the other versions of
> the tuple. It has been wished for numerous times.

Well, if that patch is ever committed, then it won't be hard to get
the behavior here too, since it is literally exactly the same code. I
don't change anything about it, and that seems to be your problem.

> the docs say:
>    Since
>    <literal>RETURNING</> is not part of the <command>UPDATE</>
>    auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
>    (<varname>TARGET</> and <varname>EXCLUDED</>) may not be
>    referenced;  only the row as it exists after updating (or
>    inserting) is returned.
>
> So I don't understand that whole chain of argument. There's no such
> morphing behaviour, unless I miss something?

That's a documentation bug (a remnant of an earlier version). Sorry
about that. You can reference TARGET from returning. It's directly
contradicted by this much earlier statement on the INSERT doc page:

"Both aliases can be used in the auxiliary query targetlist and WHERE
clause, while the TARGET alias can be used anywhere within the entire
statement (e.g., within the RETURNING clause)"

I'll go fix that.

> 2a5d80b27d2c5832ad26dde4651c64dd2004f401:
>> The problem with this seems to be that it more or less
>> necessitates making both IGNORE and UPDATE fully reserved keywords in
>> order to avoid an ambiguity, which we prefer not to do
>
> It does not. As mentioned in the thread DO UPDATE/NOTHING work without
> anything like that.

I just mean that it couldn't work as-was in the repo at that time.
This commit message was written before your proposal of 8 hours ago.

We're going to have to agree to disagree here. I've given you my
opinion. I'm burnt out on this patch, and whatever the path of least
resistance is is the path I'll take. Frankly, the only reason that I'm
putting up any kind of argument is because I don't think that your
proposal is the path of least resistance.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sat, Apr 25, 2015 at 12:35 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> > That
>>> > it has 'morphing' characteristics imo just makes it worse, rather than
>>> > better. Besides being confusing that it has different meanings, it's far
>>> > from inconceivable that somebody wants to return values from the
>>> > preexisting, new, and merged rows.
>>>
>>> This is how RETURNING works from UPDATEs in general.
>>
>> And there's been a patch (which unfortunately died because it's
>> implementation wasn't good), to allow referring to the other versions of
>> the tuple. It has been wished for numerous times.
>
> Well, if that patch is ever committed, then it won't be hard to get
> the behavior here too, since it is literally exactly the same code. I
> don't change anything about it, and that seems to be your problem.


I withdraw this remark. Even in a world where this patch is committed,
it still makes sense for the INSERT returning behavior to not be
altered (and to project only TARGET tuples even if they come from the
auxiliary UPDATE). The "join" is within the auxiliary UPDATE, not the
INSERT, and it should be no more possible to project intermediate
tuples (like EXCLUDED.*) from the INSERT's RETURNING than it is to
project CTE scan tuples from an INSERT ... RETURNING with a CTE.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 04/25/2015 12:01 PM, Andres Freund wrote:
> INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
>
> My problem with the WHERE being inside the parens in the above is that
> it's
> a) different from CREATE INDEX
> b) unclear whether the WHERE belongs to colb or the whole index
>     expression. The equivalent for aggregates, which I bet is going to be
>     used less often, caused a fair amount of confusing.
>
> That's why I wanted the WHERE outside the (), which requires either
> adding DO between the index inference clause, and the action, to avoid
> ambiguities in the grammar.

Yeah, having the WHERE outside the parens seems much nicer. What is the 
ambiguity?

> But I'm generally having some doubts about the syntax.
>
> Right now it's
> INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
>
> A couple things:
>
> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
>     if we, at some later point, also want to handle other kind of
>     violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...

As Peter said, it's also for exclusion constraints. Perhaps "ON 
CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, 
though. I think "ON CONFLICT" is fine.

> b) For me there's a WITH before the index inference clause missing, to
>     have it read in 'SQL' style.

Agreed. ON would sound more natural than WITH though:

INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...

The ability to specify a constraint by name hasn't been implemented, but 
that would read quite naturally as:

INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>On 04/25/2015 12:01 PM, Andres Freund wrote:
>> INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial])
>UPDATE|IGNORE
>>
>> My problem with the WHERE being inside the parens in the above is
>that
>> it's
>> a) different from CREATE INDEX
>> b) unclear whether the WHERE belongs to colb or the whole index
>>     expression. The equivalent for aggregates, which I bet is going
>to be
>>     used less often, caused a fair amount of confusing.
>>
>> That's why I wanted the WHERE outside the (), which requires either
>> adding DO between the index inference clause, and the action, to
>avoid
>> ambiguities in the grammar.
>
>Yeah, having the WHERE outside the parens seems much nicer. What is the
>
>ambiguity?


With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs
to.At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though.
 


>> But I'm generally having some doubts about the syntax.
>>
>> Right now it's
>> INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
>>
>> A couple things:
>>
>> a) Why is is 'CONFLICT"? We're talking about a uniquness violation.
>What
>>     if we, at some later point, also want to handle other kind of
>>     violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION
>...
>
>As Peter said, it's also for exclusion constraints. Perhaps "ON 
>CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, 
>though. I think "ON CONFLICT" is fine.

What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be quite
coolto be able to insert the referenced key on a fkey violation for some use cases.
 

>> b) For me there's a WITH before the index inference clause missing,
>to
>>     have it read in 'SQL' style.
>
>Agreed. ON would sound more natural than WITH though:
>
>INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...

I chose WITh because of the repeated DO; that's all ;)


--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: INSERT ... ON CONFLICT syntax issues

От
Petr Jelinek
Дата:
On 26/04/15 12:08, Andres Freund wrote:
> On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> On 04/25/2015 12:01 PM, Andres Freund wrote:
>>>
>>> That's why I wanted the WHERE outside the (), which requires either
>>> adding DO between the index inference clause, and the action, to
>> avoid
>>> ambiguities in the grammar.
>>
>> Yeah, having the WHERE outside the parens seems much nicer. What is the
>>
>> ambiguity?
>
>
> With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs
to.At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though.
 
>

The DO variant with WHERE outside of parenthesis sounds fine to me. Or 
at least better than the alternatives I've seen or can come up with.

>>> A couple things:
>>>
>>> a) Why is is 'CONFLICT"? We're talking about a uniquness violation.
>> What
>>>      if we, at some later point, also want to handle other kind of
>>>      violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION
>> ...
>>
>> As Peter said, it's also for exclusion constraints. Perhaps "ON
>> CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
>> though. I think "ON CONFLICT" is fine.
>
> What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be
quitecool to be able to insert the referenced key on a fkey violation for some use cases.
 
>
>>> b) For me there's a WITH before the index inference clause missing,
>> to
>>>      have it read in 'SQL' style.
>>
>> Agreed. ON would sound more natural than WITH though:
>>
>> INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
>
> I chose WITh because of the repeated DO; that's all ;)
>

The ON CONFLICT ON sounds really weird to me. Either ON CONSTRAINT 
VIOLATION (foo) or ON CONFLICT [WITH] (foo) both seem acceptable.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 04/25/2015 12:01 PM, Andres Freund wrote:
> >INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
> >
> >My problem with the WHERE being inside the parens in the above is that
> >it's
> >a) different from CREATE INDEX
> >b) unclear whether the WHERE belongs to colb or the whole index
> >    expression. The equivalent for aggregates, which I bet is going to be
> >    used less often, caused a fair amount of confusing.
> >
> >That's why I wanted the WHERE outside the (), which requires either
> >adding DO between the index inference clause, and the action, to avoid
> >ambiguities in the grammar.
>
> Yeah, having the WHERE outside the parens seems much nicer. What is
> the ambiguity?

I like having it outside the parens also.

> >But I'm generally having some doubts about the syntax.
> >
> >Right now it's
> >INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE.
> >
> >A couple things:
> >
> >a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
> >    if we, at some later point, also want to handle other kind of
> >    violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
>
> As Peter said, it's also for exclusion constraints. Perhaps "ON
> CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints,
> though. I think "ON CONFLICT" is fine.

I don't mind using "CONFLICT" here, seems to make sense to me.

> >b) For me there's a WITH before the index inference clause missing, to
> >    have it read in 'SQL' style.
>
> Agreed. ON would sound more natural than WITH though:
>
> INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ...
>
> The ability to specify a constraint by name hasn't been implemented,
> but that would read quite naturally as:
>
> INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...

I don't particularly like the double-ON in this..

I've not tried, but is the first ON required to be a full keyword?
Seems like it probably is, but just to finish the thought I had, what
about:

INSERT INTO mytable .. IF CONFLICT ON (a,b) WHERE .. THEN UPDATE

IF is currently just an unreserved keyword though.

We could use FOR though:

INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. THEN UPDATE

Though that'd probably sound better as:

INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. DO UPDATE

Another option is:

INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. DO UPDATE

Which could also be:

INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. THEN UPDATE

of course..

What's important, in my view, is to keep the simple case simple and so
I'm not particularly wedded to any of these approaches, just trying to
help with other suggestions.

INSERT INTO mytable VALUES ('key1','key2','val1','val2')
ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';

strikes me as a the 99% use-case here that we need to keep sane, and
it'd be really nice if we didn't have to include the SET clause and
duplicate those values at all..  That could be something we add later
though, I don't think it needs to be done now.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
> What's important, in my view, is to keep the simple case simple and so
> I'm not particularly wedded to any of these approaches, just trying to
> help with other suggestions.
>
> INSERT INTO mytable VALUES ('key1','key2','val1','val2')
> ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';
>
> strikes me as a the 99% use-case here that we need to keep sane, and
> it'd be really nice if we didn't have to include the SET clause and
> duplicate those values at all..  That could be something we add later
> though, I don't think it needs to be done now.

You can do that already. That's what the EXCLUDED.* alias that is
automatically added is for (the thing that Andres disliked the
spelling of - or the other thing). This is legal, for example:

INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo,
EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)';

I don't want to accept something that automatically merges the
excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
Peter,

* Peter Geoghegan (pg@heroku.com) wrote:
> On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > What's important, in my view, is to keep the simple case simple and so
> > I'm not particularly wedded to any of these approaches, just trying to
> > help with other suggestions.
> >
> > INSERT INTO mytable VALUES ('key1','key2','val1','val2')
> > ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2';
> >
> > strikes me as a the 99% use-case here that we need to keep sane, and
> > it'd be really nice if we didn't have to include the SET clause and
> > duplicate those values at all..  That could be something we add later
> > though, I don't think it needs to be done now.
>
> You can do that already. That's what the EXCLUDED.* alias that is
> automatically added is for (the thing that Andres disliked the
> spelling of - or the other thing). This is legal, for example:
>
> INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
> ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo,
> EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)';

Yeah, that's not exactly simpler and I don't expect to see it used very
often (as in, less than 1%) because of that.

> I don't want to accept something that automatically merges the
> excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
> here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT

Perhaps I'm missing it, but the reasons that I see there appear to be:

"It'd be like SELECT *" and "we'd have to decide what to do about the
value for unspecified columns".  As for the latter- we have to do that
*anyway*, no?  What happens if you do:

INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);

?

As for the "SELECT *" concern, I fail to see how it's any different from
the exact same currently-encouraged usage of INSERT + UPDATE:

INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2');

... catch the exception

UPDATE mytable SET baz = 'val1', bat = 'val2' WHERE foo = 'key1' and bar = 'key2';

Clearly there are issues with the above if someone is running around
adding columns to tables and PG has to figure out if we should be
setting the non-mentioned columns to NULL or to the default for the
column, but we're all quite happy to do so and trust that whomever is
adding the column has set a sane default and that PG will use it when
the column isn't included in either the INSERT or the UPDATE.

Note that I wasn't suggesting your "SET (*) = EXLCUDED.*" syntax and if
that would expand to something different than what I've outlined above
then it would make sense to not include it (... or fix it to act the
same, and then it's just a more verbose approach).

Further, this is *very* different from how the "SELECT *" concern can
cause things to break unexpectedly- new columns end up getting returned
which the application is unlikely to be prepared for.  That doesn't
happen here and so I don't believe it makes any sense to try and compare
the two.

Happy to discuss, of course, and apologies if I missed some other issue-
I was just reading what I found at the link provided.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> I don't want to accept something that automatically merges the
>> excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
>> here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT
>
> Perhaps I'm missing it, but the reasons that I see there appear to be:
>
> "It'd be like SELECT *" and "we'd have to decide what to do about the
> value for unspecified columns".  As for the latter- we have to do that
> *anyway*, no?  What happens if you do:
>
> INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
> ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);
>
> ?

It's like any other UPDATE - the values of columns not appearing in
the targetlist are unchanged from the original row version now
superseded. It doesn't matter that you had some other values in the
INSERT. You only get what you ask for.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
> >> I don't want to accept something that automatically merges the
> >> excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined
> >> here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT
> >
> > Perhaps I'm missing it, but the reasons that I see there appear to be:
> >
> > "It'd be like SELECT *" and "we'd have to decide what to do about the
> > value for unspecified columns".  As for the latter- we have to do that
> > *anyway*, no?  What happens if you do:
> >
> > INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2')
> > ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz);
> >
> > ?
>
> It's like any other UPDATE - the values of columns not appearing in
> the targetlist are unchanged from the original row version now
> superseded. It doesn't matter that you had some other values in the
> INSERT. You only get what you ask for.

Ok, that makes sense..  So is the concern that an INSERT would end up
getting default values while an UPDATE would preserve whatever's there?

I don't see that as an issue.

Are you still against having a way to say "go forth and update whatever
non-conflicting columns I've specified in the INSERT, if there is a
conflict"..?

Again, not saying it has to be done now, but it'd certainly be nice if
we had it initially because otherwise the ORMs and "frameworks" of the
world will be stuck supporting the more verbose approach for as long as
we support it (~5 years..).
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Ok, that makes sense..  So is the concern that an INSERT would end up
> getting default values while an UPDATE would preserve whatever's there?
>
> I don't see that as an issue.

I think it easily could be.

> Are you still against having a way to say "go forth and update whatever
> non-conflicting columns I've specified in the INSERT, if there is a
> conflict"..?
>
> Again, not saying it has to be done now, but it'd certainly be nice if
> we had it initially because otherwise the ORMs and "frameworks" of the
> world will be stuck supporting the more verbose approach for as long as
> we support it (~5 years..).

The more verbose approach is entirely necessary much of the time. For example:

insert into upsert_race_test (index, count)
values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count;

Merging like this will be a very common requirement.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Ok, that makes sense..  So is the concern that an INSERT would end up
> > getting default values while an UPDATE would preserve whatever's there?
> >
> > I don't see that as an issue.
>
> I think it easily could be.

Ok..  Can you elaborate on that?  Would it be an issue that's different
from the same thing done as independent commands?

Perhaps it'd be an issue for individuals who attempt to combine some
more complicated INSERT/UPDATE logic and don't realize that they'd get
whatever the existing value is for the non-specified columns rather than
the default value, but I'm sure they'd realize it on testing it and,
well, there's lots of ways users can misuse SQL and PG and get what they
expect 99% of the time (JOIN would be a great example..) only to have
things break one day.

> > Are you still against having a way to say "go forth and update whatever
> > non-conflicting columns I've specified in the INSERT, if there is a
> > conflict"..?
> >
> > Again, not saying it has to be done now, but it'd certainly be nice if
> > we had it initially because otherwise the ORMs and "frameworks" of the
> > world will be stuck supporting the more verbose approach for as long as
> > we support it (~5 years..).
>
> The more verbose approach is entirely necessary much of the time. For example:
>
> insert into upsert_race_test (index, count)
> values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count;
>
> Merging like this will be a very common requirement.

I was just about to reply to myself that I didn't intend to say that we
would remove the more verbose syntax but rather that they'd have to
use the more verbose syntax as long as we supported a release which
*didn't* have the simpler syntax, which would be ~5 years.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sun, Apr 26, 2015 at 11:43 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> I think it easily could be.
>
> Ok..  Can you elaborate on that?  Would it be an issue that's different
> from the same thing done as independent commands?

I think that the stuff I linked to describes my concerns exhaustively.
In any case, it's a discussion for another day.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Bruce Momjian
Дата:
On Sun, Apr 26, 2015 at 09:34:12AM -0400, Stephen Frost wrote:
> * Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> > On 04/25/2015 12:01 PM, Andres Freund wrote:
> > >INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE
> > >
> > >My problem with the WHERE being inside the parens in the above is that
> > >it's
> > >a) different from CREATE INDEX
> > >b) unclear whether the WHERE belongs to colb or the whole index
> > >    expression. The equivalent for aggregates, which I bet is going to be
> > >    used less often, caused a fair amount of confusing.
> > >
> > >That's why I wanted the WHERE outside the (), which requires either
> > >adding DO between the index inference clause, and the action, to avoid
> > >ambiguities in the grammar.
> > 
> > Yeah, having the WHERE outside the parens seems much nicer. What is
> > the ambiguity?
> 
> I like having it outside the parens also.

Agreed, and I like the DO [ UPDATE | NOTHING ] too.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Eisentraut
Дата:
On 4/25/15 2:05 PM, Peter Geoghegan wrote:
> Note that the syntax is quite similar to the SQLite
> syntax of the same feature, that has ON CONFLICT IGNORE (it also has
> ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).

I don't know anything about SQLite's syntax, but from the online syntax
diagrams

https://www.sqlite.org/lang_insert.html
https://www.sqlite.org/syntax/conflict-clause.html

it appears that they are using quite a different syntax.  The ON
CONFLICT clause is attached to a constraint, specifying the default
action for that constraint.  The INSERT command can then override this
default choice.  I think.




Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Mon, Apr 27, 2015 at 1:19 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> it appears that they are using quite a different syntax.  The ON
> CONFLICT clause is attached to a constraint, specifying the default
> action for that constraint.  The INSERT command can then override this
> default choice.  I think.

Well, MySQL's ON DUPLICATE KEY UPDATE thing is pretty close to what I
have. I intend CONFLICT as a broader term, which is somewhat similar
to SQLite (and is not needlessly verbose).

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote:
> Agreed, and I like the DO [ UPDATE | NOTHING ] too.

Here is what I think I need to do:

* Don't change the ON CONFLICT spelling.

* Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
TARGET.*). Those seem fine to me as well.

* Change the syntax to put the WHERE clause used to infer partial
indexes outside parens.

* Change the syntax to make this work, by adding the fully reserved
keyword DO. Assuming you have a partial index (WHERE is_active) on the
column "key", you're left with:

INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ;

or:

INSERT .... ON CONFLICT (key) where is_active DO NOTHING;

The DO keyword makes this work where it cannot otherwise, because it's
a RESERVED_KEYWORD.

* Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
you can name (exactly one) constraint by name. Particularly useful for
unique constraints. I really don't want to make this accept multiple
constraints, even though we may infer multiple constraints, because
messy, and is probably too complex to every be put to good use
(bearing in mind that exclusion constraints, that really need this,
will still only be supported by the IGNORE/DO NOTHING variant).

Are we in agreement?
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Mon, Apr 27, 2015 at 4:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
> * Don't change the ON CONFLICT spelling.
>
> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
> TARGET.*). Those seem fine to me as well.
>
> * Change the syntax to put the WHERE clause used to infer partial
> indexes outside parens.
>
> * Change the syntax to make this work, by adding the fully reserved
> keyword DO. Assuming you have a partial index (WHERE is_active) on the
> column "key", you're left with:
>
> INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ;
>
> or:
>
> INSERT .... ON CONFLICT (key) where is_active DO NOTHING;
>
> The DO keyword makes this work where it cannot otherwise, because it's
> a RESERVED_KEYWORD.

I've pushed code that does all this to Github. Documentation changes
will follow soon.

> * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
> you can name (exactly one) constraint by name. Particularly useful for
> unique constraints. I really don't want to make this accept multiple
> constraints, even though we may infer multiple constraints, because
> messy, and is probably too complex to every be put to good use
> (bearing in mind that exclusion constraints, that really need this,
> will still only be supported by the IGNORE/DO NOTHING variant).

I have yet to do this, but it should be fairly straightforward.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Petr Jelinek
Дата:
On 28/04/15 03:51, Peter Geoghegan wrote:
> On Mon, Apr 27, 2015 at 4:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> * Don't change the ON CONFLICT spelling.
>>
>> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
>> TARGET.*). Those seem fine to me as well.
>>
>> * Change the syntax to put the WHERE clause used to infer partial
>> indexes outside parens.
>>
>> * Change the syntax to make this work, by adding the fully reserved
>> keyword DO. Assuming you have a partial index (WHERE is_active) on the
>> column "key", you're left with:
>>
>> INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ;
>>
>> or:
>>
>> INSERT .... ON CONFLICT (key) where is_active DO NOTHING;
>>
>> The DO keyword makes this work where it cannot otherwise, because it's
>> a RESERVED_KEYWORD.
>
> I've pushed code that does all this to Github. Documentation changes
> will follow soon.
>
>> * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
>> you can name (exactly one) constraint by name. Particularly useful for
>> unique constraints. I really don't want to make this accept multiple
>> constraints, even though we may infer multiple constraints, because
>> messy, and is probably too complex to every be put to good use
>> (bearing in mind that exclusion constraints, that really need this,
>> will still only be supported by the IGNORE/DO NOTHING variant).
>
> I have yet to do this, but it should be fairly straightforward.
>

Most of this sounds reasonable to me. However I still dislike the "ON 
CONFLICT ON" part, the idea of WITH Andres suggested feels better.

I am also very sure that every time I'll write this statement I will 
have to look into manual for the names of TARGET and EXCLUDED because 
they don't seem intuitive to me at all (especially the EXCLUDED).

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
> I am also very sure that every time I'll write this statement I will have to
> look into manual for the names of TARGET and EXCLUDED because they don't
> seem intuitive to me at all (especially the EXCLUDED).

Same here. I don't understand why 'CONFLICTING' would be more ambiguous
than EXCLUDED (as Peter argued earlier). Especially given that the whole
syntax is called ON CONFLICT.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Andres Freund (andres@anarazel.de) wrote:
> On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
> > I am also very sure that every time I'll write this statement I will have to
> > look into manual for the names of TARGET and EXCLUDED because they don't
> > seem intuitive to me at all (especially the EXCLUDED).
>
> Same here. I don't understand why 'CONFLICTING' would be more ambiguous
> than EXCLUDED (as Peter argued earlier). Especially given that the whole
> syntax is called ON CONFLICT.

Any way we can alias it?  Both of those strike me as annoyingly long and
if we could allow an alias then people can do whatever they want...

No, I haven't got any suggestion on how to do that. :)

It's also something we can probably improve on in the future...
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-04-28 10:40:10 -0400, Stephen Frost wrote:
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
> > > I am also very sure that every time I'll write this statement I will have to
> > > look into manual for the names of TARGET and EXCLUDED because they don't
> > > seem intuitive to me at all (especially the EXCLUDED).
> > 
> > Same here. I don't understand why 'CONFLICTING' would be more ambiguous
> > than EXCLUDED (as Peter argued earlier). Especially given that the whole
> > syntax is called ON CONFLICT.
> 
> Any way we can alias it?  Both of those strike me as annoyingly long and
> if we could allow an alias then people can do whatever they want...
> 
> No, I haven't got any suggestion on how to do that. :)
> 
> It's also something we can probably improve on in the future...

I earlier suggested NEW/OLD. I still think that's not too bad as I don't
buy the argument that they're too associated with rules.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Andres Freund (andres@anarazel.de) wrote:
> On 2015-04-28 10:40:10 -0400, Stephen Frost wrote:
> > * Andres Freund (andres@anarazel.de) wrote:
> > > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
> > > > I am also very sure that every time I'll write this statement I will have to
> > > > look into manual for the names of TARGET and EXCLUDED because they don't
> > > > seem intuitive to me at all (especially the EXCLUDED).
> > >
> > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous
> > > than EXCLUDED (as Peter argued earlier). Especially given that the whole
> > > syntax is called ON CONFLICT.
> >
> > Any way we can alias it?  Both of those strike me as annoyingly long and
> > if we could allow an alias then people can do whatever they want...
> >
> > No, I haven't got any suggestion on how to do that. :)
> >
> > It's also something we can probably improve on in the future...
>
> I earlier suggested NEW/OLD. I still think that's not too bad as I don't
> buy the argument that they're too associated with rules.

+1, NEW/OLD seem pretty natural and I'm not worried about what they look
like in rules, and their usage in triggers matches up with what they'd
mean here, I'd think.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Petr Jelinek
Дата:
On 28/04/15 16:44, Andres Freund wrote:
> On 2015-04-28 10:40:10 -0400, Stephen Frost wrote:
>> * Andres Freund (andres@anarazel.de) wrote:
>>> On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
>>>> I am also very sure that every time I'll write this statement I will have to
>>>> look into manual for the names of TARGET and EXCLUDED because they don't
>>>> seem intuitive to me at all (especially the EXCLUDED).
>>>
>>> Same here. I don't understand why 'CONFLICTING' would be more ambiguous
>>> than EXCLUDED (as Peter argued earlier). Especially given that the whole
>>> syntax is called ON CONFLICT.
>>
>> Any way we can alias it?  Both of those strike me as annoyingly long and
>> if we could allow an alias then people can do whatever they want...
>>
>> No, I haven't got any suggestion on how to do that. :)
>>
>> It's also something we can probably improve on in the future...
>
> I earlier suggested NEW/OLD. I still think that's not too bad as I don't
> buy the argument that they're too associated with rules.
>

Hmm, I would never think of rules when talking about NEW/OLD, the 
association I have is with triggers.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: INSERT ... ON CONFLICT syntax issues

От
Geoff Winkless
Дата:
On 28 April 2015 at 15:46, Stephen Frost <sfrost@snowman.net> wrote:
+1, NEW/OLD seem pretty natural and I'm not worried about what they look
like in rules, and their usage in triggers matches up with what they'd
mean here, I'd think.
 
Since I've stuck my head above the parapet once I figured I'd give m
y 2p's worth:
​IMHO ​
NEW/OLD doesn't fit at all. 

In triggers you're applying it to something that (without the trigger) would be the new or old version of a matching row
​, so it's completely intuitive​
; in this instance without the ON CONFLICT there would never be a
​"​
new
​"​
, because it would be
​a ​
failure
​​
.
​​

​MySQL uses VALUES(columnname) to reference the intended INSERT value (what you might term "NEW") and the target name to reference "OLD". I understand that people might think the bracketed syntax isn't very pleasant because that looks like a function, but it seems more reasonable than NEW (can we use VALUES.columname?); finally I don't see why we need an "OLD" (or TARGET) at all - am I missing the point?

Geoff


Re: INSERT ... ON CONFLICT syntax issues

От
Geoff Winkless
Дата:
On 28 April 2015 at 15:57, I wrote:
​MySQL uses VALUES(columnname) to reference the intended INSERT value (what you might term "NEW") and the target name to reference "OLD". I understand that people might think the bracketed syntax isn't very pleasant because that looks like a function, but it seems more reasonable than NEW (can we use VALUES.columname?); 
 
On balance I
​think I ​
don't like VALUES.column either
​, because although it looks all fine when you're doing a single INSERT ... VALUES () it gets confusing if you're INSERTing from a SELECT.

​As you were. :(

Geoff

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, Apr 28, 2015 at 7:38 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
>> I am also very sure that every time I'll write this statement I will have to
>> look into manual for the names of TARGET and EXCLUDED because they don't
>> seem intuitive to me at all (especially the EXCLUDED).
>
> Same here. I don't understand why 'CONFLICTING' would be more ambiguous
> than EXCLUDED (as Peter argued earlier). Especially given that the whole
> syntax is called ON CONFLICT.

Because the TARGET and EXCLUDED tuples conflict with each other -
they're both conflicting.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Apr 28, 2015 at 7:38 AM, Andres Freund <andres@anarazel.de> wrote:
> > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote:
> >> I am also very sure that every time I'll write this statement I will have to
> >> look into manual for the names of TARGET and EXCLUDED because they don't
> >> seem intuitive to me at all (especially the EXCLUDED).
> >
> > Same here. I don't understand why 'CONFLICTING' would be more ambiguous
> > than EXCLUDED (as Peter argued earlier). Especially given that the whole
> > syntax is called ON CONFLICT.
>
> Because the TARGET and EXCLUDED tuples conflict with each other -
> they're both conflicting.

I agree with that, but how are NEW and OLD ambiguous?  NEW is clearly
the tuple being added, while OLD is clearly the existing tuple.

Now that I think about it though, where that'd get ugly is using this
command *inside* a trigger function, which I can certainly imagine
people will want to do...
Thanks,
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> I agree with that, but how are NEW and OLD ambiguous?  NEW is clearly
> the tuple being added, while OLD is clearly the existing tuple.

Yes, but EXCLUDED is neither the tuple being added, nor is it the new
tuple. It's something else entirely.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, Apr 28, 2015 at 9:45 AM, Peter Geoghegan <pg@heroku.com> wrote:
> Yes, but EXCLUDED is neither the tuple being added, nor is it the new
> tuple. It's something else entirely.


I mean, nor is it the existing tuple.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > I agree with that, but how are NEW and OLD ambiguous?  NEW is clearly
> > the tuple being added, while OLD is clearly the existing tuple.
>
> Yes, but EXCLUDED is neither the tuple being added, nor is it the new
> tuple. It's something else entirely.

I don't see that, it's exactly the tuple attempting to be inserted.  I
agree that it might not be the tuple originally in the INSERT statement
due to before triggers, but there isn't an alias anywhere for that.

Now, in 99% of cases there aren't going to be before triggers so I'm not
particularly worried about that distinction, nor do I think we need to
provide an alias for the tuple from the INSERT piece of the clause, but
to say that EXCLUDED isn't the tuple being added doesn't make any sense
to me, based on how I read the documentation proposed here:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

I'll further point out that the documentation doesn't bother to make the
before-trigger distinction always either:
Note that an EXCLUDED expression is used to reference values originallyproposed for insertion:

Perhaps I've missed something, but that seems to go along with the
notion that EXCLUDED references the tuple that we're attempting to add
through the INSERT, and that's certainly what I'd consider NEW to be
also.

I do think there's a real issue with using OLD/NEW because of their
usage in triggers, and I don't see any good solution to that issue. :(
Thanks,
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, Apr 28, 2015 at 7:36 AM, Petr Jelinek <petr@2ndquadrant.com> wrote:
> I am also very sure that every time I'll write this statement I will have to
> look into manual for the names of TARGET and EXCLUDED because they don't
> seem intuitive to me at all (especially the EXCLUDED).

According to wordcount.org, the word "exclude" is ranked # 5796 in the
English language in terms of frequency of use. It's in the vocabulary
of 6 year olds.

I don't know why you find it counter-intuitive, since it perfectly
describes the purpose of the tuple.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
"David G. Johnston"
Дата:
On Tue, Apr 28, 2015 at 9:58 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > I agree with that, but how are NEW and OLD ambiguous?  NEW is clearly
> > the tuple being added, while OLD is clearly the existing tuple.
>
> Yes, but EXCLUDED is neither the tuple being added, nor is it the new
> tuple. It's something else entirely.

​So?  I see this as a prime case for choosing practicality/functionality over precision.

​If I was to pick 2 words I would probably pick "PROPOSED" and "EXISTING".

But, the syntax is already verbose and being able to use a three-letter​ reference has its own appeal.


I don't see that, it's exactly the tuple attempting to be inserted.  I
agree that it might not be the tuple originally in the INSERT statement
due to before triggers, but there isn't an alias anywhere for that.

Now, in 99% of cases there aren't going to be before triggers so I'm not
particularly worried about that distinction, nor do I think we need to
provide an alias for the tuple from the INSERT piece of the clause, but
to say that EXCLUDED isn't the tuple being added doesn't make any sense
to me, based on how I read the documentation proposed here:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

​This example exemplifies the poorness of the proposed wording, IMO:

 
​[...] ​
SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'

​NEW.dname || '(formerly ' || OLD.dname || ')' reads perfectly well.

Yes, this is an isolated example...​but am I missing the fact that there is a third tuple that needs to be referenced?  

If there are only two the choices of NEW and OLD seem to be both easily learned and readable.

​David J.

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, Apr 28, 2015 at 10:36 AM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> This example exemplifies the poorness of the proposed wording, IMO:
>
>
> [...]
> SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'
>
> NEW.dname || '(formerly ' || OLD.dname || ')' reads perfectly well.
>
> Yes, this is an isolated example...but am I missing the fact that there is a
> third tuple that needs to be referenced?
>
> If there are only two the choices of NEW and OLD seem to be both easily
> learned and readable.

Whatever Andres and/or Heikki want is what I'll agree to. Honestly, I
just don't care anymore.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Robert Haas
Дата:
On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> Agreed, and I like the DO [ UPDATE | NOTHING ] too.
>
> Here is what I think I need to do:
>
> * Don't change the ON CONFLICT spelling.

What I proposed originally was ON DUPLICATE.  Not ON CONFLICT.  And I
still like that better.  ON UNIQUE CONFLICT, which Andres mentioned,
gets us there too, but it's

> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
> TARGET.*). Those seem fine to me as well.

There seem to be a few votes for NEW and OLD.  That's what I proposed
originally, and (surprise, surprise) I still like that better too.

> * Change the syntax to put the WHERE clause used to infer partial
> indexes outside parens.

+1.

> * Change the syntax to make this work, by adding the fully reserved
> keyword DO. Assuming you have a partial index (WHERE is_active) on the
> column "key", you're left with:
>
> INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ;
>
> or:
>
> INSERT .... ON CONFLICT (key) where is_active DO NOTHING;
>
> The DO keyword makes this work where it cannot otherwise, because it's
> a RESERVED_KEYWORD.

Seems fine.

> * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
> you can name (exactly one) constraint by name. Particularly useful for
> unique constraints. I really don't want to make this accept multiple
> constraints, even though we may infer multiple constraints, because
> messy, and is probably too complex to every be put to good use
> (bearing in mind that exclusion constraints, that really need this,
> will still only be supported by the IGNORE/DO NOTHING variant).

I still think that constraints should never be named in the syntax.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote:
> >> Agreed, and I like the DO [ UPDATE | NOTHING ] too.
> >
> > Here is what I think I need to do:
> >
> > * Don't change the ON CONFLICT spelling.
>
> What I proposed originally was ON DUPLICATE.  Not ON CONFLICT.  And I
> still like that better.  ON UNIQUE CONFLICT, which Andres mentioned,
> gets us there too, but it's

My general feeling is "keep it short" but I'm not particular beyond
that.  I do like the idea that we'll be able to support more options in
the future.

> > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
> > TARGET.*). Those seem fine to me as well.
>
> There seem to be a few votes for NEW and OLD.  That's what I proposed
> originally, and (surprise, surprise) I still like that better too.

I was promoting NEW/OLD, until I realized that we'd end up having a
problem in trigger functions because NEW/OLD are already defined there,
unless you have a suggestion for how to improve on that?

> > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
> > you can name (exactly one) constraint by name. Particularly useful for
> > unique constraints. I really don't want to make this accept multiple
> > constraints, even though we may infer multiple constraints, because
> > messy, and is probably too complex to every be put to good use
> > (bearing in mind that exclusion constraints, that really need this,
> > will still only be supported by the IGNORE/DO NOTHING variant).
>
> I still think that constraints should never be named in the syntax.

I guess I don't see a particular problem with that..?  Perhaps I'm
missing something, but if there's multiple ways for something to
conflict, it might be nice to be able to differentiate between them?
Then again, I'm not sure if that's what the intent here is.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Robert Haas
Дата:
On Wed, Apr 29, 2015 at 3:13 PM, Stephen Frost <sfrost@snowman.net> wrote:
> My general feeling is "keep it short" but I'm not particular beyond
> that.  I do like the idea that we'll be able to support more options in
> the future.

Yeah.  To me "ON CONFLICT" sounds like "if there's a war, then...".
So I like ON DUPLICATE, which I think is clear as crystal.  But if we
settle on something else I won't cry myself to sleep.

>> > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
>> > TARGET.*). Those seem fine to me as well.
>>
>> There seem to be a few votes for NEW and OLD.  That's what I proposed
>> originally, and (surprise, surprise) I still like that better too.
>
> I was promoting NEW/OLD, until I realized that we'd end up having a
> problem in trigger functions because NEW/OLD are already defined there,
> unless you have a suggestion for how to improve on that?

I don't.  That's a good point.

>> > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so
>> > you can name (exactly one) constraint by name. Particularly useful for
>> > unique constraints. I really don't want to make this accept multiple
>> > constraints, even though we may infer multiple constraints, because
>> > messy, and is probably too complex to every be put to good use
>> > (bearing in mind that exclusion constraints, that really need this,
>> > will still only be supported by the IGNORE/DO NOTHING variant).
>>
>> I still think that constraints should never be named in the syntax.
>
> I guess I don't see a particular problem with that..?  Perhaps I'm
> missing something, but if there's multiple ways for something to
> conflict, it might be nice to be able to differentiate between them?
> Then again, I'm not sure if that's what the intent here is.

So, with unique indexes, people can create an index concurrently, then
drop the old index concurrently, and nothing breaks.  I don't think we
have a similar capacity for constraints at the moment, but we should.
When somebody does that dance, the object names change, but all of the
DML keeps working.  That's a property I'd like to preserve.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-04-29 15:31:59 -0400, Robert Haas wrote:
> On Wed, Apr 29, 2015 at 3:13 PM, Stephen Frost <sfrost@snowman.net> wrote:
> >> I still think that constraints should never be named in the syntax.
> >
> > I guess I don't see a particular problem with that..?  Perhaps I'm
> > missing something, but if there's multiple ways for something to
> > conflict, it might be nice to be able to differentiate between them?
> > Then again, I'm not sure if that's what the intent here is.
> 
> So, with unique indexes, people can create an index concurrently, then
> drop the old index concurrently, and nothing breaks.  I don't think we
> have a similar capacity for constraints at the moment, but we should.
> When somebody does that dance, the object names change, but all of the
> DML keeps working.  That's a property I'd like to preserve.

On the other hand it's way more convenient to specify a single
constraint name than several columns and a predicate. I'm pretty sure
there's situations where I a) rather live with a smaller chance of error
during a replacement of the constraint b) if we get concurrently
replaceable constraints the naming should be doable too.

I don't see your argument strong enough to argue against allowing this
*as an alternative*.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Simon Riggs
Дата:
On 25 April 2015 at 14:05, Peter Geoghegan <pg@heroku.com> wrote:
 
> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
>    if we, at some later point, also want to handle other kind of
>    violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...

I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose). Heikki and I both feel that the CONFLICT
keyword captures the fact that this could be a dup violation, or an
exclusion violation. The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled.

I dislike the way that ignoring objections for a period leads them to be potentially discarded. I'd prefer to think that as a community we are able to listen to people even when they aren't continually present to reinforce the original objection(s).

Not supporting MySQL syntax will seem like a bizarre choice to people watching this from a distance. I accept that the patch implements useful behaviour that MySQL does not implement and we thus provide enhanced syntax, but the default should be match on PK using the MySQL syntax.
 
Note that the syntax is quite similar to the SQLite
syntax of the same feature, that has ON CONFLICT IGNORE (it also has
ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).

Why are we not also supporting ON CONFLICT REPLACE and IGNORE then?

If we are using syntax from other products then it should be identical syntax, or the argument to use it doesn't stand.

We must think about what SQL Standard people are likely to say and do. If we act as independently, our thought may be ignored. If we act in support of other previous implementations we may draw support to adopt that as the standard. Whatever the standard says we will eventually support, so we should be acting with an eye to that future.
 
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, Apr 29, 2015 at 4:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> I dislike the way that ignoring objections for a period leads them to be
> potentially discarded. I'd prefer to think that as a community we are able
> to listen to people even when they aren't continually present to reinforce
> the original objection(s).

What objection was ignored? Because, I looked just now, and the only
thing I could find of substance that you said on the MySQL syntax [1]
seemed pretty lukewarm. You mostly argued for MERGE.

> Not supporting MySQL syntax will seem like a bizarre choice to people
> watching this from a distance. I accept that the patch implements useful
> behaviour that MySQL does not implement and we thus provide enhanced syntax,
> but the default should be match on PK using the MySQL syntax.

Does it?

We can't use the MERGE syntax, because this isn't MERGE. Everything
else UPSERT-like some new and distinct custom syntax, for various
reasons.

>> Note that the syntax is quite similar to the SQLite
>> syntax of the same feature, that has ON CONFLICT IGNORE (it also has
>> ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).
>
>
> Why are we not also supporting ON CONFLICT REPLACE and IGNORE then?

The point I was trying to make was that CONFLICT also appears as a
more general term than "duplicate key" or whatever.

> If we are using syntax from other products then it should be identical
> syntax, or the argument to use it doesn't stand.

I was making a narrow point about the keyword "CONFLICT". Nothing more.

> We must think about what SQL Standard people are likely to say and do. If we
> act as independently, our thought may be ignored. If we act in support of
> other previous implementations we may draw support to adopt that as the
> standard. Whatever the standard says we will eventually support, so we
> should be acting with an eye to that future.

UPSERT seems like exactly the kind of thing that the SQL standard does
not concern itself with. For example, I have a "unique index inference
specification". The SQL standard does not have anything to say about
indexes. I would be extremely surprised if the SQL standard adopted
MySQL's UPSERT thing. They omit the "SET" on the UPDATE, probably to
fix some parser ambiguity issue. While there are some similarities to
what I have here, it's a bit shoddy.

I have requirements coming out of my ears for this patch, Simon. I
think it's odd that you're taking umbrage because I supposedly ignored
something you said 6 months ago.

[1] http://www.postgresql.org/message-id/CA+U5nMK-efLg00FhCWk=ASbET_77iSS87EGDsptq0uKzQdrV6Q@mail.gmail.com
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Stephen Frost
Дата:
* Andres Freund (andres@anarazel.de) wrote:
> On the other hand it's way more convenient to specify a single
> constraint name than several columns and a predicate. I'm pretty sure
> there's situations where I a) rather live with a smaller chance of error
> during a replacement of the constraint b) if we get concurrently
> replaceable constraints the naming should be doable too.
>
> I don't see your argument strong enough to argue against allowing this
> *as an alternative*.

Agreed.
Thanks!
    Stephen

Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> The ability to specify a constraint by name hasn't been implemented, but
> that would read quite naturally as:
>
> INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...

For the record, I have made this change on Github (Andres and I have
been doing a lot of clean-up. I point this change in particular out
because it's a behavioral change). The INSERT documentation has been
updated to reflect this, and includes an example. This copy of the
documentation is current:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, Apr 29, 2015 at 12:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
>> TARGET.*). Those seem fine to me as well.
>
> There seem to be a few votes for NEW and OLD.  That's what I proposed
> originally, and (surprise, surprise) I still like that better too.

That makes the following valid:
 INSERT INTO distributors (did, dname) VALUES (5, 'Gizmo transglobal') ON CONFLICT (did) DO UPDATE SET dname =
NEW.dnameRETURNING OLD.dname;
 

So you're projecting "OLD.dname" from RETURNING, here -- so "OLD"
refers to the row added back to the relation on update (or perhaps the
row simply inserted). That's pretty bad. I really don't want to add a
kludge to make the target relation have an alias in one context but
not in the other.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/05/2015 12:16 AM, Peter Geoghegan wrote:
> On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> The ability to specify a constraint by name hasn't been implemented, but
>> that would read quite naturally as:
>>
>> INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...
>
> For the record, I have made this change on Github ...

Great, thanks.

I'm a bit late to the party as I haven't paid much attention to the 
syntax before, but let me give some comments on this "arbiter index 
inference" thingie.


To recap, there are three variants:

A. INSERT ... ON CONFLICT DO NOTHING

No arbiter is specified. This means that a conflict on any unique or 
exclusion constraint is not allowed (and will do nothing instead). This 
variant is only accepted for DO NOTHING.

B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE

In this variant, you explicitly specify the constraint by name.

C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO 
NOTHING/UPDATE

This specifies an index (or indexes, in the corner case that there are 
several identical ones), by listing the columns/expressions and the 
predicate for a partial index. The list of columns and WHERE match the 
syntax for CREATE INDEX.


That's pretty good overall. A few questions:

1. Why is the variant without specifying an index or constraint not 
allowed with DO UPDATE? I agree it might not make much sense, but then 
again, it might. If we're afraid that it's too unsafe to be the 
"default" if you don't specify any constraint, how about allowing it 
with a more verbose "ON CONFLICT ON ANY CONSTRAINT" syntax?

2. Why can't you specify multiple constraints, even though we implicitly 
allow "any" with the first variant?


Finally, a couple of suggestions. It would be pretty handy to allow:

INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE

Also, I wonder if we should change the B syntax to be:

INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE

That would allow the syntax can be expanded in the future to specify 
conflicts on other kind of things. The "ON PRIMARY KEY" syntax should be 
unambiguous with out, because PRIMARY is a reserved keyword, but for 
example, we might want to add "ON UNIQUE INDEX <index name>" later.

- Heikki




INSERT ... ON CONFLICT error messages

От
Heikki Linnakangas
Дата:
I've read through all the error messages in the patch. Some comments:

====

postgres=# create table foo (id int4);
CREATE TABLE
postgres=# create unique index foo_y on foo (id) where id > 0;
CREATE INDEX
postgres=# insert into foo values (-1)  on conflict (id) where id > 0 do 
nothing;
ERROR:  inferred arbiter partial unique index's predicate does not cover 
tuple proposed for insertion
DETAIL:  ON CONFLICT inference clause implies that the tuple proposed 
for insertion must be covered by the predicate of partial index "foo_y".

I'm surprised. If the inserted value doesn't match the WHERE clause of 
the constraint, there is clearly no conflict, so I would assume the 
above to work without error.

====

postgres=# create table foo (id int4 primary key, col2 int4, constraint 
col2_unique unique (col2) deferrable);
CREATE TABLE
postgres=# insert into foo values (2)  on conflict on foo_pkey do nothing;
ERROR:  ON CONFLICT is not supported on relations with deferred unique 
constraints/exclusion constraints

Why not? I would understand if the specified arbiter constraint is 
deferred, but why does it matter if one of the other constraints is?

====

postgres=# create table foo (id int4 primary key);
CREATE TABLE
postgres=# begin isolation level serializable;
BEGIN
postgres=# select 1; -- to get the snapshot ?column?
----------        1
(1 row)

<in another session, insert 1>
postgres=# insert into foo values (1)  on conflict on foo_pkey do nothing;
ERROR:  could not serialize access due to concurrent insert or update 
directing alternative ON CONFLICT path

What about a delete? Don't you get a serialization error, if another 
backend deletes the conflicting tuple, and you perform an INSERT based 
on the effects of the deleting transaction, which is not supposed to be 
visible to you?

The error message is quite long. How about just giving the default 
"could not serialize access due to concurrent update" message? I don't 
think the "directing alternative ON CONFLICT path" really helps the user.

====

postgres=# insert into foo values (1), (2)  on conflict on foo_pkey do 
update set id = 2;
ERROR:  ON CONFLICT DO UPDATE command could not lock/update 
self-inserted tuple
HINT:  Ensure that no rows proposed for insertion within the same 
command have duplicate constrained values.

"lock/update" doesn't sound good. If the system knows which it is, 
should say so. But in this case, locking the row just an implementation 
detail. We're performing the UPDATE when that error happens, not locking.

How about:

ERROR:  could not update tuple that was inserted in the same command
HINT:  Ensure that the rows being inserted do not conflict with each other.

BTW, the tuple might be an updated version of an existing tuple, i.e. 
the ON CONFLICT DO UPDATE ... was fired on an earlier row. Not 
necessarily a row that was inserted in the same command. The message is 
technically correct, but you need to understand the difference between 
tuple and a row. Actually, should we avoid using the term "tuple" 
altogether in user-facing errors?

====

postgres=# insert into foo values (1)  on conflict (xmin) do nothing;
ERROR:  system columns may not appear in unique index inference 
specification

"may" is usually not the right word to use in error messages, as it 
implies permission to do something (see style guide). It's not totally 
inappropriate in this case, but how about:

ERROR:  system columns cannot be used in an ON CONFLICT clause

====

postgres=# create table foo (id int4, constraint xcheck CHECK (id > 
0));CREATE TABLE
postgres=# insert into foo values (1) on conflict on xcheck do nothing;
ERROR:  constraint in ON CONFLICT clause has no associated index

How about:

ERROR:  "xcheck" is a CHECK constraint
DETAIL:  Only unique or exclusion constraints can be used in ON CONFLICT 
ON clause.

That's assuming that "xcheck" actually is a CHECK constraint. Similarly 
for a foreign key constraint.

====

postgres=# create table foo (id int4, constraint foo_x exclude using 
gist (id with =) );
CREATE TABLE
postgres=# insert into foo values (1)  on conflict on foo_x do update 
set id=-1;
ERROR:  "foo_x" is not a unique index
DETAIL:  ON CONFLICT DO UPDATE requires unique index as arbiter.

I think it would be better to refer to the constraint here, rather than 
the index. The user specified the constraint, the fact that it's backed 
by an index is just an implementation detail.

Actually, the user specified an exclusion constraint, and expected DO 
UPDATE to work with that. But we don't support that. So we should say:

ERROR:  ON CONFLICT ... DO UPDATE not supported with exclusion constraints

====

postgres=# create table foo (id int4 primary key, t text);
CREATE TABLE
postgres=# insert into foo values (1)  on conflict (t) do nothing;
ERROR:  could not infer which unique index to use from 
expressions/columns and predicate provided for ON CONFLICT

I think we should assume that the user listed the columns correctly, but 
there is no constraint on them. The current message implies that the 
list of columns was wrong. Either case is possible, of course, but I'd 
suggest:

ERROR: there is no unique or exclusion constraint matching the ON 
CONFLICT specification

====

postgres=# insert into foo values (1)  on conflict (t nulls first) do 
nothing;
ERROR:  ON CONFLICT does not accept ordering or NULLS FIRST/LAST 
specifications
LINE 1: insert into foo values (1)  on conflict (t nulls first) do n...
^
HINT:  These factors do not affect uniqueness of indexed datums.

I'd suggest:

ERROR:  NULLS FIRST/LAST is not allowed in ON CONFLICT clause

and just leave out the hint. Or say something along the lines "You can 
just leave it out".

Any chance the errlocation could point to the actual NULLS FIRST/LAST 
clause, not the paren starting the column list? And ERRCODE_SYNTAX_ERROR 
would be more appropriate for this.

====

postgres=# insert into foo values (1)  on conflict do update set count = 
excluded.count + 1;
ERROR:  ON CONFLICT DO UPDATE requires explicit arbiter index
LINE 1: insert into foo values (1)  on conflict do update set count ...                                    ^

"Hmm, so where do I stick the index then?" or "But I just created the 
index! Why can't the system find it?"

Can we avoid exposing the term "arbiter index" to the user? From a 
user's point of view, he cannot specify an index directly. He can 
specify a list of columns and an optional WHERE clause, or a constraint 
name.

How about:

ERROR:  DO UPDATE requires a constraint name or index specification
HINT:  For example, ON CONFLICT ON <constraint> or ON CONFLICT (<column>)

====

postgres=# insert into pg_roles values ('myrole')  on conflict do nothing;
ERROR:  ON CONFLICT not supported with catalog relations
LINE 1: insert into pg_roles values ('myrole')  on conflict do nothi...
^

I think the more common term used in error messages is "system catalog 
table"

====

postgres=# create table foo (id int4 primary key);
CREATE TABLE
postgres=# create rule insert_foo as on insert to foo do also insert 
into bar values (-1);
CREATE RULE
postgres=# insert into foo values (1)  on conflict on foo_pkey do update 
set id=1;
ERROR:  INSERT with ON CONFLICT clause may not target relation with 
INSERT or UPDATE rules

That's not strictly true. A "DO INSTEAD NOTHING" rule doesn't prevent ON 
CONFLICT from working, for example. Not sure how to phrase that into the 
message without making it too long.

"may" is again not very appropriate here. Should be "cannot". And the 
term "target relation" might not be very clear to an average user.

====

Phew, that was quite a list..

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-04 14:16:42 -0700, Peter Geoghegan wrote:
> On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> > The ability to specify a constraint by name hasn't been implemented, but
> > that would read quite naturally as:
> >
> > INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ...
> 
> For the record, I have made this change on Github

Theoretically this changes the pictures for FDWs, right? Right now
there's
+    <para>
+     <command>INSERT</> with an <literal>ON CONFLICT</> clause is not
+     supported with a unique index inference specification, since a
+     conflict arbitrating unique index cannot meaningfully be inferred
+     on a foreign table (this implies that <literal>ON CONFLICT DO
+     UPDATE</> is never supported, since the specification is
+     mandatory there).
+    </para>
but theoretically the constraint name could be meaningful on the other
side...

I don't think this is anyting for 9.5, but it might be interesting for
later.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, May 5, 2015 at 9:36 AM, Andres Freund <andres@anarazel.de> wrote:
> Theoretically this changes the pictures for FDWs, right? Right now
> there's
> +    <para>
> +     <command>INSERT</> with an <literal>ON CONFLICT</> clause is not
> +     supported with a unique index inference specification, since a
> +     conflict arbitrating unique index cannot meaningfully be inferred
> +     on a foreign table (this implies that <literal>ON CONFLICT DO
> +     UPDATE</> is never supported, since the specification is
> +     mandatory there).
> +    </para>
> but theoretically the constraint name could be meaningful on the other
> side...

Well, the inference clause could be too -- in that sense, the
constraint name isn't special at all. But you need to invent a way of
making the optimizer infer an index on the foreign side (and even with
a named constraint, we go from constraint Oid in the parser to
pg_index Oid in the optimizer, so it's a similar process to regular
inference). Of course, teaching the optimizer about foreign indexes is
a whole new infrastructure.

Note that this really is the explanation for why postgres_fdw only has
limited support. Sure, I haven't added deparsing logic for ON CONFLICT
UPDATE, but it would be pretty easy to do so, and that isn't the
blocker at all.

> I don't think this is anyting for 9.5, but it might be interesting for
> later.

Sure.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-05 15:27:09 +0300, Heikki Linnakangas wrote:
> I'm a bit late to the party as I haven't paid much attention to the syntax
> before, but let me give some comments on this "arbiter index inference"
> thingie.
> 
> 
> To recap, there are three variants:
> 
> A. INSERT ... ON CONFLICT DO NOTHING
> 
> No arbiter is specified. This means that a conflict on any unique or
> exclusion constraint is not allowed (and will do nothing instead). This
> variant is only accepted for DO NOTHING.
> 
> B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE
> 
> In this variant, you explicitly specify the constraint by name.

I do think it's a bit sad to not be able to specify unique indexes that
aren't constraints. So I'd like to have a corresponding ON INDEX - which
would be trivial.

> C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO
> NOTHING/UPDATE
> 
> This specifies an index (or indexes, in the corner case that there are
> several identical ones), by listing the columns/expressions and the
> predicate for a partial index. The list of columns and WHERE match the
> syntax for CREATE INDEX.
> 
> 
> That's pretty good overall. A few questions:
> 
> 1. Why is the variant without specifying an index or constraint not allowed
> with DO UPDATE? I agree it might not make much sense, but then again, it
> might. If we're afraid that it's too unsafe to be the "default" if you don't
> specify any constraint, how about allowing it with a more verbose "ON
> CONFLICT ON ANY CONSTRAINT" syntax?

I think that'd be useful. Peter seems to be against it on pureness
grounds when we argued against it before, but I know that I'd wished for
it before.

> 2. Why can't you specify multiple constraints, even though we implicitly
> allow "any" with the first variant?

Yea.

> Finally, a couple of suggestions. It would be pretty handy to allow:
> 
> INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE

Not sure if that really has that big of a use case, but it'd also be
simple.

> Also, I wonder if we should change the B syntax to be:
> 
> INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE

Oh yes.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Tue, May 5, 2015 at 5:27 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> To recap, there are three variants:
>
> A. INSERT ... ON CONFLICT DO NOTHING
>
> No arbiter is specified. This means that a conflict on any unique or
> exclusion constraint is not allowed (and will do nothing instead). This
> variant is only accepted for DO NOTHING.
>
> B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE
>
> In this variant, you explicitly specify the constraint by name.
>
> C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO
> NOTHING/UPDATE
>
> This specifies an index (or indexes, in the corner case that there are
> several identical ones), by listing the columns/expressions and the
> predicate for a partial index. The list of columns and WHERE match the
> syntax for CREATE INDEX.

I would just say that there are two variants, only one of which
mandates the inference clause. But I'm nitpicking.

> That's pretty good overall. A few questions:

Thanks. I'm glad that we are now able to cover really any conceivable
use-case, while playing nice with every existing feature (now
updatable views are supported with ON CONFLICT DO UPDATE -- and we're
also going to be able to suppor subqueries in the UPDATE). We've been
incredibly thorough.

> 1. Why is the variant without specifying an index or constraint not allowed
> with DO UPDATE? I agree it might not make much sense, but then again, it
> might. If we're afraid that it's too unsafe to be the "default" if you don't
> specify any constraint, how about allowing it with a more verbose "ON
> CONFLICT ON ANY CONSTRAINT" syntax?

I think it's dangerous. It's basically wrong headed to omit any
constraint for DO UPDATE. I put a lot of effort into covering every
possible case with the inference clause, and I think it's pretty cool
that we have something that's so flexible. I don't feel bad about
forcing users to be explicit about what they want, because the only
conceivable downside is that they'll have to do a little extra typing
(if even that - it's probably going to be ORM-generated more often
than not). The upside -- not having their query break unexpectedly one
day, when a new constraint is added -- is huge.

> 2. Why can't you specify multiple constraints, even though we implicitly
> allow "any" with the first variant?

It's just an escape hatch. I don't want to encourage its over use, and
I want to keep the grammar simple.

> Finally, a couple of suggestions. It would be pretty handy to allow:
>
> INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE

Is that really likely to be less verbose than naming the attributes directly?

> INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE
>
> That would allow the syntax can be expanded in the future to specify
> conflicts on other kind of things. The "ON PRIMARY KEY" syntax should be
> unambiguous with out, because PRIMARY is a reserved keyword, but for
> example, we might want to add "ON UNIQUE INDEX <index name>" later.

Okay, we can do that.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote:
>> In this variant, you explicitly specify the constraint by name.
>
> I do think it's a bit sad to not be able to specify unique indexes that
> aren't constraints. So I'd like to have a corresponding ON INDEX - which
> would be trivial.

Then what's the point of having ON CONSTRAINT? The point of it working
that way was we're not exposing the "implementation detail" of the
index. While I happen to think that that's a distinction without a
difference anyway, that certainly was the idea.

I would care about the fact that you can't name a unique index with no
constraint if there wasn't already a way of doing that with inference
(I'm thinking in particular of partial indexes here, which never have
constraints). But there is. So what's the problem?
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-06 13:05:16 -0700, Peter Geoghegan wrote:
> On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote:
> >> In this variant, you explicitly specify the constraint by name.
> >
> > I do think it's a bit sad to not be able to specify unique indexes that
> > aren't constraints. So I'd like to have a corresponding ON INDEX - which
> > would be trivial.
> 
> Then what's the point of having ON CONSTRAINT?

That it supports exclusion constraints?

> I would care about the fact that you can't name a unique index with no
> constraint if there wasn't already a way of doing that with inference
> (I'm thinking in particular of partial indexes here, which never have
> constraints). But there is. So what's the problem?

Personally I think a complex expression index is something many people
will not want to specify every time. And since partial/expression
indexes can't even have constraints...

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, May 6, 2015 at 1:22 PM, Andres Freund <andres@anarazel.de> wrote:
> That it supports exclusion constraints?

But so does just naming the index. I don't think it's significant that
exclusion operators are in pg_constraint -- you could just as easily
name the index, since that's all you ultimately end up with anyway.

>> I would care about the fact that you can't name a unique index with no
>> constraint if there wasn't already a way of doing that with inference
>> (I'm thinking in particular of partial indexes here, which never have
>> constraints). But there is. So what's the problem?
>
> Personally I think a complex expression index is something many people
> will not want to specify every time. And since partial/expression
> indexes can't even have constraints...

The downsides seem severe.  A CREATE INDEX CONCURRENTLY just broke
your statement, because you didn't account for the new, equivalent
unique index during inference, and now you have to drop the old index
and break application code. Is that really worth introducing just to
prevent app devs from writing the inference specification? The
specification explicitly documents their intent, which seems like a
good thing.

Robert really disliked the idea of even naming the constraint, let
alone the index. I'm trying to balance things, here.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
Andres pointed out on IM that the TARGET alias is a bit crummy. In 
particular, adding an ON CONFLICT DO UPDATE can make a RETURNING clause 
invalid, because we change the alias of the target rel:

create table foo (id int4 primary key, t text);

This works:

postgres=# insert into foo (id, t) values (1, 'x') returning foo.t; t
--- x
(1 row)

INSERT 0 1

Same statement with ON CONFLICT DO UPDATE fails:

postgres=# insert into foo (id, t) values (1, 'x') on conflict (id) do 
update set t = 'x' returning foo.t;
ERROR:  invalid reference to FROM-clause entry for table "foo"
LINE 1: ...'x') on conflict (id) do update set t = 'x' returning foo.t;
               ^
 
HINT:  Perhaps you meant to reference the table alias "target".

I'll see about fixing that. It's not just a matter of creating another 
alias for the same rel, I'm afraid: "foo.t" is supposed to refer to the 
tuple that we attempted to insert, like it does without the ON CONFLICT.

But actually, I don't much like the "target" alias in the first place. 
We never really completed this discussion, everyone just got tired:

On 04/29/2015 10:13 PM, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias
>>> TARGET.*). Those seem fine to me as well.
>>
>> There seem to be a few votes for NEW and OLD.  That's what I proposed
>> originally, and (surprise, surprise) I still like that better too.
>
> I was promoting NEW/OLD, until I realized that we'd end up having a
> problem in trigger functions because NEW/OLD are already defined there,
> unless you have a suggestion for how to improve on that?

Reading through this sub-thread, these spellings have been proposed:

1. TARGET and EXCLUDED

2. NEW and EXISTING

3. NEW and OLD

4. PROPOSED and EXISTING

5. CONFLICTING and EXISTING

Did I miss any? Now, let me opine on these.

EXCLUDED seems fine to me. I don't see us using that term elsewhere, and 
it makes me think of exclusion constraints, but nevertheless I think 
it's pretty easy remember what it means. TARGET, however, is totally 
inscrutable. Peter argued earlier that:

> TARGET is also very descriptive, because it situationally describes
> either the existing tuple actually present in the table, or (from a
> RETURNING clause) the final tuple present in the table post-UPDATE.
> We use the term "target" for that pervasively (in the docs and in the
> code).

but I find that totally unconvincing. It's clear that TARGET refers to 
the table being upserted, but it's totally unclear on *which* version of 
the tuple it refers to.

NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to 
the version after the UPDATE, and OLD to the version before. However, 
there's the serious problem that in a trigger function, OLD/NEW are 
already in use. How bad is that? At least in PL/pgSQL you can work 
around it by aliasing the variables, but it's a bit inconvenient. How 
often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger?

I don't have much to say about the rest. PROPOSED, EXISTING, 
CONFLICTING, they're all fairly descriptive, but long.

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-06 13:37:07 -0700, Peter Geoghegan wrote:
> On Wed, May 6, 2015 at 1:22 PM, Andres Freund <andres@anarazel.de> wrote:
> > That it supports exclusion constraints?
> 
> But so does just naming the index. I don't think it's significant that
> exclusion operators are in pg_constraint -- you could just as easily
> name the index, since that's all you ultimately end up with anyway.

The index name for constraints is generated and not trivially safely
guessable for a user.

> >> I would care about the fact that you can't name a unique index with no
> >> constraint if there wasn't already a way of doing that with inference
> >> (I'm thinking in particular of partial indexes here, which never have
> >> constraints). But there is. So what's the problem?
> >
> > Personally I think a complex expression index is something many people
> > will not want to specify every time. And since partial/expression
> > indexes can't even have constraints...
> 
> The downsides seem severe.  A CREATE INDEX CONCURRENTLY just broke
> your statement, because you didn't account for the new, equivalent
> unique index during inference, and now you have to drop the old index
> and break application code. Is that really worth introducing just to
> prevent app devs from writing the inference specification? The
> specification explicitly documents their intent, which seems like a
> good thing.

I don't find that all that severe. It might just as well be the case
that the new unique constraint isn't intended to be handled by ON
CONFLICT. And having a inference specification that's longer than the
rest of the statement surely isn't helpful.

> Robert really disliked the idea of even naming the constraint, let
> alone the index. I'm trying to balance things, here.

I fail to see what doing something halfhearted helps.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, May 6, 2015 at 1:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> TARGET is also very descriptive, because it situationally describes
>> either the existing tuple actually present in the table, or (from a
>> RETURNING clause) the final tuple present in the table post-UPDATE.
>> We use the term "target" for that pervasively (in the docs and in the
>> code).
>
>
> but I find that totally unconvincing. It's clear that TARGET refers to the
> table being upserted, but it's totally unclear on *which* version of the
> tuple it refers to.

Then we're simply talking about 2 different things. My understanding
is that it *is* the relation. And like UPDATE's RETURNING, it will be
the same relation/alias but a different tuple here. Andres said this
was a mutating tuple or something like that, and I suppose it is. But
Vars are variables.

Now, Andres (and now you) want to change it so that the TARGET alias
becomes magical and expression-like, so that it really does refer to a
tuple and not a relation (and so is closer to EXCLUDED.*). And you
seem pretty set on that. That being the case, clearly TARGET is
unsuitable for the reasons you state.

I suppose that it doesn't much matter, but that's how I understood the
situation all along. So I can see why you don't like "TARGET" in light
of that. I would vote for EXISTING as an alternative, given that it's
pretty clear that what is now TARGET.* will become a magic
alias/expression thing. EXISTING is the EXISTING tuple, which goes
well with EXCLUDED.
-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:
> I'll see about fixing that. It's not just a matter of creating another alias
> for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that
> we attempted to insert, like it does without the ON CONFLICT.

I'm not sure what you mean here?

> But actually, I don't much like the "target" alias in the first place. We
> never really completed this discussion, everyone just got tired:

Right. But that doesn't affect the "it's not just a matter of ..." bit
above, right?

> Reading through this sub-thread, these spellings have been proposed:
> 
> 1. TARGET and EXCLUDED
> 
> 2. NEW and EXISTING
> 
> 3. NEW and OLD
> 
> 4. PROPOSED and EXISTING
> 
> 5. CONFLICTING and EXISTING
> 
> Did I miss any? Now, let me opine on these.

How about
6. The tablename and EXCLUDED? Possibility with the ability to specify  an AS for INSERT INTO foo AS whatever?

From an implementation pov that'd be simple ;)

> NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the
> version after the UPDATE, and OLD to the version before. However, there's
> the serious problem that in a trigger function, OLD/NEW are already in use.
> How bad is that? At least in PL/pgSQL you can work around it by aliasing the
> variables, but it's a bit inconvenient. How often would INSERT .. ON
> CONFLICT DO UPDATE be used in a trigger?

I personally think it's a killer. It'll be very annoying to understand
mistaken usage of NEW/OLD in that case.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Wed, May 6, 2015 at 2:01 PM, Andres Freund <andres@anarazel.de> wrote:
> How about
> 6. The tablename and EXCLUDED? Possibility with the ability to specify
>    an AS for INSERT INTO foo AS whatever?
>
> From an implementation pov that'd be simple ;)

That's what I wanted to do when I realized what Andres wanted to do
with the TARGET alias. Clearly that would compel us to actually make
the RETURNING clause buy into this alias, just as with a regular
UPDATE. And not having the alias on the target also be magical seems
like a good thing. Nothing can be broken by this scheme. No?

>> NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the
>> version after the UPDATE, and OLD to the version before. However, there's
>> the serious problem that in a trigger function, OLD/NEW are already in use.
>> How bad is that? At least in PL/pgSQL you can work around it by aliasing the
>> variables, but it's a bit inconvenient. How often would INSERT .. ON
>> CONFLICT DO UPDATE be used in a trigger?
>
> I personally think it's a killer. It'll be very annoying to understand
> mistaken usage of NEW/OLD in that case.

+1

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/06/2015 11:05 PM, Peter Geoghegan wrote:
> On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote:
>>> In this variant, you explicitly specify the constraint by name.
>>
>> I do think it's a bit sad to not be able to specify unique indexes that
>> aren't constraints. So I'd like to have a corresponding ON INDEX - which
>> would be trivial.
>
> Then what's the point of having ON CONSTRAINT? The point of it working
> that way was we're not exposing the "implementation detail" of the
> index. While I happen to think that that's a distinction without a
> difference anyway, that certainly was the idea.

Right, that's the idea. Indexes are just an implementation detail - 
conceivably you could have a constraint that's backed by some other 
mechanism. You should not embed implementation details like index names 
in your queries.

Unfortunately you can't create a "partial constraint" - you'll have to 
create a partial index. I wish we would fix that directly, by allowing 
partial unique constraints.

That said, I wouldn't necessarily be opposed to also having the syntax 
to name an index directly, as long as we had some notices in the docs to 
tell people to avoid it.

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote:
> Right, that's the idea. Indexes are just an implementation detail -

I think that's a distinction just about no user out there cares about.

> Unfortunately you can't create a "partial constraint" - you'll have to
> create a partial index. I wish we would fix that directly, by allowing
> partial unique constraints.

It's not just partial ones, it's also expression ones, right?

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/07/2015 12:18 AM, Andres Freund wrote:
> On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote:
>> Right, that's the idea. Indexes are just an implementation detail -
>
> I think that's a distinction just about no user out there cares about.
>
>> Unfortunately you can't create a "partial constraint" - you'll have to
>> create a partial index. I wish we would fix that directly, by allowing
>> partial unique constraints.
>
> It's not just partial ones, it's also expression ones, right?

True.

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/07/2015 12:01 AM, Andres Freund wrote:
> On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:
>> I'll see about fixing that. It's not just a matter of creating another alias
>> for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that
>> we attempted to insert, like it does without the ON CONFLICT.
>
> I'm not sure what you mean here?

Sorry, forget about that. I was confused and mixed up EXCLUDED and 
TARGET. Looks like they really aren't very good names :-).

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Geoff Winkless
Дата:
On 6 May 2015 at 22:30, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/07/2015 12:01 AM, Andres Freund wrote:
On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:
I'll see about fixing that. It's not just a matter of creating another alias
for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that
we attempted to insert, like it does without the ON CONFLICT.

I'm not sure what you mean here?

Sorry, forget about that. I was confused and mixed up EXCLUDED and TARGET. Looks like they really aren't very good names :-).

Could
INSERT.column 
​and ​
CONFLICT.column work? 

So INSERT is the row that you were inserting, and CONFLICT is the row with which it conflicted?
Geoff

Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/07/2015 12:01 AM, Andres Freund wrote:
> How about
> 6. The tablename and EXCLUDED? Possibility with the ability to specify
>     an AS for INSERT INTO foo AS whatever?
>
>  From an implementation pov that'd be simple ;)

I did this, because as you say it's simple to implement, and it resolves 
the problem with RETURNING.

BTW, it's worth noting that the <tablename>.col (or TARGET.col before) 
means different things in the DO UPDATE clause, and in RETURNING. 
Consider this example:

postgres=# create table foo (id int4 primary key, t text);
CREATE TABLE
postgres=# insert into foo values (1, 'original');
INSERT 0 1
postgres=# insert into foo values (1, 'inserted') on conflict (id) do 
update set t = excluded.t || foo.t returning foo.t;        t
------------------ insertedoriginal
(1 row)


In the DO UPDATE, foo.t was 'original', but in the RETURNING, it was 
'insertedoriginal'. That's what I was thinking yesterday, when I said 
that it's not straightforward to just replace "target" with 
"<tablename>", but got confused. This isn't new, however; it works the 
same in a normal UPDATE RETURNING.

- Heikki



Re: INSERT ... ON CONFLICT syntax issues

От
Heikki Linnakangas
Дата:
On 05/07/2015 12:01 AM, Andres Freund wrote:
> 6. The tablename and EXCLUDED? Possibility with the ability to specify
>     an AS for INSERT INTO foo AS whatever?

If we don't allow "AS whatever", and you create a table called 
"excluded", you're stuck with the ambiguity in the DO UPDATE statement 
as you can't alias either one. So we have to add support for "INSERT 
INTO foo AS whatever", if we go with <tablename> and EXCLUDED.

Does anyone see a problem with "INSERT INTO foo AS whatever"? It seems 
pretty straightforward to implement.

If we don't use <tablename> and go back to TARGET/EXCLUDED or whatever 
we call them, they must only be visible to the UPDATE statement, not 
RETURNING. Otherwise we're back to square one.

- Heikki




Re: INSERT ... ON CONFLICT syntax issues

От
Andres Freund
Дата:
On 2015-05-07 16:15:18 +0300, Heikki Linnakangas wrote:
> On 05/07/2015 12:01 AM, Andres Freund wrote:
> >6. The tablename and EXCLUDED? Possibility with the ability to specify
> >    an AS for INSERT INTO foo AS whatever?
> 
> If we don't allow "AS whatever", and you create a table called "excluded",
> you're stuck with the ambiguity in the DO UPDATE statement as you can't
> alias either one. So we have to add support for "INSERT INTO foo AS
> whatever", if we go with <tablename> and EXCLUDED.
> 
> Does anyone see a problem with "INSERT INTO foo AS whatever"? It seems
> pretty straightforward to implement.

I don't see a problem at all, with one exception: If we want the AS to
be optional like in a bunch of other places, we have to either promote
VALUES to a reserved keyword, only accept unreserved keywords, or play
precedence games. I think it'd be perfectly fine to not make AS
optional.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT syntax issues

От
Peter Geoghegan
Дата:
On Thu, May 7, 2015 at 10:37 AM, Andres Freund <andres@anarazel.de> wrote:
> I don't see a problem at all, with one exception: If we want the AS to
> be optional like in a bunch of other places, we have to either promote
> VALUES to a reserved keyword, only accept unreserved keywords, or play
> precedence games. I think it'd be perfectly fine to not make AS
> optional.

+1. In any case, we can't very well not support having any alias on the target.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT error messages

От
Peter Geoghegan
Дата:
On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I've read through all the error messages in the patch. Some comments:

I'll work through this feedback.

> postgres=# create table foo (id int4);
> CREATE TABLE
> postgres=# create unique index foo_y on foo (id) where id > 0;
> CREATE INDEX
> postgres=# insert into foo values (-1)  on conflict (id) where id > 0 do
> nothing;
> ERROR:  inferred arbiter partial unique index's predicate does not cover
> tuple proposed for insertion
> DETAIL:  ON CONFLICT inference clause implies that the tuple proposed for
> insertion must be covered by the predicate of partial index "foo_y".
>
> I'm surprised. If the inserted value doesn't match the WHERE clause of the
> constraint, there is clearly no conflict, so I would assume the above to
> work without error.

I'm not particularly attached to that behavior. I could revert it.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT error messages

От
Peter Geoghegan
Дата:
On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Why not? I would understand if the specified arbiter constraint is deferred,
> but why does it matter if one of the other constraints is?

This has been fixed since, BTW. We now support relations that happen
to have deferred constraints. The constraint indexes cannot be used as
arbiters, but that makes perfect sense.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT error messages

От
Andres Freund
Дата:
On 2015-05-07 11:16:12 -0700, Peter Geoghegan wrote:
> On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> > postgres=# create table foo (id int4);
> > CREATE TABLE
> > postgres=# create unique index foo_y on foo (id) where id > 0;
> > CREATE INDEX
> > postgres=# insert into foo values (-1)  on conflict (id) where id > 0 do
> > nothing;
> > ERROR:  inferred arbiter partial unique index's predicate does not cover
> > tuple proposed for insertion
> > DETAIL:  ON CONFLICT inference clause implies that the tuple proposed for
> > insertion must be covered by the predicate of partial index "foo_y".
> >
> > I'm surprised. If the inserted value doesn't match the WHERE clause of the
> > constraint, there is clearly no conflict, so I would assume the above to
> > work without error.
> 
> I'm not particularly attached to that behavior. I could revert it.

Hm. I don't really see a point in allowing it - it seems more likely to
be a mistake by the user, expecting that the insertion now works
conflict free.  But I don't really care much.

Greetings,

Andres Freund



Re: INSERT ... ON CONFLICT error messages

От
Peter Geoghegan
Дата:
On Thu, May 7, 2015 at 11:32 AM, Andres Freund <andres@anarazel.de> wrote:
> Hm. I don't really see a point in allowing it - it seems more likely to
> be a mistake by the user, expecting that the insertion now works
> conflict free.  But I don't really care much.

The only downside I see is that it might not work in preventing the
statement from proceeding. In other words, you might have a partial
unique index style inference clause (i.e. with a predicate) that
infers a non-partial index, because everything else matches, and no
predicate (on the index) satisfies the inference predicate. So it
comes down to the actual definition of indexes, as opposed to the
statement that inferred those indexes, which isn't quite the same
thing.

It is hardly really a defect at all, but it is something I have
considered as a little confusing. However, I doubt that many people
will ever find themselves looking for an explanation for this.

-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT error messages

От
Peter Geoghegan
Дата:
On Thu, May 7, 2015 at 2:14 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The only downside I see is that it might not work in preventing the
> statement from proceeding. In other words, you might have a partial
> unique index style inference clause (i.e. with a predicate) that
> infers a non-partial index, because everything else matches, and no
> predicate (on the index) satisfies the inference predicate. So it
> comes down to the actual definition of indexes, as opposed to the
> statement that inferred those indexes, which isn't quite the same
> thing.

Hmm. I think it might be inconsistent with our position on NULL values
with this feature - which is that insertion will always proceed - to
deny insertion from proceeding here. On reflection, it seems like a
bit of a POLA violation. So I'm going to remove the error.


-- 
Peter Geoghegan



Re: INSERT ... ON CONFLICT syntax issues

От
Geoff Winkless
Дата:
On 7 May 2015 at 18:37, Andres Freund <andres@anarazel.de> wrote:
I don't see a problem at all, with one exception: If we want the AS to
be optional like in a bunch of other places, we have to either promote
VALUES to a reserved keyword, only accept unreserved keywords, or play
precedence games. I think it'd be perfectly fine to not make AS
optional.

Although I've always used "AS"
​in all contexts ​
because I think the language is
​horribly ​
unclear without it, it seems obtuse to
​allow its absence 
in the SQL-conforming parts of the language and not
​elsewhere
.
​Is anyone really using VALUES as a non-keyword? It's reserved in all the SQL standards, which seems like storing up trouble for anyone using it otherwise.

Geoff