Обсуждение: INSERT ... ON CONFLICT syntax issues
I'm separating this discussion out of the thread because I think this needs wider input. On 2015-04-24 19:21:37 -0700, Peter Geoghegan wrote: > I've *provisionally* pushed code that goes back to the old way, > Andres: https://github.com/petergeoghegan/postgres/commit/2a5d80b27d2c5832ad26dde4651c64dd2004f401 > > Perhaps this is the least worst way, after all. I still think it's a bad idea. To recap, the old and current way is: INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE My problem with the WHERE being inside the parens in the above is that it's a) different from CREATE INDEX b) unclear whether the WHERE belongs to colb or the whole index expression. The equivalent for aggregates, which I bet isgoing to be used less often, caused a fair amount of confusing. That's why I wanted the WHERE outside the (), which requires either adding DO between the index inference clause, and the action, to avoid ambiguities in the grammar. But I'm generally having some doubts about the syntax. Right now it's INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE. A couple things: a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What if we, at some later point, also want to handleother kind of violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... b) For me there's a WITH before the index inference clause missing, to have it read in 'SQL' style. c) Right now the UPDATE can refer to pseudo relations 'TARGET' and 'EXCLUDED'. I think especially the latter doesn't fitanymore at all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD? So I guess it boils down to that I think we should switch the syntax to be: INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE} Greetings, Andres Freund
On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote: > My problem with the WHERE being inside the parens in the above is that > it's > a) different from CREATE INDEX I don't think that that's an important goal. > b) unclear whether the WHERE belongs to colb or the whole index > expression. The equivalent for aggregates, which I bet is going to be > used less often, caused a fair amount of confusing. I don't see those two situations as being comparable. The inference specification does not accept aggregates. It seems obvious to me that the predicate only ever applies to the entire table. And it's obvious that it's part of the inference specification because it appears in parentheses with everything else - otherwise, *users* might find this phantom WHERE clause ambiguous/confusing. > But I'm generally having some doubts about the syntax. > > Right now it's > INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE. > > A couple things: > > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What > if we, at some later point, also want to handle other kind of > violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... I think that naming unique violations alone would be wrong (not to mention ludicrously verbose). Heikki and I both feel that the CONFLICT keyword captures the fact that this could be a dup violation, or an exclusion violation. The syntax has been like this for some time, and hasn't been a point of contention for a long time, so I thought this was settled. Note that the syntax is quite similar to the SQLite syntax of the same feature, that has ON CONFLICT IGNORE (it also has ON CONFLICT REPLACE, but not ON CONFLICT UPDATE). > b) For me there's a WITH before the index inference clause missing, to > have it read in 'SQL' style. I'm not seeing it. BTW, Robert was the one who initially proposed that the unique index inference clause follow this exact style (albeit before it accepted a WHERE clause to infer partial indexes, which was only added a couple of months ago). > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and > 'EXCLUDED'. I think especially the latter doesn't fit anymore at > all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD? NEW and OLD are terribly misleading, since surely the NEW tuple is the one actually appended to the relation by the UPDATE, and the OLD one is the existing one (not the excluded one). Plus they have all that intellectual baggage from rules. CONFLICTING, as Greg Stark pointed out many months ago, is something that applies to both tuples that can be referenced, which is why I *stopped* using it months ago. They conflict with *each other*. Any conflict must pertain to both. Dictionary.com defines "exclude" as: """ verb (used with object), excluded, excluding. 1. to shut or keep out; prevent the entrance of. 2. to shut out from consideration, privilege, etc.: Employees and their relatives were excluded from participation in the contest. 3. to expel and keep out; thrust out; eject: He was excluded from the club for infractions of the rules. """ Seems pretty descriptive of the situation to me - I actually put a lot of thought into this. Additionally, the word is widely understood by non-native speakers. TARGET is also very descriptive, because it situationally describes either the existing tuple actually present in the table, or (from a RETURNING clause) the final tuple present in the table post-UPDATE. We use the term "target" for that pervasively (in the docs and in the code). > So I guess it boils down to that I think we should switch the syntax to > be: > > INSERT ... ON UNIQUE VIOLATION [WITH (cola, colb) WHERE ...] DO {NOTHING|UPDATE} Beauty is in the eye of the beholder and all, but that seems pretty ugly to me. Honestly, I think we should just accept that the predicate appears in the parentheses on the odd occasion that it appears at all - partial unique indexes are not all that common. -- Peter Geoghegan
On 2015-04-25 11:05:49 -0700, Peter Geoghegan wrote: > On Sat, Apr 25, 2015 at 2:01 AM, Andres Freund <andres@anarazel.de> wrote: > > My problem with the WHERE being inside the parens in the above is that > > it's > > a) different from CREATE INDEX > > I don't think that that's an important goal. Given that it's used to 'match' to indexes, I can't agree. > > b) unclear whether the WHERE belongs to colb or the whole index > > expression. The equivalent for aggregates, which I bet is going to be > > used less often, caused a fair amount of confusing. > > I don't see those two situations as being comparable. The inference > specification does not accept aggregates. Huh? It's pretty much entirely besides the point that inference doesn't accept aggregates. The point is that ORDER BY for aggregates has confused users because it's inside the parens. > > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What > > if we, at some later point, also want to handle other kind of > > violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... > > I think that naming unique violations alone would be wrong (not to > mention ludicrously verbose). Why? > The syntax has been like this for some time, and > hasn't been a point of contention for a long time, so I thought this > was settled. I really don't care if it's been that for a long while. This is a not yet commited feature. > > b) For me there's a WITH before the index inference clause missing, to > > have it read in 'SQL' style. > > I'm not seeing it. BTW, Robert was the one who initially proposed that > the unique index inference clause follow this exact style (albeit > before it accepted a WHERE clause to infer partial indexes, which was > only added a couple of months ago). So? I guess I can live with that uglyness; but I'd like somebody else to chime in. > > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and > > 'EXCLUDED'. I think especially the latter doesn't fit anymore at > > all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD? > > NEW and OLD are terribly misleading, since surely the NEW tuple is the > one actually appended to the relation by the UPDATE, and the OLD one > is the existing one (not the excluded one). Plus they have all that > intellectual baggage from rules. What 'intellectual baggage' would that be? That they're already known to have been used in another place? I don't see the problem. How about EXISTING and NEW? > """ > verb (used with object), excluded, excluding. > 1. > to shut or keep out; prevent the entrance of. > 2. > to shut out from consideration, privilege, etc.: > Employees and their relatives were excluded from participation in the contest. > 3. > to expel and keep out; thrust out; eject: > He was excluded from the club for infractions of the rules. > """ > > Seems pretty descriptive of the situation to me - I actually put a lot > of thought into this. Additionally, the word is widely understood by > non-native speakers. TARGET is also very descriptive, because it > situationally describes either the existing tuple actually present in > the table, or (from a RETURNING clause) the final tuple present in the > table post-UPDATE. We use the term "target" for that pervasively (in > the docs and in the code). Sorry, I don't buy either argument. EXISTING and NEW would surely at least as widely understood than EXCLUDE and TARGET. The latter does just about no sense to me; especially from a user POV. I don't think the existing usage of the term has much to do what it's used for here. That it has 'morphing' characteristics imo just makes it worse, rather than better. Besides being confusing that it has different meanings, it's far from inconceivable that somebody wants to return values from the preexisting, new, and merged rows. Greetings, Andres Freund
On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote: >> > b) unclear whether the WHERE belongs to colb or the whole index >> > expression. The equivalent for aggregates, which I bet is going to be >> > used less often, caused a fair amount of confusing. >> >> I don't see those two situations as being comparable. The inference >> specification does not accept aggregates. > > Huh? It's pretty much entirely besides the point that inference doesn't > accept aggregates. The point is that ORDER BY for aggregates has > confused users because it's inside the parens. Would any alternative cause less confusion? That's the real issue. And I'm unconvinced that your alternative would. >> > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What >> > if we, at some later point, also want to handle other kind of >> > violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... >> >> I think that naming unique violations alone would be wrong (not to >> mention ludicrously verbose). > > Why? Because, as I said, it might not be a unique violation at all. It could be an exclusion violation. >> > b) For me there's a WITH before the index inference clause missing, to >> > have it read in 'SQL' style. >> >> I'm not seeing it. BTW, Robert was the one who initially proposed that >> the unique index inference clause follow this exact style (albeit >> before it accepted a WHERE clause to infer partial indexes, which was >> only added a couple of months ago). > > So? So, his opinion matters if it comes down to a vote. The inference specification syntax as implemented is exactly what he suggested (plus I've added a predicate). > I guess I can live with that uglyness; but I'd like somebody else to > chime in. Agreed. >> > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and >> > 'EXCLUDED'. I think especially the latter doesn't fit anymore at >> > all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD? >> >> NEW and OLD are terribly misleading, since surely the NEW tuple is the >> one actually appended to the relation by the UPDATE, and the OLD one >> is the existing one (not the excluded one). Plus they have all that >> intellectual baggage from rules. > > What 'intellectual baggage' would that be? That they're already known to > have been used in another place? I don't see the problem. The problem is that they make you think of rules, and they don't describe what's going on at all. >> Seems pretty descriptive of the situation to me - I actually put a lot >> of thought into this. Additionally, the word is widely understood by >> non-native speakers. TARGET is also very descriptive, because it >> situationally describes either the existing tuple actually present in >> the table, or (from a RETURNING clause) the final tuple present in the >> table post-UPDATE. We use the term "target" for that pervasively (in >> the docs and in the code). > > Sorry, I don't buy either argument. EXISTING and NEW would surely at > least as widely understood than EXCLUDE and TARGET. The latter does just > about no sense to me; especially from a user POV. I don't think the > existing usage of the term has much to do what it's used for here. Yes it does. The UPDATE docs refer to the target table in a way intended to distinguish it from any joined-to table (FROM table). It's clear as day. Maybe EXISTING is equally well understood as a word in general, but it's way more ambiguous than EXCLUDED is here. > That > it has 'morphing' characteristics imo just makes it worse, rather than > better. Besides being confusing that it has different meanings, it's far > from inconceivable that somebody wants to return values from the > preexisting, new, and merged rows. This is how RETURNING works from UPDATEs in general. IOW, if you do an UPDATE FROM (which is pretty similar to ON CONFLICT UPDATE, syntax-wise), then you can only refer to the joined table's tuple and the final post-update tuple from within RETURNING. You cannot refer to the pre-UPDATE target tuple there either -- it's *exactly* the same situation. Why should it be any different here? The situational/morphing characteristic of the alias name TARGET is therefore absolutely appropriate, in that it follows UPDATE. To be fair, there is one unrelated slight difference with RETURNING and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the same way that you can reference the joined-FROM tuple within conventional UPDATEs). This is because the pertinent information is likely to be in the target tuple (after all, the DML statement names the proposed-for-insertion tuples itself, directly), but more importantly because projecting both would necessitate *always* qualifying the RETURNING column names to resolve which tuple is intended (UPDATE FROM will seldom be a self-join, but this will always be like a self-join). -- Peter Geoghegan
On Sat, Apr 25, 2015 at 11:50 AM, Peter Geoghegan <pg@heroku.com> wrote: > To be fair, there is one unrelated slight difference with RETURNING > and conventional UPDATEs: You cannot return the EXCLUDED tuple (in the > same way that you can reference the joined-FROM tuple within > conventional UPDATEs). This is because the pertinent information is > likely to be in the target tuple (after all, the DML statement names > the proposed-for-insertion tuples itself, directly), but more > importantly because projecting both would necessitate *always* > qualifying the RETURNING column names to resolve which tuple is > intended (UPDATE FROM will seldom be a self-join, but this will always > be like a self-join). It also makes sense because this is the RETURNING clause of an INSERT, not an UPDATE. So the general INSERT behavior is what is expected. It ought to be irrelevant if tuples were projected by actually inserting or updating. Otherwise, your UPSERT is probably misconceived. -- Peter Geoghegan
On 2015-04-25 11:50:59 -0700, Peter Geoghegan wrote: > On Sat, Apr 25, 2015 at 11:24 AM, Andres Freund <andres@anarazel.de> wrote: > >> > c) Right now the UPDATE can refer to pseudo relations 'TARGET' and > >> > 'EXCLUDED'. I think especially the latter doesn't fit anymore at > >> > all. How about 'CONFLICTING' and 'EXISTING'? Or even NEW and OLD? > >> > >> NEW and OLD are terribly misleading, since surely the NEW tuple is the > >> one actually appended to the relation by the UPDATE, and the OLD one > >> is the existing one (not the excluded one). Plus they have all that > >> intellectual baggage from rules. > > > > What 'intellectual baggage' would that be? That they're already known to > > have been used in another place? I don't see the problem. > > The problem is that they make you think of rules, and they don't > describe what's going on at all. 95% of all users will know NEW/OLD from triggers, not rules. Where NEW is used in a quite comparable way. > >> Seems pretty descriptive of the situation to me - I actually put a lot > >> of thought into this. Additionally, the word is widely understood by > >> non-native speakers. TARGET is also very descriptive, because it > >> situationally describes either the existing tuple actually present in > >> the table, or (from a RETURNING clause) the final tuple present in the > >> table post-UPDATE. We use the term "target" for that pervasively (in > >> the docs and in the code). > > > > Sorry, I don't buy either argument. EXISTING and NEW would surely at > > least as widely understood than EXCLUDE and TARGET. The latter does just > > about no sense to me; especially from a user POV. I don't think the > > existing usage of the term has much to do what it's used for here. > > Yes it does. The UPDATE docs refer to the target table in a way > intended to distinguish it from any joined-to table (FROM table). It's > clear as day. Which means the term is used in a different way for INSERTs and UPDATEs already. To me it sounds like it's a remnant of your earlier syntax proposal for UPSERT. > Maybe EXISTING is equally well understood as a word in general, but > it's way more ambiguous than EXCLUDED is here. What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make absolutely no sense. My suggesting is to have NEW refer to the tuple specified in the INSERT and EXISTING to the, well, pre existing tuple that the conflict is with. > > That > > it has 'morphing' characteristics imo just makes it worse, rather than > > better. Besides being confusing that it has different meanings, it's far > > from inconceivable that somebody wants to return values from the > > preexisting, new, and merged rows. > > This is how RETURNING works from UPDATEs in general. And there's been a patch (which unfortunately died because it's implementation wasn't good), to allow referring to the other versions of the tuple. It has been wished for numerous times. > IOW, if you do an UPDATE FROM (which is pretty similar to ON CONFLICT > UPDATE, syntax-wise), then you can only refer to the joined table's > tuple and the final post-update tuple from within RETURNING. > You cannot refer to the pre-UPDATE target tuple there either -- it's > *exactly* the same situation. Why should it be any different here? The > situational/morphing characteristic of the alias name TARGET is > therefore absolutely appropriate, in that it follows UPDATE. Contrasting > TARGET is also very descriptive, because it > situationally describes either the existing tuple actually present in > the table, or (from a RETURNING clause) the final tuple present in the > table post-UPDATE. We use the term "target" for that pervasively (in > the docs and in the code). the docs say: Since <literal>RETURNING</> is not part of the <command>UPDATE</> auxiliary query, the special <literal>ONCONFLICT UPDATE</> aliases (<varname>TARGET</> and <varname>EXCLUDED</>) may not be referenced; only the rowas it exists after updating (or inserting) is returned. So I don't understand that whole chain of argument. There's no such morphing behaviour, unless I miss something? 2a5d80b27d2c5832ad26dde4651c64dd2004f401: > The problem with this seems to be that it more or less > necessitates making both IGNORE and UPDATE fully reserved keywords in > order to avoid an ambiguity, which we prefer not to do It does not. As mentioned in the thread DO UPDATE/NOTHING work without anything like that. Greetings, Andres Freund
On Sat, Apr 25, 2015 at 12:23 PM, Andres Freund <andres@anarazel.de> wrote: > 95% of all users will know NEW/OLD from triggers, not rules. Where NEW > is used in a quite comparable way. I don't think it's comparable. >> >> Seems pretty descriptive of the situation to me - I actually put a lot >> >> of thought into this. Additionally, the word is widely understood by >> >> non-native speakers. TARGET is also very descriptive, because it >> >> situationally describes either the existing tuple actually present in >> >> the table, or (from a RETURNING clause) the final tuple present in the >> >> table post-UPDATE. We use the term "target" for that pervasively (in >> >> the docs and in the code). >> > >> > Sorry, I don't buy either argument. EXISTING and NEW would surely at >> > least as widely understood than EXCLUDE and TARGET. The latter does just >> > about no sense to me; especially from a user POV. I don't think the >> > existing usage of the term has much to do what it's used for here. >> >> Yes it does. The UPDATE docs refer to the target table in a way >> intended to distinguish it from any joined-to table (FROM table). It's >> clear as day. > > Which means the term is used in a different way for INSERTs and UPDATEs > already. No, it isn't. It both cases the table is the one involved in the parse analysis setTargetTable() call. >> Maybe EXISTING is equally well understood as a word in general, but >> it's way more ambiguous than EXCLUDED is here. > > What? I'm not suggesting to replace EXCLUDED by EXISTING - that'd make > absolutely no sense. My suggesting is to have NEW refer to the tuple > specified in the INSERT and EXISTING to the, well, pre existing tuple > that the conflict is with. Okay, but that doesn't change my opinion. >> > That >> > it has 'morphing' characteristics imo just makes it worse, rather than >> > better. Besides being confusing that it has different meanings, it's far >> > from inconceivable that somebody wants to return values from the >> > preexisting, new, and merged rows. >> >> This is how RETURNING works from UPDATEs in general. > > And there's been a patch (which unfortunately died because it's > implementation wasn't good), to allow referring to the other versions of > the tuple. It has been wished for numerous times. Well, if that patch is ever committed, then it won't be hard to get the behavior here too, since it is literally exactly the same code. I don't change anything about it, and that seems to be your problem. > the docs say: > Since > <literal>RETURNING</> is not part of the <command>UPDATE</> > auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases > (<varname>TARGET</> and <varname>EXCLUDED</>) may not be > referenced; only the row as it exists after updating (or > inserting) is returned. > > So I don't understand that whole chain of argument. There's no such > morphing behaviour, unless I miss something? That's a documentation bug (a remnant of an earlier version). Sorry about that. You can reference TARGET from returning. It's directly contradicted by this much earlier statement on the INSERT doc page: "Both aliases can be used in the auxiliary query targetlist and WHERE clause, while the TARGET alias can be used anywhere within the entire statement (e.g., within the RETURNING clause)" I'll go fix that. > 2a5d80b27d2c5832ad26dde4651c64dd2004f401: >> The problem with this seems to be that it more or less >> necessitates making both IGNORE and UPDATE fully reserved keywords in >> order to avoid an ambiguity, which we prefer not to do > > It does not. As mentioned in the thread DO UPDATE/NOTHING work without > anything like that. I just mean that it couldn't work as-was in the repo at that time. This commit message was written before your proposal of 8 hours ago. We're going to have to agree to disagree here. I've given you my opinion. I'm burnt out on this patch, and whatever the path of least resistance is is the path I'll take. Frankly, the only reason that I'm putting up any kind of argument is because I don't think that your proposal is the path of least resistance. -- Peter Geoghegan
On Sat, Apr 25, 2015 at 12:35 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> > That >>> > it has 'morphing' characteristics imo just makes it worse, rather than >>> > better. Besides being confusing that it has different meanings, it's far >>> > from inconceivable that somebody wants to return values from the >>> > preexisting, new, and merged rows. >>> >>> This is how RETURNING works from UPDATEs in general. >> >> And there's been a patch (which unfortunately died because it's >> implementation wasn't good), to allow referring to the other versions of >> the tuple. It has been wished for numerous times. > > Well, if that patch is ever committed, then it won't be hard to get > the behavior here too, since it is literally exactly the same code. I > don't change anything about it, and that seems to be your problem. I withdraw this remark. Even in a world where this patch is committed, it still makes sense for the INSERT returning behavior to not be altered (and to project only TARGET tuples even if they come from the auxiliary UPDATE). The "join" is within the auxiliary UPDATE, not the INSERT, and it should be no more possible to project intermediate tuples (like EXCLUDED.*) from the INSERT's RETURNING than it is to project CTE scan tuples from an INSERT ... RETURNING with a CTE. -- Peter Geoghegan
On 04/25/2015 12:01 PM, Andres Freund wrote: > INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE > > My problem with the WHERE being inside the parens in the above is that > it's > a) different from CREATE INDEX > b) unclear whether the WHERE belongs to colb or the whole index > expression. The equivalent for aggregates, which I bet is going to be > used less often, caused a fair amount of confusing. > > That's why I wanted the WHERE outside the (), which requires either > adding DO between the index inference clause, and the action, to avoid > ambiguities in the grammar. Yeah, having the WHERE outside the parens seems much nicer. What is the ambiguity? > But I'm generally having some doubts about the syntax. > > Right now it's > INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE. > > A couple things: > > a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What > if we, at some later point, also want to handle other kind of > violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... As Peter said, it's also for exclusion constraints. Perhaps "ON CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, though. I think "ON CONFLICT" is fine. > b) For me there's a WITH before the index inference clause missing, to > have it read in 'SQL' style. Agreed. ON would sound more natural than WITH though: INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ... The ability to specify a constraint by name hasn't been implemented, but that would read quite naturally as: INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ... - Heikki
On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >On 04/25/2015 12:01 PM, Andres Freund wrote: >> INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) >UPDATE|IGNORE >> >> My problem with the WHERE being inside the parens in the above is >that >> it's >> a) different from CREATE INDEX >> b) unclear whether the WHERE belongs to colb or the whole index >> expression. The equivalent for aggregates, which I bet is going >to be >> used less often, caused a fair amount of confusing. >> >> That's why I wanted the WHERE outside the (), which requires either >> adding DO between the index inference clause, and the action, to >avoid >> ambiguities in the grammar. > >Yeah, having the WHERE outside the parens seems much nicer. What is the > >ambiguity? With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs to.At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though. >> But I'm generally having some doubts about the syntax. >> >> Right now it's >> INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE. >> >> A couple things: >> >> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. >What >> if we, at some later point, also want to handle other kind of >> violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION >... > >As Peter said, it's also for exclusion constraints. Perhaps "ON >CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, >though. I think "ON CONFLICT" is fine. What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be quite coolto be able to insert the referenced key on a fkey violation for some use cases. >> b) For me there's a WITH before the index inference clause missing, >to >> have it read in 'SQL' style. > >Agreed. ON would sound more natural than WITH though: > >INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ... I chose WITh because of the repeated DO; that's all ;) --- Please excuse brevity and formatting - I am writing this on my mobile phone.
On 26/04/15 12:08, Andres Freund wrote: > On April 26, 2015 11:22:01 AM GMT+02:00, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> On 04/25/2015 12:01 PM, Andres Freund wrote: >>> >>> That's why I wanted the WHERE outside the (), which requires either >>> adding DO between the index inference clause, and the action, to >> avoid >>> ambiguities in the grammar. >> >> Yeah, having the WHERE outside the parens seems much nicer. What is the >> >> ambiguity? > > > With a full keyword in between (like DO), there's none. But without it its ambiguous where a trailing UPDATE belongs to.At least from the point of a LALR grammar. WHERE UPDATE; is legal. I don't see the DO as much of a problem though. > The DO variant with WHERE outside of parenthesis sounds fine to me. Or at least better than the alternatives I've seen or can come up with. >>> A couple things: >>> >>> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. >> What >>> if we, at some later point, also want to handle other kind of >>> violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION >> ... >> >> As Peter said, it's also for exclusion constraints. Perhaps "ON >> CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, >> though. I think "ON CONFLICT" is fine. > > What if we, as at least I have previously wished for, want to allow handling other types of constraints? It'd be quitecool to be able to insert the referenced key on a fkey violation for some use cases. > >>> b) For me there's a WITH before the index inference clause missing, >> to >>> have it read in 'SQL' style. >> >> Agreed. ON would sound more natural than WITH though: >> >> INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ... > > I chose WITh because of the repeated DO; that's all ;) > The ON CONFLICT ON sounds really weird to me. Either ON CONSTRAINT VIOLATION (foo) or ON CONFLICT [WITH] (foo) both seem acceptable. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
* Heikki Linnakangas (hlinnaka@iki.fi) wrote: > On 04/25/2015 12:01 PM, Andres Freund wrote: > >INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE > > > >My problem with the WHERE being inside the parens in the above is that > >it's > >a) different from CREATE INDEX > >b) unclear whether the WHERE belongs to colb or the whole index > > expression. The equivalent for aggregates, which I bet is going to be > > used less often, caused a fair amount of confusing. > > > >That's why I wanted the WHERE outside the (), which requires either > >adding DO between the index inference clause, and the action, to avoid > >ambiguities in the grammar. > > Yeah, having the WHERE outside the parens seems much nicer. What is > the ambiguity? I like having it outside the parens also. > >But I'm generally having some doubts about the syntax. > > > >Right now it's > >INSERT ... ON CONFLICT opt_on_conf_clause UPDATE|IGNORE. > > > >A couple things: > > > >a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What > > if we, at some later point, also want to handle other kind of > > violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ... > > As Peter said, it's also for exclusion constraints. Perhaps "ON > CONSTRAINT VIOLATION"? It doesn't apply to foreign key constraints, > though. I think "ON CONFLICT" is fine. I don't mind using "CONFLICT" here, seems to make sense to me. > >b) For me there's a WITH before the index inference clause missing, to > > have it read in 'SQL' style. > > Agreed. ON would sound more natural than WITH though: > > INSERT INTO mytable ON CONFLICT ON (keycol) UPDATE ... > > The ability to specify a constraint by name hasn't been implemented, > but that would read quite naturally as: > > INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ... I don't particularly like the double-ON in this.. I've not tried, but is the first ON required to be a full keyword? Seems like it probably is, but just to finish the thought I had, what about: INSERT INTO mytable .. IF CONFLICT ON (a,b) WHERE .. THEN UPDATE IF is currently just an unreserved keyword though. We could use FOR though: INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. THEN UPDATE Though that'd probably sound better as: INSERT INTO mytable .. FOR CONFLICT ON (a,b) WHERE .. DO UPDATE Another option is: INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. DO UPDATE Which could also be: INSERT INTO mytable .. WHEN CONFLICT ON (a,b) WHERE .. THEN UPDATE of course.. What's important, in my view, is to keep the simple case simple and so I'm not particularly wedded to any of these approaches, just trying to help with other suggestions. INSERT INTO mytable VALUES ('key1','key2','val1','val2') ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2'; strikes me as a the 99% use-case here that we need to keep sane, and it'd be really nice if we didn't have to include the SET clause and duplicate those values at all.. That could be something we add later though, I don't think it needs to be done now. Thanks! Stephen
On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote: > What's important, in my view, is to keep the simple case simple and so > I'm not particularly wedded to any of these approaches, just trying to > help with other suggestions. > > INSERT INTO mytable VALUES ('key1','key2','val1','val2') > ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2'; > > strikes me as a the 99% use-case here that we need to keep sane, and > it'd be really nice if we didn't have to include the SET clause and > duplicate those values at all.. That could be something we add later > though, I don't think it needs to be done now. You can do that already. That's what the EXCLUDED.* alias that is automatically added is for (the thing that Andres disliked the spelling of - or the other thing). This is legal, for example: INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2') ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo, EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)'; I don't want to accept something that automatically merges the excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT -- Peter Geoghegan
Peter, * Peter Geoghegan (pg@heroku.com) wrote: > On Sun, Apr 26, 2015 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote: > > What's important, in my view, is to keep the simple case simple and so > > I'm not particularly wedded to any of these approaches, just trying to > > help with other suggestions. > > > > INSERT INTO mytable VALUES ('key1','key2','val1','val2') > > ON CONFLICT UPDATE SET val1 = 'val1', val2 = 'val2'; > > > > strikes me as a the 99% use-case here that we need to keep sane, and > > it'd be really nice if we didn't have to include the SET clause and > > duplicate those values at all.. That could be something we add later > > though, I don't think it needs to be done now. > > You can do that already. That's what the EXCLUDED.* alias that is > automatically added is for (the thing that Andres disliked the > spelling of - or the other thing). This is legal, for example: > > INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2') > ON CONFLICT (foo) UPDATE SET (foo, bar, baz, bat) = (EXCLUDED.foo, > EXCLUDED.bar, EXCLUDED.baz, EXCLUDED.bat)'; Yeah, that's not exactly simpler and I don't expect to see it used very often (as in, less than 1%) because of that. > I don't want to accept something that automatically merges the > excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined > here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT Perhaps I'm missing it, but the reasons that I see there appear to be: "It'd be like SELECT *" and "we'd have to decide what to do about the value for unspecified columns". As for the latter- we have to do that *anyway*, no? What happens if you do: INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2') ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz); ? As for the "SELECT *" concern, I fail to see how it's any different from the exact same currently-encouraged usage of INSERT + UPDATE: INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2'); ... catch the exception UPDATE mytable SET baz = 'val1', bat = 'val2' WHERE foo = 'key1' and bar = 'key2'; Clearly there are issues with the above if someone is running around adding columns to tables and PG has to figure out if we should be setting the non-mentioned columns to NULL or to the default for the column, but we're all quite happy to do so and trust that whomever is adding the column has set a sane default and that PG will use it when the column isn't included in either the INSERT or the UPDATE. Note that I wasn't suggesting your "SET (*) = EXLCUDED.*" syntax and if that would expand to something different than what I've outlined above then it would make sense to not include it (... or fix it to act the same, and then it's just a more verbose approach). Further, this is *very* different from how the "SELECT *" concern can cause things to break unexpectedly- new columns end up getting returned which the application is unlikely to be prepared for. That doesn't happen here and so I don't believe it makes any sense to try and compare the two. Happy to discuss, of course, and apologies if I missed some other issue- I was just reading what I found at the link provided. Thanks! Stephen
On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote: >> I don't want to accept something that automatically merges the >> excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined >> here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT > > Perhaps I'm missing it, but the reasons that I see there appear to be: > > "It'd be like SELECT *" and "we'd have to decide what to do about the > value for unspecified columns". As for the latter- we have to do that > *anyway*, no? What happens if you do: > > INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2') > ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz); > > ? It's like any other UPDATE - the values of columns not appearing in the targetlist are unchanged from the original row version now superseded. It doesn't matter that you had some other values in the INSERT. You only get what you ask for. -- Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote: > On Sun, Apr 26, 2015 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote: > >> I don't want to accept something that automatically merges the > >> excluded tuple (e.g., "SET (*) = EXLCUDED.*"), for reasons outlined > >> here: https://wiki.postgresql.org/wiki/UPSERT#VoltDB.27s_UPSERT > > > > Perhaps I'm missing it, but the reasons that I see there appear to be: > > > > "It'd be like SELECT *" and "we'd have to decide what to do about the > > value for unspecified columns". As for the latter- we have to do that > > *anyway*, no? What happens if you do: > > > > INSERT INTO mytable (foo, bar, baz, bat) VALUES ('key1','key2','val1','val2') > > ON CONFLICT (foo) UPDATE SET (baz) = (EXCLUDED.baz); > > > > ? > > It's like any other UPDATE - the values of columns not appearing in > the targetlist are unchanged from the original row version now > superseded. It doesn't matter that you had some other values in the > INSERT. You only get what you ask for. Ok, that makes sense.. So is the concern that an INSERT would end up getting default values while an UPDATE would preserve whatever's there? I don't see that as an issue. Are you still against having a way to say "go forth and update whatever non-conflicting columns I've specified in the INSERT, if there is a conflict"..? Again, not saying it has to be done now, but it'd certainly be nice if we had it initially because otherwise the ORMs and "frameworks" of the world will be stuck supporting the more verbose approach for as long as we support it (~5 years..). Thanks! Stephen
On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote: > Ok, that makes sense.. So is the concern that an INSERT would end up > getting default values while an UPDATE would preserve whatever's there? > > I don't see that as an issue. I think it easily could be. > Are you still against having a way to say "go forth and update whatever > non-conflicting columns I've specified in the INSERT, if there is a > conflict"..? > > Again, not saying it has to be done now, but it'd certainly be nice if > we had it initially because otherwise the ORMs and "frameworks" of the > world will be stuck supporting the more verbose approach for as long as > we support it (~5 years..). The more verbose approach is entirely necessary much of the time. For example: insert into upsert_race_test (index, count) values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count; Merging like this will be a very common requirement. -- Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote: > On Sun, Apr 26, 2015 at 11:35 AM, Stephen Frost <sfrost@snowman.net> wrote: > > Ok, that makes sense.. So is the concern that an INSERT would end up > > getting default values while an UPDATE would preserve whatever's there? > > > > I don't see that as an issue. > > I think it easily could be. Ok.. Can you elaborate on that? Would it be an issue that's different from the same thing done as independent commands? Perhaps it'd be an issue for individuals who attempt to combine some more complicated INSERT/UPDATE logic and don't realize that they'd get whatever the existing value is for the non-specified columns rather than the default value, but I'm sure they'd realize it on testing it and, well, there's lots of ways users can misuse SQL and PG and get what they expect 99% of the time (JOIN would be a great example..) only to have things break one day. > > Are you still against having a way to say "go forth and update whatever > > non-conflicting columns I've specified in the INSERT, if there is a > > conflict"..? > > > > Again, not saying it has to be done now, but it'd certainly be nice if > > we had it initially because otherwise the ORMs and "frameworks" of the > > world will be stuck supporting the more verbose approach for as long as > > we support it (~5 years..). > > The more verbose approach is entirely necessary much of the time. For example: > > insert into upsert_race_test (index, count) > values ('541','-1') on conflict update set count=TARGET.count + EXCLUDED.count; > > Merging like this will be a very common requirement. I was just about to reply to myself that I didn't intend to say that we would remove the more verbose syntax but rather that they'd have to use the more verbose syntax as long as we supported a release which *didn't* have the simpler syntax, which would be ~5 years. Thanks! Stephen
On Sun, Apr 26, 2015 at 11:43 AM, Stephen Frost <sfrost@snowman.net> wrote: >> I think it easily could be. > > Ok.. Can you elaborate on that? Would it be an issue that's different > from the same thing done as independent commands? I think that the stuff I linked to describes my concerns exhaustively. In any case, it's a discussion for another day. -- Peter Geoghegan
On Sun, Apr 26, 2015 at 09:34:12AM -0400, Stephen Frost wrote: > * Heikki Linnakangas (hlinnaka@iki.fi) wrote: > > On 04/25/2015 12:01 PM, Andres Freund wrote: > > >INSERT ... ON CONFLICT (cola, colb [WHERE predicate_for_partial]) UPDATE|IGNORE > > > > > >My problem with the WHERE being inside the parens in the above is that > > >it's > > >a) different from CREATE INDEX > > >b) unclear whether the WHERE belongs to colb or the whole index > > > expression. The equivalent for aggregates, which I bet is going to be > > > used less often, caused a fair amount of confusing. > > > > > >That's why I wanted the WHERE outside the (), which requires either > > >adding DO between the index inference clause, and the action, to avoid > > >ambiguities in the grammar. > > > > Yeah, having the WHERE outside the parens seems much nicer. What is > > the ambiguity? > > I like having it outside the parens also. Agreed, and I like the DO [ UPDATE | NOTHING ] too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On 4/25/15 2:05 PM, Peter Geoghegan wrote: > Note that the syntax is quite similar to the SQLite > syntax of the same feature, that has ON CONFLICT IGNORE (it also has > ON CONFLICT REPLACE, but not ON CONFLICT UPDATE). I don't know anything about SQLite's syntax, but from the online syntax diagrams https://www.sqlite.org/lang_insert.html https://www.sqlite.org/syntax/conflict-clause.html it appears that they are using quite a different syntax. The ON CONFLICT clause is attached to a constraint, specifying the default action for that constraint. The INSERT command can then override this default choice. I think.
On Mon, Apr 27, 2015 at 1:19 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > it appears that they are using quite a different syntax. The ON > CONFLICT clause is attached to a constraint, specifying the default > action for that constraint. The INSERT command can then override this > default choice. I think. Well, MySQL's ON DUPLICATE KEY UPDATE thing is pretty close to what I have. I intend CONFLICT as a broader term, which is somewhat similar to SQLite (and is not needlessly verbose). -- Peter Geoghegan
On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote: > Agreed, and I like the DO [ UPDATE | NOTHING ] too. Here is what I think I need to do: * Don't change the ON CONFLICT spelling. * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias TARGET.*). Those seem fine to me as well. * Change the syntax to put the WHERE clause used to infer partial indexes outside parens. * Change the syntax to make this work, by adding the fully reserved keyword DO. Assuming you have a partial index (WHERE is_active) on the column "key", you're left with: INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ; or: INSERT .... ON CONFLICT (key) where is_active DO NOTHING; The DO keyword makes this work where it cannot otherwise, because it's a RESERVED_KEYWORD. * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so you can name (exactly one) constraint by name. Particularly useful for unique constraints. I really don't want to make this accept multiple constraints, even though we may infer multiple constraints, because messy, and is probably too complex to every be put to good use (bearing in mind that exclusion constraints, that really need this, will still only be supported by the IGNORE/DO NOTHING variant). Are we in agreement? -- Peter Geoghegan
On Mon, Apr 27, 2015 at 4:21 PM, Peter Geoghegan <pg@heroku.com> wrote: > * Don't change the ON CONFLICT spelling. > > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias > TARGET.*). Those seem fine to me as well. > > * Change the syntax to put the WHERE clause used to infer partial > indexes outside parens. > > * Change the syntax to make this work, by adding the fully reserved > keyword DO. Assuming you have a partial index (WHERE is_active) on the > column "key", you're left with: > > INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ; > > or: > > INSERT .... ON CONFLICT (key) where is_active DO NOTHING; > > The DO keyword makes this work where it cannot otherwise, because it's > a RESERVED_KEYWORD. I've pushed code that does all this to Github. Documentation changes will follow soon. > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so > you can name (exactly one) constraint by name. Particularly useful for > unique constraints. I really don't want to make this accept multiple > constraints, even though we may infer multiple constraints, because > messy, and is probably too complex to every be put to good use > (bearing in mind that exclusion constraints, that really need this, > will still only be supported by the IGNORE/DO NOTHING variant). I have yet to do this, but it should be fairly straightforward. -- Peter Geoghegan
On 28/04/15 03:51, Peter Geoghegan wrote: > On Mon, Apr 27, 2015 at 4:21 PM, Peter Geoghegan <pg@heroku.com> wrote: >> * Don't change the ON CONFLICT spelling. >> >> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias >> TARGET.*). Those seem fine to me as well. >> >> * Change the syntax to put the WHERE clause used to infer partial >> indexes outside parens. >> >> * Change the syntax to make this work, by adding the fully reserved >> keyword DO. Assuming you have a partial index (WHERE is_active) on the >> column "key", you're left with: >> >> INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ; >> >> or: >> >> INSERT .... ON CONFLICT (key) where is_active DO NOTHING; >> >> The DO keyword makes this work where it cannot otherwise, because it's >> a RESERVED_KEYWORD. > > I've pushed code that does all this to Github. Documentation changes > will follow soon. > >> * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so >> you can name (exactly one) constraint by name. Particularly useful for >> unique constraints. I really don't want to make this accept multiple >> constraints, even though we may infer multiple constraints, because >> messy, and is probably too complex to every be put to good use >> (bearing in mind that exclusion constraints, that really need this, >> will still only be supported by the IGNORE/DO NOTHING variant). > > I have yet to do this, but it should be fairly straightforward. > Most of this sounds reasonable to me. However I still dislike the "ON CONFLICT ON" part, the idea of WITH Andres suggested feels better. I am also very sure that every time I'll write this statement I will have to look into manual for the names of TARGET and EXCLUDED because they don't seem intuitive to me at all (especially the EXCLUDED). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: > I am also very sure that every time I'll write this statement I will have to > look into manual for the names of TARGET and EXCLUDED because they don't > seem intuitive to me at all (especially the EXCLUDED). Same here. I don't understand why 'CONFLICTING' would be more ambiguous than EXCLUDED (as Peter argued earlier). Especially given that the whole syntax is called ON CONFLICT. Greetings, Andres Freund
* Andres Freund (andres@anarazel.de) wrote: > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: > > I am also very sure that every time I'll write this statement I will have to > > look into manual for the names of TARGET and EXCLUDED because they don't > > seem intuitive to me at all (especially the EXCLUDED). > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous > than EXCLUDED (as Peter argued earlier). Especially given that the whole > syntax is called ON CONFLICT. Any way we can alias it? Both of those strike me as annoyingly long and if we could allow an alias then people can do whatever they want... No, I haven't got any suggestion on how to do that. :) It's also something we can probably improve on in the future... Thanks! Stephen
On 2015-04-28 10:40:10 -0400, Stephen Frost wrote: > * Andres Freund (andres@anarazel.de) wrote: > > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: > > > I am also very sure that every time I'll write this statement I will have to > > > look into manual for the names of TARGET and EXCLUDED because they don't > > > seem intuitive to me at all (especially the EXCLUDED). > > > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous > > than EXCLUDED (as Peter argued earlier). Especially given that the whole > > syntax is called ON CONFLICT. > > Any way we can alias it? Both of those strike me as annoyingly long and > if we could allow an alias then people can do whatever they want... > > No, I haven't got any suggestion on how to do that. :) > > It's also something we can probably improve on in the future... I earlier suggested NEW/OLD. I still think that's not too bad as I don't buy the argument that they're too associated with rules. Greetings, Andres Freund
* Andres Freund (andres@anarazel.de) wrote: > On 2015-04-28 10:40:10 -0400, Stephen Frost wrote: > > * Andres Freund (andres@anarazel.de) wrote: > > > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: > > > > I am also very sure that every time I'll write this statement I will have to > > > > look into manual for the names of TARGET and EXCLUDED because they don't > > > > seem intuitive to me at all (especially the EXCLUDED). > > > > > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous > > > than EXCLUDED (as Peter argued earlier). Especially given that the whole > > > syntax is called ON CONFLICT. > > > > Any way we can alias it? Both of those strike me as annoyingly long and > > if we could allow an alias then people can do whatever they want... > > > > No, I haven't got any suggestion on how to do that. :) > > > > It's also something we can probably improve on in the future... > > I earlier suggested NEW/OLD. I still think that's not too bad as I don't > buy the argument that they're too associated with rules. +1, NEW/OLD seem pretty natural and I'm not worried about what they look like in rules, and their usage in triggers matches up with what they'd mean here, I'd think. Thanks! Stephen
On 28/04/15 16:44, Andres Freund wrote: > On 2015-04-28 10:40:10 -0400, Stephen Frost wrote: >> * Andres Freund (andres@anarazel.de) wrote: >>> On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: >>>> I am also very sure that every time I'll write this statement I will have to >>>> look into manual for the names of TARGET and EXCLUDED because they don't >>>> seem intuitive to me at all (especially the EXCLUDED). >>> >>> Same here. I don't understand why 'CONFLICTING' would be more ambiguous >>> than EXCLUDED (as Peter argued earlier). Especially given that the whole >>> syntax is called ON CONFLICT. >> >> Any way we can alias it? Both of those strike me as annoyingly long and >> if we could allow an alias then people can do whatever they want... >> >> No, I haven't got any suggestion on how to do that. :) >> >> It's also something we can probably improve on in the future... > > I earlier suggested NEW/OLD. I still think that's not too bad as I don't > buy the argument that they're too associated with rules. > Hmm, I would never think of rules when talking about NEW/OLD, the association I have is with triggers. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
+1, NEW/OLD seem pretty natural and I'm not worried about what they look
like in rules, and their usage in triggers matches up with what they'd
mean here, I'd think.
Since I've stuck my head above the parapet once I figured I'd give m
y 2p's worth: IMHO
NEW/OLD doesn't fit at all. In triggers you're applying it to something that (without the trigger) would be the new or old version of a matching row
, so it's completely intuitive
; in this instance without the ON CONFLICT there would never be a "
new"
, because it would be a
failure
.
MySQL uses VALUES(columnname) to reference the intended INSERT value (what you might term "NEW") and the target name to reference "OLD". I understand that people might think the bracketed syntax isn't very pleasant because that looks like a function, but it seems more reasonable than NEW (can we use VALUES.columname?); finally I don't see why we need an "OLD" (or TARGET) at all - am I missing the point?
Geoff
On 28 April 2015 at 15:57, I wrote:
MySQL uses VALUES(columnname) to reference the intended INSERT value (what you might term "NEW") and the target name to reference "OLD". I understand that people might think the bracketed syntax isn't very pleasant because that looks like a function, but it seems more reasonable than NEW (can we use VALUES.columname?);
On balance I think I
don't like VALUES.column either, because although it looks all fine when you're doing a single INSERT ... VALUES () it gets confusing if you're INSERTing from a SELECT.
As you were. :(
Geoff
On Tue, Apr 28, 2015 at 7:38 AM, Andres Freund <andres@anarazel.de> wrote: > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: >> I am also very sure that every time I'll write this statement I will have to >> look into manual for the names of TARGET and EXCLUDED because they don't >> seem intuitive to me at all (especially the EXCLUDED). > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous > than EXCLUDED (as Peter argued earlier). Especially given that the whole > syntax is called ON CONFLICT. Because the TARGET and EXCLUDED tuples conflict with each other - they're both conflicting. -- Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote: > On Tue, Apr 28, 2015 at 7:38 AM, Andres Freund <andres@anarazel.de> wrote: > > On 2015-04-28 16:36:28 +0200, Petr Jelinek wrote: > >> I am also very sure that every time I'll write this statement I will have to > >> look into manual for the names of TARGET and EXCLUDED because they don't > >> seem intuitive to me at all (especially the EXCLUDED). > > > > Same here. I don't understand why 'CONFLICTING' would be more ambiguous > > than EXCLUDED (as Peter argued earlier). Especially given that the whole > > syntax is called ON CONFLICT. > > Because the TARGET and EXCLUDED tuples conflict with each other - > they're both conflicting. I agree with that, but how are NEW and OLD ambiguous? NEW is clearly the tuple being added, while OLD is clearly the existing tuple. Now that I think about it though, where that'd get ugly is using this command *inside* a trigger function, which I can certainly imagine people will want to do... Thanks, Stephen
On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote: > I agree with that, but how are NEW and OLD ambiguous? NEW is clearly > the tuple being added, while OLD is clearly the existing tuple. Yes, but EXCLUDED is neither the tuple being added, nor is it the new tuple. It's something else entirely. -- Peter Geoghegan
On Tue, Apr 28, 2015 at 9:45 AM, Peter Geoghegan <pg@heroku.com> wrote: > Yes, but EXCLUDED is neither the tuple being added, nor is it the new > tuple. It's something else entirely. I mean, nor is it the existing tuple. -- Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote: > On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote: > > I agree with that, but how are NEW and OLD ambiguous? NEW is clearly > > the tuple being added, while OLD is clearly the existing tuple. > > Yes, but EXCLUDED is neither the tuple being added, nor is it the new > tuple. It's something else entirely. I don't see that, it's exactly the tuple attempting to be inserted. I agree that it might not be the tuple originally in the INSERT statement due to before triggers, but there isn't an alias anywhere for that. Now, in 99% of cases there aren't going to be before triggers so I'm not particularly worried about that distinction, nor do I think we need to provide an alias for the tuple from the INSERT piece of the clause, but to say that EXCLUDED isn't the tuple being added doesn't make any sense to me, based on how I read the documentation proposed here: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html I'll further point out that the documentation doesn't bother to make the before-trigger distinction always either: Note that an EXCLUDED expression is used to reference values originallyproposed for insertion: Perhaps I've missed something, but that seems to go along with the notion that EXCLUDED references the tuple that we're attempting to add through the INSERT, and that's certainly what I'd consider NEW to be also. I do think there's a real issue with using OLD/NEW because of their usage in triggers, and I don't see any good solution to that issue. :( Thanks, Stephen
On Tue, Apr 28, 2015 at 7:36 AM, Petr Jelinek <petr@2ndquadrant.com> wrote: > I am also very sure that every time I'll write this statement I will have to > look into manual for the names of TARGET and EXCLUDED because they don't > seem intuitive to me at all (especially the EXCLUDED). According to wordcount.org, the word "exclude" is ranked # 5796 in the English language in terms of frequency of use. It's in the vocabulary of 6 year olds. I don't know why you find it counter-intuitive, since it perfectly describes the purpose of the tuple. -- Peter Geoghegan
* Peter Geoghegan (pg@heroku.com) wrote:
> On Tue, Apr 28, 2015 at 9:42 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > I agree with that, but how are NEW and OLD ambiguous? NEW is clearly
> > the tuple being added, while OLD is clearly the existing tuple.
>
> Yes, but EXCLUDED is neither the tuple being added, nor is it the new
> tuple. It's something else entirely.
So? I see this as a prime case for choosing practicality/functionality over precision.
If I was to pick 2 words I would probably pick "PROPOSED" and "EXISTING".
But, the syntax is already verbose and being able to use a three-letter reference has its own appeal.
I don't see that, it's exactly the tuple attempting to be inserted. I
agree that it might not be the tuple originally in the INSERT statement
due to before triggers, but there isn't an alias anywhere for that.
Now, in 99% of cases there aren't going to be before triggers so I'm not
particularly worried about that distinction, nor do I think we need to
provide an alias for the tuple from the INSERT piece of the clause, but
to say that EXCLUDED isn't the tuple being added doesn't make any sense
to me, based on how I read the documentation proposed here:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html
This example exemplifies the poorness of the proposed wording, IMO:
[...]
SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'NEW.dname || '(formerly ' || OLD.dname || ')' reads perfectly well.
Yes, this is an isolated example...but am I missing the fact that there is a third tuple that needs to be referenced?
If there are only two the choices of NEW and OLD seem to be both easily learned and readable.
David J.
On Tue, Apr 28, 2015 at 10:36 AM, David G. Johnston <david.g.johnston@gmail.com> wrote: > This example exemplifies the poorness of the proposed wording, IMO: > > > [...] > SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')' > > NEW.dname || '(formerly ' || OLD.dname || ')' reads perfectly well. > > Yes, this is an isolated example...but am I missing the fact that there is a > third tuple that needs to be referenced? > > If there are only two the choices of NEW and OLD seem to be both easily > learned and readable. Whatever Andres and/or Heikki want is what I'll agree to. Honestly, I just don't care anymore. -- Peter Geoghegan
On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote: >> Agreed, and I like the DO [ UPDATE | NOTHING ] too. > > Here is what I think I need to do: > > * Don't change the ON CONFLICT spelling. What I proposed originally was ON DUPLICATE. Not ON CONFLICT. And I still like that better. ON UNIQUE CONFLICT, which Andres mentioned, gets us there too, but it's > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias > TARGET.*). Those seem fine to me as well. There seem to be a few votes for NEW and OLD. That's what I proposed originally, and (surprise, surprise) I still like that better too. > * Change the syntax to put the WHERE clause used to infer partial > indexes outside parens. +1. > * Change the syntax to make this work, by adding the fully reserved > keyword DO. Assuming you have a partial index (WHERE is_active) on the > column "key", you're left with: > > INSERT .... ON CONFLICT (key) where is_active DO UPDATE set ... WHERE ... ; > > or: > > INSERT .... ON CONFLICT (key) where is_active DO NOTHING; > > The DO keyword makes this work where it cannot otherwise, because it's > a RESERVED_KEYWORD. Seems fine. > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so > you can name (exactly one) constraint by name. Particularly useful for > unique constraints. I really don't want to make this accept multiple > constraints, even though we may infer multiple constraints, because > messy, and is probably too complex to every be put to good use > (bearing in mind that exclusion constraints, that really need this, > will still only be supported by the IGNORE/DO NOTHING variant). I still think that constraints should never be named in the syntax. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
* Robert Haas (robertmhaas@gmail.com) wrote: > On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote: > > On Mon, Apr 27, 2015 at 10:20 AM, Bruce Momjian <bruce@momjian.us> wrote: > >> Agreed, and I like the DO [ UPDATE | NOTHING ] too. > > > > Here is what I think I need to do: > > > > * Don't change the ON CONFLICT spelling. > > What I proposed originally was ON DUPLICATE. Not ON CONFLICT. And I > still like that better. ON UNIQUE CONFLICT, which Andres mentioned, > gets us there too, but it's My general feeling is "keep it short" but I'm not particular beyond that. I do like the idea that we'll be able to support more options in the future. > > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias > > TARGET.*). Those seem fine to me as well. > > There seem to be a few votes for NEW and OLD. That's what I proposed > originally, and (surprise, surprise) I still like that better too. I was promoting NEW/OLD, until I realized that we'd end up having a problem in trigger functions because NEW/OLD are already defined there, unless you have a suggestion for how to improve on that? > > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so > > you can name (exactly one) constraint by name. Particularly useful for > > unique constraints. I really don't want to make this accept multiple > > constraints, even though we may infer multiple constraints, because > > messy, and is probably too complex to every be put to good use > > (bearing in mind that exclusion constraints, that really need this, > > will still only be supported by the IGNORE/DO NOTHING variant). > > I still think that constraints should never be named in the syntax. I guess I don't see a particular problem with that..? Perhaps I'm missing something, but if there's multiple ways for something to conflict, it might be nice to be able to differentiate between them? Then again, I'm not sure if that's what the intent here is. Thanks! Stephen
On Wed, Apr 29, 2015 at 3:13 PM, Stephen Frost <sfrost@snowman.net> wrote: > My general feeling is "keep it short" but I'm not particular beyond > that. I do like the idea that we'll be able to support more options in > the future. Yeah. To me "ON CONFLICT" sounds like "if there's a war, then...". So I like ON DUPLICATE, which I think is clear as crystal. But if we settle on something else I won't cry myself to sleep. >> > * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias >> > TARGET.*). Those seem fine to me as well. >> >> There seem to be a few votes for NEW and OLD. That's what I proposed >> originally, and (surprise, surprise) I still like that better too. > > I was promoting NEW/OLD, until I realized that we'd end up having a > problem in trigger functions because NEW/OLD are already defined there, > unless you have a suggestion for how to improve on that? I don't. That's a good point. >> > * Finally, add "ON CONFLICT ON CONSTRAINT my_constraint" support, so >> > you can name (exactly one) constraint by name. Particularly useful for >> > unique constraints. I really don't want to make this accept multiple >> > constraints, even though we may infer multiple constraints, because >> > messy, and is probably too complex to every be put to good use >> > (bearing in mind that exclusion constraints, that really need this, >> > will still only be supported by the IGNORE/DO NOTHING variant). >> >> I still think that constraints should never be named in the syntax. > > I guess I don't see a particular problem with that..? Perhaps I'm > missing something, but if there's multiple ways for something to > conflict, it might be nice to be able to differentiate between them? > Then again, I'm not sure if that's what the intent here is. So, with unique indexes, people can create an index concurrently, then drop the old index concurrently, and nothing breaks. I don't think we have a similar capacity for constraints at the moment, but we should. When somebody does that dance, the object names change, but all of the DML keeps working. That's a property I'd like to preserve. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2015-04-29 15:31:59 -0400, Robert Haas wrote: > On Wed, Apr 29, 2015 at 3:13 PM, Stephen Frost <sfrost@snowman.net> wrote: > >> I still think that constraints should never be named in the syntax. > > > > I guess I don't see a particular problem with that..? Perhaps I'm > > missing something, but if there's multiple ways for something to > > conflict, it might be nice to be able to differentiate between them? > > Then again, I'm not sure if that's what the intent here is. > > So, with unique indexes, people can create an index concurrently, then > drop the old index concurrently, and nothing breaks. I don't think we > have a similar capacity for constraints at the moment, but we should. > When somebody does that dance, the object names change, but all of the > DML keeps working. That's a property I'd like to preserve. On the other hand it's way more convenient to specify a single constraint name than several columns and a predicate. I'm pretty sure there's situations where I a) rather live with a smaller chance of error during a replacement of the constraint b) if we get concurrently replaceable constraints the naming should be doable too. I don't see your argument strong enough to argue against allowing this *as an alternative*. Greetings, Andres Freund
On 25 April 2015 at 14:05, Peter Geoghegan <pg@heroku.com> wrote:
--
> a) Why is is 'CONFLICT"? We're talking about a uniquness violation. What
> if we, at some later point, also want to handle other kind of
> violations? Shouldn't it be ON UNIQUE CONFLICT/ERROR/VIOLATION ...
I think that naming unique violations alone would be wrong (not to
mention ludicrously verbose). Heikki and I both feel that the CONFLICT
keyword captures the fact that this could be a dup violation, or an
exclusion violation. The syntax has been like this for some time, and
hasn't been a point of contention for a long time, so I thought this
was settled.
I dislike the way that ignoring objections for a period leads them to be potentially discarded. I'd prefer to think that as a community we are able to listen to people even when they aren't continually present to reinforce the original objection(s).
Not supporting MySQL syntax will seem like a bizarre choice to people watching this from a distance. I accept that the patch implements useful behaviour that MySQL does not implement and we thus provide enhanced syntax, but the default should be match on PK using the MySQL syntax.
Note that the syntax is quite similar to the SQLite
syntax of the same feature, that has ON CONFLICT IGNORE (it also has
ON CONFLICT REPLACE, but not ON CONFLICT UPDATE).
Why are we not also supporting ON CONFLICT REPLACE and IGNORE then?
If we are using syntax from other products then it should be identical syntax, or the argument to use it doesn't stand.
We must think about what SQL Standard people are likely to say and do. If we act as independently, our thought may be ignored. If we act in support of other previous implementations we may draw support to adopt that as the standard. Whatever the standard says we will eventually support, so we should be acting with an eye to that future.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Apr 29, 2015 at 4:09 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > I dislike the way that ignoring objections for a period leads them to be > potentially discarded. I'd prefer to think that as a community we are able > to listen to people even when they aren't continually present to reinforce > the original objection(s). What objection was ignored? Because, I looked just now, and the only thing I could find of substance that you said on the MySQL syntax [1] seemed pretty lukewarm. You mostly argued for MERGE. > Not supporting MySQL syntax will seem like a bizarre choice to people > watching this from a distance. I accept that the patch implements useful > behaviour that MySQL does not implement and we thus provide enhanced syntax, > but the default should be match on PK using the MySQL syntax. Does it? We can't use the MERGE syntax, because this isn't MERGE. Everything else UPSERT-like some new and distinct custom syntax, for various reasons. >> Note that the syntax is quite similar to the SQLite >> syntax of the same feature, that has ON CONFLICT IGNORE (it also has >> ON CONFLICT REPLACE, but not ON CONFLICT UPDATE). > > > Why are we not also supporting ON CONFLICT REPLACE and IGNORE then? The point I was trying to make was that CONFLICT also appears as a more general term than "duplicate key" or whatever. > If we are using syntax from other products then it should be identical > syntax, or the argument to use it doesn't stand. I was making a narrow point about the keyword "CONFLICT". Nothing more. > We must think about what SQL Standard people are likely to say and do. If we > act as independently, our thought may be ignored. If we act in support of > other previous implementations we may draw support to adopt that as the > standard. Whatever the standard says we will eventually support, so we > should be acting with an eye to that future. UPSERT seems like exactly the kind of thing that the SQL standard does not concern itself with. For example, I have a "unique index inference specification". The SQL standard does not have anything to say about indexes. I would be extremely surprised if the SQL standard adopted MySQL's UPSERT thing. They omit the "SET" on the UPDATE, probably to fix some parser ambiguity issue. While there are some similarities to what I have here, it's a bit shoddy. I have requirements coming out of my ears for this patch, Simon. I think it's odd that you're taking umbrage because I supposedly ignored something you said 6 months ago. [1] http://www.postgresql.org/message-id/CA+U5nMK-efLg00FhCWk=ASbET_77iSS87EGDsptq0uKzQdrV6Q@mail.gmail.com -- Peter Geoghegan
* Andres Freund (andres@anarazel.de) wrote: > On the other hand it's way more convenient to specify a single > constraint name than several columns and a predicate. I'm pretty sure > there's situations where I a) rather live with a smaller chance of error > during a replacement of the constraint b) if we get concurrently > replaceable constraints the naming should be doable too. > > I don't see your argument strong enough to argue against allowing this > *as an alternative*. Agreed. Thanks! Stephen
On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > The ability to specify a constraint by name hasn't been implemented, but > that would read quite naturally as: > > INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ... For the record, I have made this change on Github (Andres and I have been doing a lot of clean-up. I point this change in particular out because it's a behavioral change). The INSERT documentation has been updated to reflect this, and includes an example. This copy of the documentation is current: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html -- Peter Geoghegan
On Wed, Apr 29, 2015 at 12:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias >> TARGET.*). Those seem fine to me as well. > > There seem to be a few votes for NEW and OLD. That's what I proposed > originally, and (surprise, surprise) I still like that better too. That makes the following valid: INSERT INTO distributors (did, dname) VALUES (5, 'Gizmo transglobal') ON CONFLICT (did) DO UPDATE SET dname = NEW.dnameRETURNING OLD.dname; So you're projecting "OLD.dname" from RETURNING, here -- so "OLD" refers to the row added back to the relation on update (or perhaps the row simply inserted). That's pretty bad. I really don't want to add a kludge to make the target relation have an alias in one context but not in the other. -- Peter Geoghegan
On 05/05/2015 12:16 AM, Peter Geoghegan wrote: > On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> The ability to specify a constraint by name hasn't been implemented, but >> that would read quite naturally as: >> >> INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ... > > For the record, I have made this change on Github ... Great, thanks. I'm a bit late to the party as I haven't paid much attention to the syntax before, but let me give some comments on this "arbiter index inference" thingie. To recap, there are three variants: A. INSERT ... ON CONFLICT DO NOTHING No arbiter is specified. This means that a conflict on any unique or exclusion constraint is not allowed (and will do nothing instead). This variant is only accepted for DO NOTHING. B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE In this variant, you explicitly specify the constraint by name. C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO NOTHING/UPDATE This specifies an index (or indexes, in the corner case that there are several identical ones), by listing the columns/expressions and the predicate for a partial index. The list of columns and WHERE match the syntax for CREATE INDEX. That's pretty good overall. A few questions: 1. Why is the variant without specifying an index or constraint not allowed with DO UPDATE? I agree it might not make much sense, but then again, it might. If we're afraid that it's too unsafe to be the "default" if you don't specify any constraint, how about allowing it with a more verbose "ON CONFLICT ON ANY CONSTRAINT" syntax? 2. Why can't you specify multiple constraints, even though we implicitly allow "any" with the first variant? Finally, a couple of suggestions. It would be pretty handy to allow: INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE Also, I wonder if we should change the B syntax to be: INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE That would allow the syntax can be expanded in the future to specify conflicts on other kind of things. The "ON PRIMARY KEY" syntax should be unambiguous with out, because PRIMARY is a reserved keyword, but for example, we might want to add "ON UNIQUE INDEX <index name>" later. - Heikki
I've read through all the error messages in the patch. Some comments: ==== postgres=# create table foo (id int4); CREATE TABLE postgres=# create unique index foo_y on foo (id) where id > 0; CREATE INDEX postgres=# insert into foo values (-1) on conflict (id) where id > 0 do nothing; ERROR: inferred arbiter partial unique index's predicate does not cover tuple proposed for insertion DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion must be covered by the predicate of partial index "foo_y". I'm surprised. If the inserted value doesn't match the WHERE clause of the constraint, there is clearly no conflict, so I would assume the above to work without error. ==== postgres=# create table foo (id int4 primary key, col2 int4, constraint col2_unique unique (col2) deferrable); CREATE TABLE postgres=# insert into foo values (2) on conflict on foo_pkey do nothing; ERROR: ON CONFLICT is not supported on relations with deferred unique constraints/exclusion constraints Why not? I would understand if the specified arbiter constraint is deferred, but why does it matter if one of the other constraints is? ==== postgres=# create table foo (id int4 primary key); CREATE TABLE postgres=# begin isolation level serializable; BEGIN postgres=# select 1; -- to get the snapshot ?column? ---------- 1 (1 row) <in another session, insert 1> postgres=# insert into foo values (1) on conflict on foo_pkey do nothing; ERROR: could not serialize access due to concurrent insert or update directing alternative ON CONFLICT path What about a delete? Don't you get a serialization error, if another backend deletes the conflicting tuple, and you perform an INSERT based on the effects of the deleting transaction, which is not supposed to be visible to you? The error message is quite long. How about just giving the default "could not serialize access due to concurrent update" message? I don't think the "directing alternative ON CONFLICT path" really helps the user. ==== postgres=# insert into foo values (1), (2) on conflict on foo_pkey do update set id = 2; ERROR: ON CONFLICT DO UPDATE command could not lock/update self-inserted tuple HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values. "lock/update" doesn't sound good. If the system knows which it is, should say so. But in this case, locking the row just an implementation detail. We're performing the UPDATE when that error happens, not locking. How about: ERROR: could not update tuple that was inserted in the same command HINT: Ensure that the rows being inserted do not conflict with each other. BTW, the tuple might be an updated version of an existing tuple, i.e. the ON CONFLICT DO UPDATE ... was fired on an earlier row. Not necessarily a row that was inserted in the same command. The message is technically correct, but you need to understand the difference between tuple and a row. Actually, should we avoid using the term "tuple" altogether in user-facing errors? ==== postgres=# insert into foo values (1) on conflict (xmin) do nothing; ERROR: system columns may not appear in unique index inference specification "may" is usually not the right word to use in error messages, as it implies permission to do something (see style guide). It's not totally inappropriate in this case, but how about: ERROR: system columns cannot be used in an ON CONFLICT clause ==== postgres=# create table foo (id int4, constraint xcheck CHECK (id > 0));CREATE TABLE postgres=# insert into foo values (1) on conflict on xcheck do nothing; ERROR: constraint in ON CONFLICT clause has no associated index How about: ERROR: "xcheck" is a CHECK constraint DETAIL: Only unique or exclusion constraints can be used in ON CONFLICT ON clause. That's assuming that "xcheck" actually is a CHECK constraint. Similarly for a foreign key constraint. ==== postgres=# create table foo (id int4, constraint foo_x exclude using gist (id with =) ); CREATE TABLE postgres=# insert into foo values (1) on conflict on foo_x do update set id=-1; ERROR: "foo_x" is not a unique index DETAIL: ON CONFLICT DO UPDATE requires unique index as arbiter. I think it would be better to refer to the constraint here, rather than the index. The user specified the constraint, the fact that it's backed by an index is just an implementation detail. Actually, the user specified an exclusion constraint, and expected DO UPDATE to work with that. But we don't support that. So we should say: ERROR: ON CONFLICT ... DO UPDATE not supported with exclusion constraints ==== postgres=# create table foo (id int4 primary key, t text); CREATE TABLE postgres=# insert into foo values (1) on conflict (t) do nothing; ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT I think we should assume that the user listed the columns correctly, but there is no constraint on them. The current message implies that the list of columns was wrong. Either case is possible, of course, but I'd suggest: ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification ==== postgres=# insert into foo values (1) on conflict (t nulls first) do nothing; ERROR: ON CONFLICT does not accept ordering or NULLS FIRST/LAST specifications LINE 1: insert into foo values (1) on conflict (t nulls first) do n... ^ HINT: These factors do not affect uniqueness of indexed datums. I'd suggest: ERROR: NULLS FIRST/LAST is not allowed in ON CONFLICT clause and just leave out the hint. Or say something along the lines "You can just leave it out". Any chance the errlocation could point to the actual NULLS FIRST/LAST clause, not the paren starting the column list? And ERRCODE_SYNTAX_ERROR would be more appropriate for this. ==== postgres=# insert into foo values (1) on conflict do update set count = excluded.count + 1; ERROR: ON CONFLICT DO UPDATE requires explicit arbiter index LINE 1: insert into foo values (1) on conflict do update set count ... ^ "Hmm, so where do I stick the index then?" or "But I just created the index! Why can't the system find it?" Can we avoid exposing the term "arbiter index" to the user? From a user's point of view, he cannot specify an index directly. He can specify a list of columns and an optional WHERE clause, or a constraint name. How about: ERROR: DO UPDATE requires a constraint name or index specification HINT: For example, ON CONFLICT ON <constraint> or ON CONFLICT (<column>) ==== postgres=# insert into pg_roles values ('myrole') on conflict do nothing; ERROR: ON CONFLICT not supported with catalog relations LINE 1: insert into pg_roles values ('myrole') on conflict do nothi... ^ I think the more common term used in error messages is "system catalog table" ==== postgres=# create table foo (id int4 primary key); CREATE TABLE postgres=# create rule insert_foo as on insert to foo do also insert into bar values (-1); CREATE RULE postgres=# insert into foo values (1) on conflict on foo_pkey do update set id=1; ERROR: INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules That's not strictly true. A "DO INSTEAD NOTHING" rule doesn't prevent ON CONFLICT from working, for example. Not sure how to phrase that into the message without making it too long. "may" is again not very appropriate here. Should be "cannot". And the term "target relation" might not be very clear to an average user. ==== Phew, that was quite a list.. - Heikki
On 2015-05-04 14:16:42 -0700, Peter Geoghegan wrote: > On Sun, Apr 26, 2015 at 2:22 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > The ability to specify a constraint by name hasn't been implemented, but > > that would read quite naturally as: > > > > INSERT INTO mytable ON CONFLICT ON CONSTRAINT my_constraint UPDATE ... > > For the record, I have made this change on Github Theoretically this changes the pictures for FDWs, right? Right now there's + <para> + <command>INSERT</> with an <literal>ON CONFLICT</> clause is not + supported with a unique index inference specification, since a + conflict arbitrating unique index cannot meaningfully be inferred + on a foreign table (this implies that <literal>ON CONFLICT DO + UPDATE</> is never supported, since the specification is + mandatory there). + </para> but theoretically the constraint name could be meaningful on the other side... I don't think this is anyting for 9.5, but it might be interesting for later. Greetings, Andres Freund
On Tue, May 5, 2015 at 9:36 AM, Andres Freund <andres@anarazel.de> wrote: > Theoretically this changes the pictures for FDWs, right? Right now > there's > + <para> > + <command>INSERT</> with an <literal>ON CONFLICT</> clause is not > + supported with a unique index inference specification, since a > + conflict arbitrating unique index cannot meaningfully be inferred > + on a foreign table (this implies that <literal>ON CONFLICT DO > + UPDATE</> is never supported, since the specification is > + mandatory there). > + </para> > but theoretically the constraint name could be meaningful on the other > side... Well, the inference clause could be too -- in that sense, the constraint name isn't special at all. But you need to invent a way of making the optimizer infer an index on the foreign side (and even with a named constraint, we go from constraint Oid in the parser to pg_index Oid in the optimizer, so it's a similar process to regular inference). Of course, teaching the optimizer about foreign indexes is a whole new infrastructure. Note that this really is the explanation for why postgres_fdw only has limited support. Sure, I haven't added deparsing logic for ON CONFLICT UPDATE, but it would be pretty easy to do so, and that isn't the blocker at all. > I don't think this is anyting for 9.5, but it might be interesting for > later. Sure. -- Peter Geoghegan
On 2015-05-05 15:27:09 +0300, Heikki Linnakangas wrote: > I'm a bit late to the party as I haven't paid much attention to the syntax > before, but let me give some comments on this "arbiter index inference" > thingie. > > > To recap, there are three variants: > > A. INSERT ... ON CONFLICT DO NOTHING > > No arbiter is specified. This means that a conflict on any unique or > exclusion constraint is not allowed (and will do nothing instead). This > variant is only accepted for DO NOTHING. > > B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE > > In this variant, you explicitly specify the constraint by name. I do think it's a bit sad to not be able to specify unique indexes that aren't constraints. So I'd like to have a corresponding ON INDEX - which would be trivial. > C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO > NOTHING/UPDATE > > This specifies an index (or indexes, in the corner case that there are > several identical ones), by listing the columns/expressions and the > predicate for a partial index. The list of columns and WHERE match the > syntax for CREATE INDEX. > > > That's pretty good overall. A few questions: > > 1. Why is the variant without specifying an index or constraint not allowed > with DO UPDATE? I agree it might not make much sense, but then again, it > might. If we're afraid that it's too unsafe to be the "default" if you don't > specify any constraint, how about allowing it with a more verbose "ON > CONFLICT ON ANY CONSTRAINT" syntax? I think that'd be useful. Peter seems to be against it on pureness grounds when we argued against it before, but I know that I'd wished for it before. > 2. Why can't you specify multiple constraints, even though we implicitly > allow "any" with the first variant? Yea. > Finally, a couple of suggestions. It would be pretty handy to allow: > > INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE Not sure if that really has that big of a use case, but it'd also be simple. > Also, I wonder if we should change the B syntax to be: > > INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE Oh yes. Greetings, Andres Freund
On Tue, May 5, 2015 at 5:27 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > To recap, there are three variants: > > A. INSERT ... ON CONFLICT DO NOTHING > > No arbiter is specified. This means that a conflict on any unique or > exclusion constraint is not allowed (and will do nothing instead). This > variant is only accepted for DO NOTHING. > > B. INSERT ... ON CONFLICT ON <constraint name> DO NOTHING/UPDATE > > In this variant, you explicitly specify the constraint by name. > > C. INSERT ... ON CONFLICT (<index params>) [WHERE <expression>] DO > NOTHING/UPDATE > > This specifies an index (or indexes, in the corner case that there are > several identical ones), by listing the columns/expressions and the > predicate for a partial index. The list of columns and WHERE match the > syntax for CREATE INDEX. I would just say that there are two variants, only one of which mandates the inference clause. But I'm nitpicking. > That's pretty good overall. A few questions: Thanks. I'm glad that we are now able to cover really any conceivable use-case, while playing nice with every existing feature (now updatable views are supported with ON CONFLICT DO UPDATE -- and we're also going to be able to suppor subqueries in the UPDATE). We've been incredibly thorough. > 1. Why is the variant without specifying an index or constraint not allowed > with DO UPDATE? I agree it might not make much sense, but then again, it > might. If we're afraid that it's too unsafe to be the "default" if you don't > specify any constraint, how about allowing it with a more verbose "ON > CONFLICT ON ANY CONSTRAINT" syntax? I think it's dangerous. It's basically wrong headed to omit any constraint for DO UPDATE. I put a lot of effort into covering every possible case with the inference clause, and I think it's pretty cool that we have something that's so flexible. I don't feel bad about forcing users to be explicit about what they want, because the only conceivable downside is that they'll have to do a little extra typing (if even that - it's probably going to be ORM-generated more often than not). The upside -- not having their query break unexpectedly one day, when a new constraint is added -- is huge. > 2. Why can't you specify multiple constraints, even though we implicitly > allow "any" with the first variant? It's just an escape hatch. I don't want to encourage its over use, and I want to keep the grammar simple. > Finally, a couple of suggestions. It would be pretty handy to allow: > > INSERT ... ON CONFLICT ON PRIMARY KEY DO NOTHING/UPDATE Is that really likely to be less verbose than naming the attributes directly? > INSERT ... ON CONFLICT ON *CONSTRAINT* <constraint name> DO NOTHING/UPDATE > > That would allow the syntax can be expanded in the future to specify > conflicts on other kind of things. The "ON PRIMARY KEY" syntax should be > unambiguous with out, because PRIMARY is a reserved keyword, but for > example, we might want to add "ON UNIQUE INDEX <index name>" later. Okay, we can do that. -- Peter Geoghegan
On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote: >> In this variant, you explicitly specify the constraint by name. > > I do think it's a bit sad to not be able to specify unique indexes that > aren't constraints. So I'd like to have a corresponding ON INDEX - which > would be trivial. Then what's the point of having ON CONSTRAINT? The point of it working that way was we're not exposing the "implementation detail" of the index. While I happen to think that that's a distinction without a difference anyway, that certainly was the idea. I would care about the fact that you can't name a unique index with no constraint if there wasn't already a way of doing that with inference (I'm thinking in particular of partial indexes here, which never have constraints). But there is. So what's the problem? -- Peter Geoghegan
On 2015-05-06 13:05:16 -0700, Peter Geoghegan wrote: > On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote: > >> In this variant, you explicitly specify the constraint by name. > > > > I do think it's a bit sad to not be able to specify unique indexes that > > aren't constraints. So I'd like to have a corresponding ON INDEX - which > > would be trivial. > > Then what's the point of having ON CONSTRAINT? That it supports exclusion constraints? > I would care about the fact that you can't name a unique index with no > constraint if there wasn't already a way of doing that with inference > (I'm thinking in particular of partial indexes here, which never have > constraints). But there is. So what's the problem? Personally I think a complex expression index is something many people will not want to specify every time. And since partial/expression indexes can't even have constraints... Greetings, Andres Freund
On Wed, May 6, 2015 at 1:22 PM, Andres Freund <andres@anarazel.de> wrote: > That it supports exclusion constraints? But so does just naming the index. I don't think it's significant that exclusion operators are in pg_constraint -- you could just as easily name the index, since that's all you ultimately end up with anyway. >> I would care about the fact that you can't name a unique index with no >> constraint if there wasn't already a way of doing that with inference >> (I'm thinking in particular of partial indexes here, which never have >> constraints). But there is. So what's the problem? > > Personally I think a complex expression index is something many people > will not want to specify every time. And since partial/expression > indexes can't even have constraints... The downsides seem severe. A CREATE INDEX CONCURRENTLY just broke your statement, because you didn't account for the new, equivalent unique index during inference, and now you have to drop the old index and break application code. Is that really worth introducing just to prevent app devs from writing the inference specification? The specification explicitly documents their intent, which seems like a good thing. Robert really disliked the idea of even naming the constraint, let alone the index. I'm trying to balance things, here. -- Peter Geoghegan
Andres pointed out on IM that the TARGET alias is a bit crummy. In particular, adding an ON CONFLICT DO UPDATE can make a RETURNING clause invalid, because we change the alias of the target rel: create table foo (id int4 primary key, t text); This works: postgres=# insert into foo (id, t) values (1, 'x') returning foo.t; t --- x (1 row) INSERT 0 1 Same statement with ON CONFLICT DO UPDATE fails: postgres=# insert into foo (id, t) values (1, 'x') on conflict (id) do update set t = 'x' returning foo.t; ERROR: invalid reference to FROM-clause entry for table "foo" LINE 1: ...'x') on conflict (id) do update set t = 'x' returning foo.t; ^ HINT: Perhaps you meant to reference the table alias "target". I'll see about fixing that. It's not just a matter of creating another alias for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that we attempted to insert, like it does without the ON CONFLICT. But actually, I don't much like the "target" alias in the first place. We never really completed this discussion, everyone just got tired: On 04/29/2015 10:13 PM, Stephen Frost wrote: > * Robert Haas (robertmhaas@gmail.com) wrote: >> On Mon, Apr 27, 2015 at 7:21 PM, Peter Geoghegan <pg@heroku.com> wrote: >>> * Don't change the names of the pseudo-alias EXCLUDED.* (or the alias >>> TARGET.*). Those seem fine to me as well. >> >> There seem to be a few votes for NEW and OLD. That's what I proposed >> originally, and (surprise, surprise) I still like that better too. > > I was promoting NEW/OLD, until I realized that we'd end up having a > problem in trigger functions because NEW/OLD are already defined there, > unless you have a suggestion for how to improve on that? Reading through this sub-thread, these spellings have been proposed: 1. TARGET and EXCLUDED 2. NEW and EXISTING 3. NEW and OLD 4. PROPOSED and EXISTING 5. CONFLICTING and EXISTING Did I miss any? Now, let me opine on these. EXCLUDED seems fine to me. I don't see us using that term elsewhere, and it makes me think of exclusion constraints, but nevertheless I think it's pretty easy remember what it means. TARGET, however, is totally inscrutable. Peter argued earlier that: > TARGET is also very descriptive, because it situationally describes > either the existing tuple actually present in the table, or (from a > RETURNING clause) the final tuple present in the table post-UPDATE. > We use the term "target" for that pervasively (in the docs and in the > code). but I find that totally unconvincing. It's clear that TARGET refers to the table being upserted, but it's totally unclear on *which* version of the tuple it refers to. NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the version after the UPDATE, and OLD to the version before. However, there's the serious problem that in a trigger function, OLD/NEW are already in use. How bad is that? At least in PL/pgSQL you can work around it by aliasing the variables, but it's a bit inconvenient. How often would INSERT .. ON CONFLICT DO UPDATE be used in a trigger? I don't have much to say about the rest. PROPOSED, EXISTING, CONFLICTING, they're all fairly descriptive, but long. - Heikki
On 2015-05-06 13:37:07 -0700, Peter Geoghegan wrote: > On Wed, May 6, 2015 at 1:22 PM, Andres Freund <andres@anarazel.de> wrote: > > That it supports exclusion constraints? > > But so does just naming the index. I don't think it's significant that > exclusion operators are in pg_constraint -- you could just as easily > name the index, since that's all you ultimately end up with anyway. The index name for constraints is generated and not trivially safely guessable for a user. > >> I would care about the fact that you can't name a unique index with no > >> constraint if there wasn't already a way of doing that with inference > >> (I'm thinking in particular of partial indexes here, which never have > >> constraints). But there is. So what's the problem? > > > > Personally I think a complex expression index is something many people > > will not want to specify every time. And since partial/expression > > indexes can't even have constraints... > > The downsides seem severe. A CREATE INDEX CONCURRENTLY just broke > your statement, because you didn't account for the new, equivalent > unique index during inference, and now you have to drop the old index > and break application code. Is that really worth introducing just to > prevent app devs from writing the inference specification? The > specification explicitly documents their intent, which seems like a > good thing. I don't find that all that severe. It might just as well be the case that the new unique constraint isn't intended to be handled by ON CONFLICT. And having a inference specification that's longer than the rest of the statement surely isn't helpful. > Robert really disliked the idea of even naming the constraint, let > alone the index. I'm trying to balance things, here. I fail to see what doing something halfhearted helps. Greetings, Andres Freund
On Wed, May 6, 2015 at 1:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> TARGET is also very descriptive, because it situationally describes >> either the existing tuple actually present in the table, or (from a >> RETURNING clause) the final tuple present in the table post-UPDATE. >> We use the term "target" for that pervasively (in the docs and in the >> code). > > > but I find that totally unconvincing. It's clear that TARGET refers to the > table being upserted, but it's totally unclear on *which* version of the > tuple it refers to. Then we're simply talking about 2 different things. My understanding is that it *is* the relation. And like UPDATE's RETURNING, it will be the same relation/alias but a different tuple here. Andres said this was a mutating tuple or something like that, and I suppose it is. But Vars are variables. Now, Andres (and now you) want to change it so that the TARGET alias becomes magical and expression-like, so that it really does refer to a tuple and not a relation (and so is closer to EXCLUDED.*). And you seem pretty set on that. That being the case, clearly TARGET is unsuitable for the reasons you state. I suppose that it doesn't much matter, but that's how I understood the situation all along. So I can see why you don't like "TARGET" in light of that. I would vote for EXISTING as an alternative, given that it's pretty clear that what is now TARGET.* will become a magic alias/expression thing. EXISTING is the EXISTING tuple, which goes well with EXCLUDED. -- Peter Geoghegan
On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote: > I'll see about fixing that. It's not just a matter of creating another alias > for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that > we attempted to insert, like it does without the ON CONFLICT. I'm not sure what you mean here? > But actually, I don't much like the "target" alias in the first place. We > never really completed this discussion, everyone just got tired: Right. But that doesn't affect the "it's not just a matter of ..." bit above, right? > Reading through this sub-thread, these spellings have been proposed: > > 1. TARGET and EXCLUDED > > 2. NEW and EXISTING > > 3. NEW and OLD > > 4. PROPOSED and EXISTING > > 5. CONFLICTING and EXISTING > > Did I miss any? Now, let me opine on these. How about 6. The tablename and EXCLUDED? Possibility with the ability to specify an AS for INSERT INTO foo AS whatever? From an implementation pov that'd be simple ;) > NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the > version after the UPDATE, and OLD to the version before. However, there's > the serious problem that in a trigger function, OLD/NEW are already in use. > How bad is that? At least in PL/pgSQL you can work around it by aliasing the > variables, but it's a bit inconvenient. How often would INSERT .. ON > CONFLICT DO UPDATE be used in a trigger? I personally think it's a killer. It'll be very annoying to understand mistaken usage of NEW/OLD in that case. Greetings, Andres Freund
On Wed, May 6, 2015 at 2:01 PM, Andres Freund <andres@anarazel.de> wrote: > How about > 6. The tablename and EXCLUDED? Possibility with the ability to specify > an AS for INSERT INTO foo AS whatever? > > From an implementation pov that'd be simple ;) That's what I wanted to do when I realized what Andres wanted to do with the TARGET alias. Clearly that would compel us to actually make the RETURNING clause buy into this alias, just as with a regular UPDATE. And not having the alias on the target also be magical seems like a good thing. Nothing can be broken by this scheme. No? >> NEW and OLD are pretty good. Like in an UPDATE trigger, NEW refers to the >> version after the UPDATE, and OLD to the version before. However, there's >> the serious problem that in a trigger function, OLD/NEW are already in use. >> How bad is that? At least in PL/pgSQL you can work around it by aliasing the >> variables, but it's a bit inconvenient. How often would INSERT .. ON >> CONFLICT DO UPDATE be used in a trigger? > > I personally think it's a killer. It'll be very annoying to understand > mistaken usage of NEW/OLD in that case. +1 -- Peter Geoghegan
On 05/06/2015 11:05 PM, Peter Geoghegan wrote: > On Wed, May 6, 2015 at 7:53 AM, Andres Freund <andres@anarazel.de> wrote: >>> In this variant, you explicitly specify the constraint by name. >> >> I do think it's a bit sad to not be able to specify unique indexes that >> aren't constraints. So I'd like to have a corresponding ON INDEX - which >> would be trivial. > > Then what's the point of having ON CONSTRAINT? The point of it working > that way was we're not exposing the "implementation detail" of the > index. While I happen to think that that's a distinction without a > difference anyway, that certainly was the idea. Right, that's the idea. Indexes are just an implementation detail - conceivably you could have a constraint that's backed by some other mechanism. You should not embed implementation details like index names in your queries. Unfortunately you can't create a "partial constraint" - you'll have to create a partial index. I wish we would fix that directly, by allowing partial unique constraints. That said, I wouldn't necessarily be opposed to also having the syntax to name an index directly, as long as we had some notices in the docs to tell people to avoid it. - Heikki
On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote: > Right, that's the idea. Indexes are just an implementation detail - I think that's a distinction just about no user out there cares about. > Unfortunately you can't create a "partial constraint" - you'll have to > create a partial index. I wish we would fix that directly, by allowing > partial unique constraints. It's not just partial ones, it's also expression ones, right? Greetings, Andres Freund
On 05/07/2015 12:18 AM, Andres Freund wrote: > On 2015-05-07 00:10:22 +0300, Heikki Linnakangas wrote: >> Right, that's the idea. Indexes are just an implementation detail - > > I think that's a distinction just about no user out there cares about. > >> Unfortunately you can't create a "partial constraint" - you'll have to >> create a partial index. I wish we would fix that directly, by allowing >> partial unique constraints. > > It's not just partial ones, it's also expression ones, right? True. - Heikki
On 05/07/2015 12:01 AM, Andres Freund wrote: > On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote: >> I'll see about fixing that. It's not just a matter of creating another alias >> for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that >> we attempted to insert, like it does without the ON CONFLICT. > > I'm not sure what you mean here? Sorry, forget about that. I was confused and mixed up EXCLUDED and TARGET. Looks like they really aren't very good names :-). - Heikki
On 05/07/2015 12:01 AM, Andres Freund wrote:On 2015-05-06 23:48:18 +0300, Heikki Linnakangas wrote:I'll see about fixing that. It's not just a matter of creating another alias
for the same rel, I'm afraid: "foo.t" is supposed to refer to the tuple that
we attempted to insert, like it does without the ON CONFLICT.
I'm not sure what you mean here?
Sorry, forget about that. I was confused and mixed up EXCLUDED and TARGET. Looks like they really aren't very good names :-).
Could
INSERT.column and
CONFLICT.column work? So INSERT is the row that you were inserting, and CONFLICT is the row with which it conflicted?
Geoff
On 05/07/2015 12:01 AM, Andres Freund wrote: > How about > 6. The tablename and EXCLUDED? Possibility with the ability to specify > an AS for INSERT INTO foo AS whatever? > > From an implementation pov that'd be simple ;) I did this, because as you say it's simple to implement, and it resolves the problem with RETURNING. BTW, it's worth noting that the <tablename>.col (or TARGET.col before) means different things in the DO UPDATE clause, and in RETURNING. Consider this example: postgres=# create table foo (id int4 primary key, t text); CREATE TABLE postgres=# insert into foo values (1, 'original'); INSERT 0 1 postgres=# insert into foo values (1, 'inserted') on conflict (id) do update set t = excluded.t || foo.t returning foo.t; t ------------------ insertedoriginal (1 row) In the DO UPDATE, foo.t was 'original', but in the RETURNING, it was 'insertedoriginal'. That's what I was thinking yesterday, when I said that it's not straightforward to just replace "target" with "<tablename>", but got confused. This isn't new, however; it works the same in a normal UPDATE RETURNING. - Heikki
On 05/07/2015 12:01 AM, Andres Freund wrote: > 6. The tablename and EXCLUDED? Possibility with the ability to specify > an AS for INSERT INTO foo AS whatever? If we don't allow "AS whatever", and you create a table called "excluded", you're stuck with the ambiguity in the DO UPDATE statement as you can't alias either one. So we have to add support for "INSERT INTO foo AS whatever", if we go with <tablename> and EXCLUDED. Does anyone see a problem with "INSERT INTO foo AS whatever"? It seems pretty straightforward to implement. If we don't use <tablename> and go back to TARGET/EXCLUDED or whatever we call them, they must only be visible to the UPDATE statement, not RETURNING. Otherwise we're back to square one. - Heikki
On 2015-05-07 16:15:18 +0300, Heikki Linnakangas wrote: > On 05/07/2015 12:01 AM, Andres Freund wrote: > >6. The tablename and EXCLUDED? Possibility with the ability to specify > > an AS for INSERT INTO foo AS whatever? > > If we don't allow "AS whatever", and you create a table called "excluded", > you're stuck with the ambiguity in the DO UPDATE statement as you can't > alias either one. So we have to add support for "INSERT INTO foo AS > whatever", if we go with <tablename> and EXCLUDED. > > Does anyone see a problem with "INSERT INTO foo AS whatever"? It seems > pretty straightforward to implement. I don't see a problem at all, with one exception: If we want the AS to be optional like in a bunch of other places, we have to either promote VALUES to a reserved keyword, only accept unreserved keywords, or play precedence games. I think it'd be perfectly fine to not make AS optional. Greetings, Andres Freund
On Thu, May 7, 2015 at 10:37 AM, Andres Freund <andres@anarazel.de> wrote: > I don't see a problem at all, with one exception: If we want the AS to > be optional like in a bunch of other places, we have to either promote > VALUES to a reserved keyword, only accept unreserved keywords, or play > precedence games. I think it'd be perfectly fine to not make AS > optional. +1. In any case, we can't very well not support having any alias on the target. -- Peter Geoghegan
On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I've read through all the error messages in the patch. Some comments: I'll work through this feedback. > postgres=# create table foo (id int4); > CREATE TABLE > postgres=# create unique index foo_y on foo (id) where id > 0; > CREATE INDEX > postgres=# insert into foo values (-1) on conflict (id) where id > 0 do > nothing; > ERROR: inferred arbiter partial unique index's predicate does not cover > tuple proposed for insertion > DETAIL: ON CONFLICT inference clause implies that the tuple proposed for > insertion must be covered by the predicate of partial index "foo_y". > > I'm surprised. If the inserted value doesn't match the WHERE clause of the > constraint, there is clearly no conflict, so I would assume the above to > work without error. I'm not particularly attached to that behavior. I could revert it. -- Peter Geoghegan
On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > Why not? I would understand if the specified arbiter constraint is deferred, > but why does it matter if one of the other constraints is? This has been fixed since, BTW. We now support relations that happen to have deferred constraints. The constraint indexes cannot be used as arbiters, but that makes perfect sense. -- Peter Geoghegan
On 2015-05-07 11:16:12 -0700, Peter Geoghegan wrote: > On Tue, May 5, 2015 at 8:52 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > postgres=# create table foo (id int4); > > CREATE TABLE > > postgres=# create unique index foo_y on foo (id) where id > 0; > > CREATE INDEX > > postgres=# insert into foo values (-1) on conflict (id) where id > 0 do > > nothing; > > ERROR: inferred arbiter partial unique index's predicate does not cover > > tuple proposed for insertion > > DETAIL: ON CONFLICT inference clause implies that the tuple proposed for > > insertion must be covered by the predicate of partial index "foo_y". > > > > I'm surprised. If the inserted value doesn't match the WHERE clause of the > > constraint, there is clearly no conflict, so I would assume the above to > > work without error. > > I'm not particularly attached to that behavior. I could revert it. Hm. I don't really see a point in allowing it - it seems more likely to be a mistake by the user, expecting that the insertion now works conflict free. But I don't really care much. Greetings, Andres Freund
On Thu, May 7, 2015 at 11:32 AM, Andres Freund <andres@anarazel.de> wrote: > Hm. I don't really see a point in allowing it - it seems more likely to > be a mistake by the user, expecting that the insertion now works > conflict free. But I don't really care much. The only downside I see is that it might not work in preventing the statement from proceeding. In other words, you might have a partial unique index style inference clause (i.e. with a predicate) that infers a non-partial index, because everything else matches, and no predicate (on the index) satisfies the inference predicate. So it comes down to the actual definition of indexes, as opposed to the statement that inferred those indexes, which isn't quite the same thing. It is hardly really a defect at all, but it is something I have considered as a little confusing. However, I doubt that many people will ever find themselves looking for an explanation for this. -- Peter Geoghegan
On Thu, May 7, 2015 at 2:14 PM, Peter Geoghegan <pg@heroku.com> wrote: > The only downside I see is that it might not work in preventing the > statement from proceeding. In other words, you might have a partial > unique index style inference clause (i.e. with a predicate) that > infers a non-partial index, because everything else matches, and no > predicate (on the index) satisfies the inference predicate. So it > comes down to the actual definition of indexes, as opposed to the > statement that inferred those indexes, which isn't quite the same > thing. Hmm. I think it might be inconsistent with our position on NULL values with this feature - which is that insertion will always proceed - to deny insertion from proceeding here. On reflection, it seems like a bit of a POLA violation. So I'm going to remove the error. -- Peter Geoghegan
I don't see a problem at all, with one exception: If we want the AS to
be optional like in a bunch of other places, we have to either promote
VALUES to a reserved keyword, only accept unreserved keywords, or play
precedence games. I think it'd be perfectly fine to not make AS
optional.
Although I've always used "AS" in all contexts
because I think the language is horribly
unclear without it, it seems obtuse to allow its absence
in the SQL-conforming parts of the language and not elsewhere
.
Is anyone really using VALUES as a non-keyword? It's reserved in all the SQL standards, which seems like storing up trouble for anyone using it otherwise.
Geoff