Обсуждение: Support UPDATE table SET(*)=...

Поиск
Список
Период
Сортировка

Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:
Hi All,

Please find attached a patch which implements support for UPDATE table1 SET(*)=...
The patch supports both UPDATE table SET(*)=(a,b,c) and UPDATE table1 SET(*)=(SELECT a,b,c FROM...).
It solves the problem of doing UPDATE from a record variable of the same type as the table e.g. update foo set (*) = (select foorec.*) where ...;

The design is simple. It basically expands the * in transformation stage, does the necessary type checking and adds it to the parse tree. This allows for normal execution for the rest of the stages.


Thoughts/Comments?

Regards,

Atri
Вложения

Re: Support UPDATE table SET(*)=...

От
Marti Raudsepp
Дата:
Hi

On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...

I presume you haven't read Tom Lane's proposal and discussion about
multiple column assignment in UPDATE:
http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
(Assigning all columns was also discussed there)

And there's a WIP patch:
http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us

Regards,
Marti



Re: Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:


On Wednesday, October 15, 2014, Marti Raudsepp <marti@juffo.org> wrote:
Hi

On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...

I presume you haven't read Tom Lane's proposal and discussion about
multiple column assignment in UPDATE:
http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
(Assigning all columns was also discussed there)

And there's a WIP patch:
http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us



Thanks for the links, but this patch only targets SET(*) case, which, if I understand correctly, the patch you mentioned doesn't directly handle (If I understand correctly, the target of the two patches is different). 

Regards,

Atri

--
Regards,
 
Atri
l'apprenant

Re: Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:


On Wed, Oct 15, 2014 at 2:18 PM, Atri Sharma <atri.jiit@gmail.com> wrote:





On Wednesday, October 15, 2014, Marti Raudsepp <marti@juffo.org> wrote:
Hi

On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...

I presume you haven't read Tom Lane's proposal and discussion about
multiple column assignment in UPDATE:
http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
(Assigning all columns was also discussed there)

And there's a WIP patch:
http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us



Thanks for the links, but this patch only targets SET(*) case, which, if I understand correctly, the patch you mentioned doesn't directly handle (If I understand correctly, the target of the two patches is different). 


Digging more, I figured that the patch I posted builds on top of Tom's patch,  since it did not add whole row cases.

Regards,

Atri

Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
>
>
> On Wednesday, October 15, 2014, Marti Raudsepp <marti@juffo.org> wrote:
>>
>> Hi
>>
>> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
>> > Please find attached a patch which implements support for UPDATE table1
>> > SET(*)=...
>>
>> I presume you haven't read Tom Lane's proposal and discussion about
>> multiple column assignment in UPDATE:
>> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
>> (Assigning all columns was also discussed there)
>>
>> And there's a WIP patch:
>> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
>
> Thanks for the links, but this patch only targets SET(*) case, which, if I
> understand correctly, the patch you mentioned doesn't directly handle (If I
> understand correctly, the target of the two patches is different).

Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL.  I'm
not sure about the proposed syntax though; it seems a little weird to
me.  Any particular reason why you couldn't have just done:

UPDATE table1 SET * = a,b,c, ...

also,

UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

seems cleaner than the proposed syntax for row assignment.  Tom
objected though IIRC.

merlin



Re: Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:


On Fri, Oct 17, 2014 at 7:45 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
>
>
> On Wednesday, October 15, 2014, Marti Raudsepp <marti@juffo.org> wrote:
>>
>> Hi
>>
>> On Wed, Oct 15, 2014 at 11:02 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
>> > Please find attached a patch which implements support for UPDATE table1
>> > SET(*)=...
>>
>> I presume you haven't read Tom Lane's proposal and discussion about
>> multiple column assignment in UPDATE:
>> http://www.postgresql.org/message-id/1783.1399054541@sss.pgh.pa.us
>> (Assigning all columns was also discussed there)
>>
>> And there's a WIP patch:
>> http://www.postgresql.org/message-id/20930.1402931841@sss.pgh.pa.us
>
> Thanks for the links, but this patch only targets SET(*) case, which, if I
> understand correctly, the patch you mentioned doesn't directly handle (If I
> understand correctly, the target of the two patches is different).

Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL. 

Thanks!
 
I'm
not sure about the proposed syntax though; it seems a little weird to
me.  Any particular reason why you couldn't have just done:

UPDATE table1 SET * = a,b,c, ...

also,

UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

I honestly have not spent a lot of time thinking about the exact syntax that may be acceptable. If we have arguments for or against a specific syntax, I will be glad to incorporate them. 

Re: Support UPDATE table SET(*)=...

От
Marko Tiikkaja
Дата:
On 10/17/14 4:15 PM, Merlin Moncure wrote:
> Any particular reason why you couldn't have just done:
>
> UPDATE table1 SET * = a,b,c, ...

That just looks wrong to me.  I'd prefer  (*) = ..  over that any day.

> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
>
> seems cleaner than the proposed syntax for row assignment.  Tom
> objected though IIRC.

I don't know about Tom, but I didn't like that: 
http://www.postgresql.org/message-id/5364C982.7060003@joh.to


.marko



Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko@joh.to> wrote:
> On 10/17/14 4:15 PM, Merlin Moncure wrote:
>>
>> Any particular reason why you couldn't have just done:
>>
>> UPDATE table1 SET * = a,b,c, ...
>
>
> That just looks wrong to me.  I'd prefer  (*) = ..  over that any day.
>
>> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
>>
>> seems cleaner than the proposed syntax for row assignment.  Tom
>> objected though IIRC.
>
> I don't know about Tom, but I didn't like that:
> http://www.postgresql.org/message-id/5364C982.7060003@joh.to

Hm, I didn't understand your objection:

<quoting>
So e.g.:  UPDATE foo f SET f = ..;

would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>

That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.

merlin



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Merlin Moncure <mmoncure@gmail.com> writes:
> On Wed, Oct 15, 2014 at 3:48 AM, Atri Sharma <atri.jiit@gmail.com> wrote:
>> Thanks for the links, but this patch only targets SET(*) case, which, if I
>> understand correctly, the patch you mentioned doesn't directly handle (If I
>> understand correctly, the target of the two patches is different).

> Yeah -- in fact, there was some discussion about this exact case.
> This patch solves a very important problem: when doing record
> operations to move data between databases with identical schema
> there's currently no way to 'update' in a generic way without building
> out the entire field list via complicated and nasty dynamic SQL.  I'm
> not sure about the proposed syntax though; it seems a little weird to
> me.  Any particular reason why you couldn't have just done:

> UPDATE table1 SET * = a,b,c, ...

> also,

> UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

> seems cleaner than the proposed syntax for row assignment.  Tom
> objected though IIRC.

That last proposal is no good because it would be ambiguous if the
table contains a column by that name.  The "(*)" idea actually seems
not bad, since it's shorthand for a parenthesized column list.

I'm not sure about the patch itself though --- in particular, it
doesn't seem to me that it should be touching transformTargetList,
since that doesn't have anything to do with expansion of multiassignments
today.  Probably this is a symptom of having chosen a poor representation
of the construct in the raw grammar output.  However, I've not exactly
wrapped my head around what that representation is ... the lack of any
comments explaining it doesn't help.

A larger question is whether it's appropriate to do the *-expansion
at parse time, or whether we'd need to postpone it to later in order
to handle reasonable use-cases.  Since we expand "SELECT *" at parse
time (and are mandated to do so by the SQL spec, I believe), it seems
consistent to do this at parse time as well; but perhaps there is a
counter argument.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Marko Tiikkaja
Дата:
On 10/17/14 5:03 PM, Merlin Moncure wrote:
> Hm, I didn't understand your objection:
>
> <quoting>
> So e.g.:
>     UPDATE foo f SET f = ..;
>
> would resolve to the table, despite there being a column called "f"?
> That would break backwards compatibility.
> </quoting>
>
> That's not correct: it should work exactly as 'select' does; given a
> conflict resolve the field name, so no backwards compatibility issue.

local:marko=# show server_version; server_version
---------------- 9.1.13
(1 row)

local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0

This query would change meaning with your suggestion.

I'm not saying it would be a massive problem in practice, but I think we 
should first consider options which don't break backwards compatibility, 
even if some consider them "less clean".


.marko



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Merlin Moncure <mmoncure@gmail.com> writes:
> On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko@joh.to> wrote:
>> I don't know about Tom, but I didn't like that:
>> http://www.postgresql.org/message-id/5364C982.7060003@joh.to

> Hm, I didn't understand your objection:

> <quoting>
> So e.g.:
>    UPDATE foo f SET f = ..;

> would resolve to the table, despite there being a column called "f"?
> That would break backwards compatibility.
> </quoting>

> That's not correct: it should work exactly as 'select' does; given a
> conflict resolve the field name, so no backwards compatibility issue.

The point is that it's fairly messy (and bug-prone) to have a syntax
where we have to make an arbitrary choice between two reasonable
interpretations.

If you look back at the whole thread Marko's above-cited message is in,
we'd considered a bunch of different possible syntaxes for this, and
none of them had much support.  The "(*)" idea actually is starting to
look pretty good to me.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Fri, Oct 17, 2014 at 10:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> On Fri, Oct 17, 2014 at 9:55 AM, Marko Tiikkaja <marko@joh.to> wrote:
>>> I don't know about Tom, but I didn't like that:
>>> http://www.postgresql.org/message-id/5364C982.7060003@joh.to
>
>> Hm, I didn't understand your objection:
>
>> <quoting>
>> So e.g.:
>>    UPDATE foo f SET f = ..;
>
>> would resolve to the table, despite there being a column called "f"?
>> That would break backwards compatibility.
>> </quoting>
>
>> That's not correct: it should work exactly as 'select' does; given a
>> conflict resolve the field name, so no backwards compatibility issue.
>
> The point is that it's fairly messy (and bug-prone) to have a syntax
> where we have to make an arbitrary choice between two reasonable
> interpretations.
>
> If you look back at the whole thread Marko's above-cited message is in,
> we'd considered a bunch of different possible syntaxes for this, and
> none of them had much support.  The "(*)" idea actually is starting to
> look pretty good to me.

Hm, I'll take it then.

merlin



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Marko Tiikkaja <marko@joh.to> writes:
> local:marko=#* create table foo(f int);
> CREATE TABLE
> local:marko=#* update foo f set f=1;
> UPDATE 0

> This query would change meaning with your suggestion.

I think it wouldn't; Merlin is proposing that f would be taken as the
field name.  A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2);  -- works
alter table foo add column x int;
update foo x set x = (1,2,3);  -- no longer works

It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marko Tiikkaja <marko@joh.to> writes:
>> local:marko=#* create table foo(f int);
>> CREATE TABLE
>> local:marko=#* update foo f set f=1;
>> UPDATE 0
>
>> This query would change meaning with your suggestion.
>
> I think it wouldn't; Merlin is proposing that f would be taken as the
> field name.  A more realistic objection goes like this:
>
> create table foo(f int, g int);
> update foo x set x = (1,2);  -- works
> alter table foo add column x int;
> update foo x set x = (1,2,3);  -- no longer works
>
> It's not a real good thing if a column addition or renaming can
> so fundamentally change the nature of a query.

That's exactly how SELECT works.  I also dispute that the user should
be surprised in such cases.

merlin



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Merlin Moncure <mmoncure@gmail.com> writes:
> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think it wouldn't; Merlin is proposing that f would be taken as the
>> field name.  A more realistic objection goes like this:
>> 
>> create table foo(f int, g int);
>> update foo x set x = (1,2);  -- works
>> alter table foo add column x int;
>> update foo x set x = (1,2,3);  -- no longer works
>> 
>> It's not a real good thing if a column addition or renaming can
>> so fundamentally change the nature of a query.

> That's exactly how SELECT works.  I also dispute that the user should
> be surprised in such cases.

Well, the reason we have a problem in SELECT is that we support an
unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
"SELECT foo FROM foo" could represent a whole-row selection is nowhere
to be found in the SQL standard, for good reason IMO.  But we've never
had the courage to break cleanly with this Berkeley leftover and
insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
I'd just as soon not introduce the same kind of ambiguity into UPDATE
if we have a reasonable alternative.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think it wouldn't; Merlin is proposing that f would be taken as the
>>> field name.  A more realistic objection goes like this:
>>>
>>> create table foo(f int, g int);
>>> update foo x set x = (1,2);  -- works
>>> alter table foo add column x int;
>>> update foo x set x = (1,2,3);  -- no longer works
>>>
>>> It's not a real good thing if a column addition or renaming can
>>> so fundamentally change the nature of a query.
>
>> That's exactly how SELECT works.  I also dispute that the user should
>> be surprised in such cases.
>
> Well, the reason we have a problem in SELECT is that we support an
> unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
> "SELECT foo FROM foo" could represent a whole-row selection is nowhere
> to be found in the SQL standard, for good reason IMO.  But we've never
> had the courage to break cleanly with this Berkeley leftover and
> insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
> I'd just as soon not introduce the same kind of ambiguity into UPDATE
> if we have a reasonable alternative.

Ah, interesting point (I happen to like the terse syntax and use it
often).  This is for posterity anyways since you guys seem to like
Atri's proposal, which surprised me.  However, I think you're over
simplifying things here.  Syntax aside: I think
SELECT f FROM foo f;
and a hypothetical
SELECT row(f.*) FROM foo f;

give different semantics.  The former gives an object of type 'f' and
the latter gives type 'row'.  To get parity, you'd have to add an
extra cast which means you'd have to play tricky games to avoid extra
performance overhead besides being significantly more verbose.  I'm
aware some of the other QUELisms are pretty dodgy and have burned us
in the past (like that whole function as record reference thing) but
pulling a record as a field in select is pretty nice.  It's also
widely used and quite useful in json serialization.

merlin



Re: Support UPDATE table SET(*)=...

От
Marti Raudsepp
Дата:
<p dir="ltr"><br /> On Oct 17, 2014 6:16 PM, "Tom Lane" <<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br /> > A more realistic objection goes like
this:<br/> ><br /> > create table foo(f int, g int);<br /> > update foo x set x = (1,2);  -- works<br /> >
altertable foo add column x int;<br /> > update foo x set x = (1,2,3);  -- no longer works<br /> ><br /> >
It'snot a real good thing if a column addition or renaming can<br /> > so fundamentally change the nature of a
query.<pdir="ltr">I think a significant use case for this feature is when you already have a row-value and want to
persistit in the database, like you can do with INSERT:<br /> insert into foo select * from
populate_record_json(null::foo,'{...}');<p dir="ltr">In this case the opposite is true: requiring explicit column names
wouldbreak the query if you add columns to the table. The fact that you can't reasonably use populate_record/_json with
UPDATEis a significant omission. IMO this really speaks for supporting shorthand whole-row assignment, whatever the
syntax.<pdir="ltr">Regards,<br /> Marti 

Re: Support UPDATE table SET(*)=...

От
Merlin Moncure
Дата:
On Fri, Oct 17, 2014 at 10:47 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Fri, Oct 17, 2014 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Merlin Moncure <mmoncure@gmail.com> writes:
>>> On Fri, Oct 17, 2014 at 10:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I think it wouldn't; Merlin is proposing that f would be taken as the
>>>> field name.  A more realistic objection goes like this:
>>>>
>>>> create table foo(f int, g int);
>>>> update foo x set x = (1,2);  -- works
>>>> alter table foo add column x int;
>>>> update foo x set x = (1,2,3);  -- no longer works
>>>>
>>>> It's not a real good thing if a column addition or renaming can
>>>> so fundamentally change the nature of a query.
>>
>>> That's exactly how SELECT works.  I also dispute that the user should
>>> be surprised in such cases.
>>
>> Well, the reason we have a problem in SELECT is that we support an
>> unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
>> "SELECT foo FROM foo" could represent a whole-row selection is nowhere
>> to be found in the SQL standard, for good reason IMO.  But we've never
>> had the courage to break cleanly with this Berkeley leftover and
>> insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
>> I'd just as soon not introduce the same kind of ambiguity into UPDATE
>> if we have a reasonable alternative.
>
> Ah, interesting point (I happen to like the terse syntax and use it
> often).  This is for posterity anyways since you guys seem to like
> Atri's proposal, which surprised me.  However, I think you're over
> simplifying things here.  Syntax aside: I think
> SELECT f FROM foo f;
> and a hypothetical
> SELECT row(f.*) FROM foo f;
>
> give different semantics.  The former gives an object of type 'f' and
> the latter gives type 'row'.  To get parity, you'd have to add an
> extra cast which means you'd have to play tricky games to avoid extra
> performance overhead besides being significantly more verbose.  I'm
> aware some of the other QUELisms are pretty dodgy and have burned us
> in the past (like that whole function as record reference thing) but
> pulling a record as a field in select is pretty nice.  It's also
> widely used and quite useful in json serialization.

Been thinking about this in the back of my mind the last couple of
days.  There are some things you can do with the QUEL 'table as
column' in select syntax that are impossible otherwise, at least
today, and its usage is going to proliferate because of that.  Row
construction via () or row() needs to be avoided whenever the column
names are important and there is no handy type to cast to. For
example, especially during json serialization it's convenient to do
things like:

select array_agg((select q from (select a, b) q) order by ...)
from foo;

...where a,b are fields of foo.  FWICT, this is the only way to do
json serialization of arbitrary runtime row constructions in a way
that does not anonymize the type.  Until I figured out this trick I
used to create lots of composite types that served no purpose other
than to give me a type to cast to which is understandably annoying.

if:
select (x).* from (select (1, 2) as x) q;

worked and properly expanded x to given names should they exist and:
SELECT row(f.*) FROM foo f;

worked and did same, and:
SELECT (row(f.*)).* FROM foo f;

was as reasonably performant and gave the same results as:
SELECT (f).* FROM foo f;

...then IMSNHO you'd have a reasonable path to deprecating the QUEL
inspired syntax.  Food for thought.

merlin



Re: Support UPDATE table SET(*)=...

От
Marko Tiikkaja
Дата:
Hi Atri,

Sorry for the delay.  With pgconf.eu and all, it's been very hard to 
find the time to look at this.

On 10/15/14, 10:02 AM, Atri Sharma wrote:
> Please find attached a patch which implements support for UPDATE table1
> SET(*)=...
> The patch supports both UPDATE table SET(*)=(a,b,c) and UPDATE table1
> SET(*)=(SELECT a,b,c FROM...).
> It solves the problem of doing UPDATE from a record variable of the same
> type as the table e.g. update foo set (*) = (select foorec.*) where ...;

Excellent!  This is a very welcome change.

I've had a few looks at this patch and I have a few comments:
  1) This doesn't work for the zero-column table case at all:       CREATE TABLE foo();       UPDATE foo SET (*) =
(SELECT);      ERROR:  number of columns does not match number of values  2) What's the purpose of the second condition
here?      if (!(origTarget) || !(origTarget->name))  3) The extra parentheses around everything make this code for
some
 
reason very hard to read.  4) transformTargetList() is a mess right now.  If this is the 
approach we want to take, the common code should probably be refactored 
into a function.  But the usage of List as a somehow magical way to 
represent the SET (*) case makes me feel weird inside.  5) The complete lack of regression tests make it hard to poke
around
 
the code to try and figure out what each line/condition is trying to do.

I feel like I understand what this code is doing and some details feel a 
bit icky, but I'm not the right person to comment on whether the broad 
strokes are on the right canvas or not.  Maybe someone else wants to 
take a closer look before Atri spends too much time on this approach? 
After all, this patch has a very distinctive WIP feel to it, so I guess 
feedback on the general approach is what's being sought after here, and 
in that area I consider my skills and knowledge lacking.

> The design is simple. It basically expands the * in transformation stage,
> does the necessary type checking and adds it to the parse tree. This allows
> for normal execution for the rest of the stages.

I can't poke any big holes into this approach (disregarding the details 
of this implementation), but perhaps someone else can?



.marko



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Marko Tiikkaja <marko@joh.to> writes:
> On 10/15/14, 10:02 AM, Atri Sharma wrote:
>> Please find attached a patch which implements support for UPDATE table1
>> SET(*)=...

> I've had a few looks at this patch and I have a few comments:

>    1) This doesn't work for the zero-column table case at all:
>         CREATE TABLE foo();
>         UPDATE foo SET (*) = (SELECT);
>         ERROR:  number of columns does not match number of values
>    2) What's the purpose of the second condition here?
>         if (!(origTarget) || !(origTarget->name))
>    3) The extra parentheses around everything make this code for some 
> reason very hard to read.
>    4) transformTargetList() is a mess right now.  If this is the 
> approach we want to take, the common code should probably be refactored 
> into a function.  But the usage of List as a somehow magical way to 
> represent the SET (*) case makes me feel weird inside.
>    5) The complete lack of regression tests make it hard to poke around 
> the code to try and figure out what each line/condition is trying to do.

> I feel like I understand what this code is doing and some details feel a 
> bit icky, but I'm not the right person to comment on whether the broad 
> strokes are on the right canvas or not.  Maybe someone else wants to 
> take a closer look before Atri spends too much time on this approach? 

FWIW, I opined upthread that transformTargetList was not the place to
be handling this; it should be done in the same place(s) that support
the existing MultiAssignRef feature, and transformTargetList is not
that.

I think what's likely missing here is a clear design for the raw parse
tree representation (what's returned by the bison grammar).  The patch
seems to be trying to skate by without creating any new parse node types
or fields, but that may well be a bad idea.  At the very least there
needs to be some commentary added to parsenodes.h explaining what the
representation actually is (cf commentary there for MultiAssignRef).

Also, I think it's a mistake not to be following the MultiAssignRef
model for the case of "(*) = ctext_row".  We optimize the ROW-source
case at the grammar stage when there's a fixed number of target columns,
because that's a very easy transformation --- but you can't do it like
that when there's not.  It's possible that that optimization should be
delayed till later even in the existing case; in general, optimizing
in gram.y is a bad habit that's better avoided ...
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Michael Paquier
Дата:
On Wed, Nov 26, 2014 at 4:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think what's likely missing here is a clear design for the raw parse
> tree representation (what's returned by the bison grammar).  The patch
> seems to be trying to skate by without creating any new parse node types
> or fields, but that may well be a bad idea.  At the very least there
> needs to be some commentary added to parsenodes.h explaining what the
> representation actually is (cf commentary there for MultiAssignRef).
>
> Also, I think it's a mistake not to be following the MultiAssignRef
> model for the case of "(*) = ctext_row".  We optimize the ROW-source
> case at the grammar stage when there's a fixed number of target columns,
> because that's a very easy transformation --- but you can't do it like
> that when there's not.  It's possible that that optimization should be
> delayed till later even in the existing case; in general, optimizing
> in gram.y is a bad habit that's better avoided ...
Marking as "returned with feedback" based on those comments.
-- 
Michael



Re: Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:


On Mon, Dec 8, 2014 at 6:06 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Wed, Nov 26, 2014 at 4:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think what's likely missing here is a clear design for the raw parse
> tree representation (what's returned by the bison grammar).  The patch
> seems to be trying to skate by without creating any new parse node types
> or fields, but that may well be a bad idea.  At the very least there
> needs to be some commentary added to parsenodes.h explaining what the
> representation actually is (cf commentary there for MultiAssignRef).
>
> Also, I think it's a mistake not to be following the MultiAssignRef
> model for the case of "(*) = ctext_row".  We optimize the ROW-source
> case at the grammar stage when there's a fixed number of target columns,
> because that's a very easy transformation --- but you can't do it like
> that when there's not.  It's possible that that optimization should be
> delayed till later even in the existing case; in general, optimizing
> in gram.y is a bad habit that's better avoided ...
Marking as "returned with feedback" based on those comments.


Thank you everybody for your comments.

I am indeed trying to avoid a new node since my intuitive feeling says that we do not need a new node for a functionality which is present in other commands albeit in different form. However, I agree that documentation explaining parse representation is lacking and shall improve that. Also, in accordance to Tom's comments, I shall look for not touching target list directly and follow the path of MultiAssignRef.

I have moved patch to current CF and marked it as Waiting on Author since I plan to resubmit after addressing the concerns.

Regards,

Atri


--
Regards,
 
Atri
l'apprenant

Re: Support UPDATE table SET(*)=...

От
Michael Paquier
Дата:
On Mon, Dec 15, 2014 at 7:50 PM, Atri Sharma <atri.jiit@gmail.com> wrote:
> I have moved patch to current CF and marked it as Waiting on Author since I
> plan to resubmit after addressing the concerns.
Nothing happened in the last month, so marking as returned with feedback.
-- 
Michael



Re: Support UPDATE table SET(*)=...

От
Atri Sharma
Дата:
Hi all,

Sorry for the delay.

Please find attached latest version of UPDATE (*) patch.The patch implements review comments and Tom's gripe about touching transformTargetList is addressed now. I have added regression tests and simplified parser representation a bit.

Regards,

Atri
Вложения

Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Atri Sharma <atri.jiit@gmail.com> writes:
> Please find attached latest version of UPDATE (*) patch.The patch
> implements review comments and Tom's gripe about touching
> transformTargetList is addressed now. I have added regression tests and
> simplified parser representation a bit.

I spent a fair amount of time cleaning this patch up to get it into
committable shape, but as I was working on the documentation I started
to lose enthusiasm for it, because I was having a hard time coming up
with compelling examples.  The originally proposed motivation was

>> It solves the problem of doing UPDATE from a record variable of the same
>> type as the table e.g. update foo set (*) = (select foorec.*) where ...;

but it feels to me that this is not actually a very good solution to that
problem --- even granting that the problem needs to be solved, a claim
that still requires some justification IMO.  Here is a possible use-case:

regression=# create table src (f1 int, f2 text, f3 real);
CREATE TABLE
regression=# create table dst (f1 int, f2 text, f3 real);
CREATE TABLE
regression=# create rule r1 as on update to src do instead
regression-# update dst set (*) = (new.*) where dst.f1 = old.f1;
ERROR:  number of columns does not match number of values
LINE 2: update dst set (*) = (new.*) where dst.f1 = old.f1;
                              ^

So that's annoying.  You can work around it with this very unobvious
(and undocumented) syntax hack:

regression=# create rule r1 as on update to src do instead
regression-# update dst set (*) = (select new.*) where dst.f1 = old.f1;
CREATE RULE

But what happens if dst has no matching row?  Your data goes into the
bit bucket, that's what.  What you really wanted here was some kind of
UPSERT.  There's certainly a possible use-case for a notation like this
in UPSERT, when we get around to implementing that; but I'm less than
convinced that we need it in plain UPDATE.

What's much worse though is that the rule that actually gets stored is:

regression=# \d+ src
                          Table "public.src"
 Column |  Type   | Modifiers | Storage  | Stats target | Description
--------+---------+-----------+----------+--------------+-------------
 f1     | integer |           | plain    |              |
 f2     | text    |           | extended |              |
 f3     | real    |           | plain    |              |
Rules:
    r1 AS
    ON UPDATE TO src DO INSTEAD  UPDATE dst SET (f1, f2, f3) = ( SELECT new.f1,
            new.f2,
            new.f3)
  WHERE dst.f1 = old.f1

That is, you don't actually have any protection at all against future
additions of columns, which seems like it would be most of the point
of using a notation like this.

We could possibly address that by postponing expansion of the *-notation
to rule rewrite time, but that seems like it'd be a complete disaster in
terms of modularity, or even usability (since column-mismatch errors
wouldn't get raised till then either).  And it'd certainly be a far more
invasive and complex patch than this.

So I'm feeling that this may not be a good idea, or at least not a good
implementation of the idea.  I'm inclined to reject the patch rather than
lock us into something that is not standard and doesn't really do what
people would be likely to want.

Attached is the updated patch that I had before arriving at this
discouraging conclusion.

            regards, tom lane

diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 35b0699..adb4473 100644
*** a/doc/src/sgml/ref/update.sgml
--- b/doc/src/sgml/ref/update.sgml
*************** PostgreSQL documentation
*** 25,31 ****
  UPDATE [ ONLY ] <replaceable class="PARAMETER">table_name</replaceable> [ * ] [ [ AS ] <replaceable
class="parameter">alias</replaceable>] 
      SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable
class="PARAMETER">expression</replaceable>| DEFAULT } | 
            ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable
class="PARAMETER">expression</replaceable>| DEFAULT } [, ...] ) | 
!           ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable
class="PARAMETER">sub-SELECT</replaceable>) 
          } [, ...]
      [ FROM <replaceable class="PARAMETER">from_list</replaceable> ]
      [ WHERE <replaceable class="PARAMETER">condition</replaceable> | WHERE CURRENT OF <replaceable
class="PARAMETER">cursor_name</replaceable>] 
--- 25,33 ----
  UPDATE [ ONLY ] <replaceable class="PARAMETER">table_name</replaceable> [ * ] [ [ AS ] <replaceable
class="parameter">alias</replaceable>] 
      SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable
class="PARAMETER">expression</replaceable>| DEFAULT } | 
            ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable
class="PARAMETER">expression</replaceable>| DEFAULT } [, ...] ) | 
!           ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( <replaceable
class="PARAMETER">sub-SELECT</replaceable>) | 
!           (*) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) |
!           (*) = ( <replaceable class="PARAMETER">sub-SELECT</replaceable> )
          } [, ...]
      [ FROM <replaceable class="PARAMETER">from_list</replaceable> ]
      [ WHERE <replaceable class="PARAMETER">condition</replaceable> | WHERE CURRENT OF <replaceable
class="PARAMETER">cursor_name</replaceable>] 
*************** UPDATE [ ONLY ] <replaceable class="PARA
*** 164,169 ****
--- 166,181 ----
     </varlistentry>

     <varlistentry>
+     <term><literal>(*)</literal></term>
+     <listitem>
+      <para>
+       In the <literal>SET</> clause, <literal>(*)</literal> stands for a
+       parenthesized column list of all the table's columns, in order.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
      <term><replaceable class="PARAMETER">from_list</replaceable></term>
      <listitem>
       <para>
*************** UPDATE summary s SET (sum_x, sum_y, avg_
*** 374,379 ****
--- 386,398 ----
    </para>

    <para>
+    Update complete rows from another table having the same set of columns:
+ <programlisting>
+ UPDATE dest_table d SET (*) = (SELECT * FROM src_table s WHERE s.id = d.id);
+ </programlisting>
+   </para>
+
+   <para>
     Attempt to insert a new stock item along with the quantity of stock. If
     the item already exists, instead update the stock count of the existing
     item. To do this without failing the entire transaction, use savepoints:
*************** UPDATE films SET kind = 'Dramatic' WHERE
*** 407,413 ****
     This command conforms to the <acronym>SQL</acronym> standard, except
     that the <literal>FROM</literal> and <literal>RETURNING</> clauses
     are <productname>PostgreSQL</productname> extensions, as is the ability
!    to use <literal>WITH</> with <command>UPDATE</>.
    </para>

    <para>
--- 426,434 ----
     This command conforms to the <acronym>SQL</acronym> standard, except
     that the <literal>FROM</literal> and <literal>RETURNING</> clauses
     are <productname>PostgreSQL</productname> extensions, as is the ability
!    to use <literal>WITH</> with <command>UPDATE</>.  Also, the <literal>(*)</>
!    syntax to specify all columns as a multiple-assignment target is a
!    <productname>PostgreSQL</productname> extension.
    </para>

    <para>
*************** UPDATE films SET kind = 'Dramatic' WHERE
*** 419,427 ****
    </para>

    <para>
!    According to the standard, the source value for a parenthesized sub-list of
!    column names can be any row-valued expression yielding the correct number
!    of columns.  <productname>PostgreSQL</productname> only allows the source
     value to be a parenthesized list of expressions (a row constructor) or a
     sub-<literal>SELECT</>.  An individual column's updated value can be
     specified as <literal>DEFAULT</> in the row-constructor case, but not
--- 440,449 ----
    </para>

    <para>
!    According to the standard, the source value for a multiple-assignment
!    target (that is, a parenthesized sub-list of column names) can be any
!    row-valued expression yielding the correct number of
!    columns.  <productname>PostgreSQL</productname> only allows the source
     value to be a parenthesized list of expressions (a row constructor) or a
     sub-<literal>SELECT</>.  An individual column's updated value can be
     specified as <literal>DEFAULT</> in the row-constructor case, but not
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 4a5a520..901bd14 100644
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** transformUpdateStmt(ParseState *pstate,
*** 1899,1906 ****
--- 1899,1908 ----
  {
      Query       *qry = makeNode(Query);
      ParseNamespaceItem *nsitem;
+     Relation    target_rel;
      RangeTblEntry *target_rte;
      Node       *qual;
+     List       *targetList;
      ListCell   *origTargetList;
      ListCell   *tl;

*************** transformUpdateStmt(ParseState *pstate,
*** 1919,1924 ****
--- 1921,1927 ----
                                    interpretInhOption(stmt->relation->inhOpt),
                                           true,
                                           ACL_UPDATE);
+     target_rel = pstate->p_target_relation;

      /* grab the namespace item made by setTargetTable */
      nsitem = (ParseNamespaceItem *) llast(pstate->p_namespace);
*************** transformUpdateStmt(ParseState *pstate,
*** 1937,1943 ****
      nsitem->p_lateral_only = false;
      nsitem->p_lateral_ok = true;

!     qry->targetList = transformTargetList(pstate, stmt->targetList,
                                            EXPR_KIND_UPDATE_SOURCE);

      qual = transformWhereClause(pstate, stmt->whereClause,
--- 1940,2025 ----
      nsitem->p_lateral_only = false;
      nsitem->p_lateral_ok = true;

!     /*
!      * Before starting the main processing of the target list, we must expand
!      * any "(*)" multicolumn target clauses, replacing them with the actual
!      * column names of the result relation.  Since most UPDATEs won't use that
!      * feature, make a quick check to see if it's used and avoid expending the
!      * cycles to rebuild the target list if not.
!      */
!     targetList = stmt->targetList;
!
!     foreach(tl, targetList)
!     {
!         ResTarget  *res = (ResTarget *) lfirst(tl);
!
!         if (res->name == NULL)
!             break;
!     }
!     if (tl)
!     {
!         /* Yes, we have "(*)" ... expand the targetlist */
!         List       *newTargetList = NIL;
!
!         foreach(tl, targetList)
!         {
!             ResTarget  *res = (ResTarget *) lfirst(tl);
!             int            i,
!                         j,
!                         ncolumns;
!
!             /* If not "(*)", just keep it in the list */
!             if (res->name != NULL)
!             {
!                 newTargetList = lappend(newTargetList, res);
!                 continue;
!             }
!
!             /*
!              * Expand "(*)" into actual column names.  First we must count the
!              * number of non-dropped target columns.
!              */
!             ncolumns = 0;
!             for (i = 0; i < target_rel->rd_rel->relnatts; i++)
!             {
!                 Form_pg_attribute att = target_rel->rd_att->attrs[i];
!
!                 if (att->attisdropped)
!                     continue;
!
!                 ncolumns++;
!             }
!             j = 0;
!             for (i = 0; i < target_rel->rd_rel->relnatts; i++)
!             {
!                 Form_pg_attribute att = target_rel->rd_att->attrs[i];
!                 MultiAssignRef *r;
!                 ResTarget  *newres;
!
!                 if (att->attisdropped)
!                     continue;
!
!                 /* Make a MultiAssignRef for each column */
!                 r = makeNode(MultiAssignRef);
!                 r->source = res->val;
!                 r->colno = ++j;
!                 r->ncolumns = ncolumns;
!
!                 /* ... and a ResTarget */
!                 newres = makeNode(ResTarget);
!                 newres->name = pstrdup(NameStr(att->attname));
!                 newres->indirection = NIL;
!                 newres->val = (Node *) r;
!                 newres->location = res->location;
!
!                 newTargetList = lappend(newTargetList, newres);
!             }
!         }
!         targetList = newTargetList;
!     }
!
!     /* Now we can run the main processing of the target list */
!     qry->targetList = transformTargetList(pstate, targetList,
                                            EXPR_KIND_UPDATE_SOURCE);

      qual = transformWhereClause(pstate, stmt->whereClause,
*************** transformUpdateStmt(ParseState *pstate,
*** 1956,1967 ****
       */

      /* Prepare to assign non-conflicting resnos to resjunk attributes */
!     if (pstate->p_next_resno <= pstate->p_target_relation->rd_rel->relnatts)
!         pstate->p_next_resno = pstate->p_target_relation->rd_rel->relnatts + 1;

      /* Prepare non-junk columns for assignment to target table */
      target_rte = pstate->p_target_rangetblentry;
!     origTargetList = list_head(stmt->targetList);

      foreach(tl, qry->targetList)
      {
--- 2038,2049 ----
       */

      /* Prepare to assign non-conflicting resnos to resjunk attributes */
!     if (pstate->p_next_resno <= target_rel->rd_rel->relnatts)
!         pstate->p_next_resno = target_rel->rd_rel->relnatts + 1;

      /* Prepare non-junk columns for assignment to target table */
      target_rte = pstate->p_target_rangetblentry;
!     origTargetList = list_head(targetList);

      foreach(tl, qry->targetList)
      {
*************** transformUpdateStmt(ParseState *pstate,
*** 1986,1999 ****
          origTarget = (ResTarget *) lfirst(origTargetList);
          Assert(IsA(origTarget, ResTarget));

!         attrno = attnameAttNum(pstate->p_target_relation,
!                                origTarget->name, true);
          if (attrno == InvalidAttrNumber)
              ereport(ERROR,
                      (errcode(ERRCODE_UNDEFINED_COLUMN),
                       errmsg("column \"%s\" of relation \"%s\" does not exist",
                              origTarget->name,
!                          RelationGetRelationName(pstate->p_target_relation)),
                       parser_errposition(pstate, origTarget->location)));

          updateTargetListEntry(pstate, tle, origTarget->name,
--- 2068,2080 ----
          origTarget = (ResTarget *) lfirst(origTargetList);
          Assert(IsA(origTarget, ResTarget));

!         attrno = attnameAttNum(target_rel, origTarget->name, true);
          if (attrno == InvalidAttrNumber)
              ereport(ERROR,
                      (errcode(ERRCODE_UNDEFINED_COLUMN),
                       errmsg("column \"%s\" of relation \"%s\" does not exist",
                              origTarget->name,
!                             RelationGetRelationName(target_rel)),
                       parser_errposition(pstate, origTarget->location)));

          updateTargetListEntry(pstate, tle, origTarget->name,
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 88ec83c..56d2af5 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** single_set_clause:
*** 9536,9561 ****
  multiple_set_clause:
              '(' set_target_list ')' '=' ctext_row
                  {
                      ListCell *col_cell;
-                     ListCell *val_cell;

                      /*
!                      * Break the ctext_row apart, merge individual expressions
!                      * into the destination ResTargets.  This is semantically
!                      * equivalent to, and much cheaper to process than, the
!                      * general case.
                       */
!                     if (list_length($2) != list_length($5))
!                         ereport(ERROR,
!                                 (errcode(ERRCODE_SYNTAX_ERROR),
!                                  errmsg("number of columns does not match number of values"),
!                                  parser_errposition(@5)));
!                     forboth(col_cell, $2, val_cell, $5)
                      {
                          ResTarget *res_col = (ResTarget *) lfirst(col_cell);
!                         Node *res_val = (Node *) lfirst(val_cell);

!                         res_col->val = res_val;
                      }

                      $$ = $2;
--- 9536,9562 ----
  multiple_set_clause:
              '(' set_target_list ')' '=' ctext_row
                  {
+                     int ncolumns = list_length($2);
+                     int i = 1;
                      ListCell *col_cell;

                      /*
!                      * We create a MultiAssignRef source for each target, all
!                      * of them pointing to the ctext_row List.  The ctext_row
!                      * will get split apart during parse analysis.  (We could
!                      * split it apart here, but it's simpler to share code
!                      * with the "(*)" target case this way.)
                       */
!                     foreach(col_cell, $2)
                      {
                          ResTarget *res_col = (ResTarget *) lfirst(col_cell);
!                         MultiAssignRef *r = makeNode(MultiAssignRef);

!                         r->source = (Node *) $5;
!                         r->colno = i;
!                         r->ncolumns = ncolumns;
!                         res_col->val = (Node *) r;
!                         i++;
                      }

                      $$ = $2;
*************** multiple_set_clause:
*** 9590,9595 ****
--- 9591,9635 ----

                      $$ = $2;
                  }
+             | '(' '*' ')' '=' ctext_row
+                 {
+                     ResTarget *res_col = makeNode(ResTarget);
+
+                     /*
+                      * Create a ResTarget with null name to represent "(*)".
+                      * We don't bother with MultiAssignRef yet.
+                      */
+                     res_col->name = NULL;
+                     res_col->indirection = NIL;
+                     res_col->val = (Node *) $5;
+                     res_col->location = @2;
+
+                     $$ = list_make1(res_col);
+                 }
+             | '(' '*' ')' '=' select_with_parens
+                 {
+                     SubLink *sl = makeNode(SubLink);
+                     ResTarget *res_col = makeNode(ResTarget);
+
+                     /* First, convert bare SelectStmt into a SubLink */
+                     sl->subLinkType = MULTIEXPR_SUBLINK;
+                     sl->subLinkId = 0;        /* will be assigned later */
+                     sl->testexpr = NULL;
+                     sl->operName = NIL;
+                     sl->subselect = $5;
+                     sl->location = @5;
+
+                     /*
+                      * Create a ResTarget with null name to represent "(*)".
+                      * We don't bother with MultiAssignRef yet.
+                      */
+                     res_col->name = NULL;
+                     res_col->indirection = NIL;
+                     res_col->val = (Node *) sl;
+                     res_col->location = @2;
+
+                     $$ = list_make1(res_col);
+                 }
          ;

  set_target:
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f759606..c5ce95f 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** transformMultiAssignRef(ParseState *psta
*** 1446,1451 ****
--- 1446,1470 ----
      /* We should only see this in first-stage processing of UPDATE tlists */
      Assert(pstate->p_expr_kind == EXPR_KIND_UPDATE_SOURCE);

+     /*
+      * The grammar should only generate Lists or MULTIEXPR SubLinks as inputs
+      * to a MultiAssignRef.  If it's a List, we need only pick out the n'th
+      * list element and transform that.
+      */
+     if (IsA(maref->source, List))
+     {
+         List       *slist = (List *) maref->source;
+         Node       *source;
+
+         if (list_length(slist) != maref->ncolumns)
+             ereport(ERROR,
+                     (errcode(ERRCODE_SYNTAX_ERROR),
+                  errmsg("number of columns does not match number of values"),
+                    parser_errposition(pstate, exprLocation(maref->source))));
+         source = (Node *) list_nth(slist, maref->colno - 1);
+         return transformExprRecurse(pstate, source);
+     }
+
      /* We only need to transform the source if this is the first column */
      if (maref->colno == 1)
      {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 0e257ac..6456f02 100644
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
*************** typedef struct A_ArrayExpr
*** 400,406 ****
   *
   * In an UPDATE target list, 'name' is the name of the destination column,
   * 'indirection' stores any subscripts attached to the destination, and
!  * 'val' is the expression to assign.
   *
   * See A_Indirection for more info about what can appear in 'indirection'.
   */
--- 400,408 ----
   *
   * In an UPDATE target list, 'name' is the name of the destination column,
   * 'indirection' stores any subscripts attached to the destination, and
!  * 'val' is the expression to assign.  If 'name' is NULL then the ResTarget
!  * represents a "(*)" multi-column assignment target (and its 'val' field
!  * is either a List of value expressions, or a MULTIEXPR SubLink).
   *
   * See A_Indirection for more info about what can appear in 'indirection'.
   */
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index 1de2a86..554aa35 100644
*** a/src/test/regress/expected/update.out
--- b/src/test/regress/expected/update.out
*************** SELECT * FROM update_test;
*** 129,134 ****
--- 129,200 ----
      |     |
  (4 rows)

+ --
+ -- Test (*) syntax
+ --
+ INSERT INTO update_test VALUES (1001, 1002);
+ UPDATE update_test SET (*) = (SELECT a, b, 'foo' FROM update_test WHERE b = 101)
+   WHERE a = 1001;
+ SELECT * FROM update_test;
+  a  |  b  |  c
+ ----+-----+-----
+  21 | 101 |
+  41 |  12 | car
+  42 |  12 | car
+     |     |
+  21 | 101 | foo
+ (5 rows)
+
+ UPDATE update_test SET (*) = (DEFAULT, DEFAULT, 'zap')
+   WHERE c = 'foo';
+ SELECT * FROM update_test;
+  a  |  b  |  c
+ ----+-----+-----
+  21 | 101 |
+  41 |  12 | car
+  42 |  12 | car
+     |     |
+  10 |     | zap
+ (5 rows)
+
+ DELETE FROM update_test WHERE c = 'zap';
+ -- error cases (too many/few columns)
+ UPDATE update_test SET (a, b, c) = (1, 2);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (a, b, c) = (1, 2);
+                                             ^
+ UPDATE update_test SET (a, b, c) = (1, 2, 3, 4);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (a, b, c) = (1, 2, 3, 4);
+                                             ^
+ UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_test);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_...
+                                            ^
+ UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM update_test);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM u...
+                                            ^
+ UPDATE update_test SET (*) = (1, 2);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (*) = (1, 2);
+                                       ^
+ UPDATE update_test SET (*) = (1, 2, 3, 4);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (*) = (1, 2, 3, 4);
+                                       ^
+ UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test);
+                                      ^
+ UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_test);
+ ERROR:  number of columns does not match number of values
+ LINE 1: UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_...
+                                      ^
+ -- this is a semantic error, but not a syntactic one:
+ UPDATE update_test SET (*) = (DEFAULT, DEFAULT, DEFAULT), c = 'zap'
+   WHERE c = 'foo';
+ ERROR:  multiple assignments to same column "c"
  -- if an alias for the target table is specified, don't allow references
  -- to the original table name
  UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
index e71128c..e78de79 100644
*** a/src/test/regress/sql/update.sql
--- b/src/test/regress/sql/update.sql
*************** UPDATE update_test SET (b,a) = (select a
*** 66,71 ****
--- 66,101 ----
    WHERE a = 11;
  SELECT * FROM update_test;

+ --
+ -- Test (*) syntax
+ --
+ INSERT INTO update_test VALUES (1001, 1002);
+
+ UPDATE update_test SET (*) = (SELECT a, b, 'foo' FROM update_test WHERE b = 101)
+   WHERE a = 1001;
+
+ SELECT * FROM update_test;
+
+ UPDATE update_test SET (*) = (DEFAULT, DEFAULT, 'zap')
+   WHERE c = 'foo';
+
+ SELECT * FROM update_test;
+
+ DELETE FROM update_test WHERE c = 'zap';
+
+ -- error cases (too many/few columns)
+ UPDATE update_test SET (a, b, c) = (1, 2);
+ UPDATE update_test SET (a, b, c) = (1, 2, 3, 4);
+ UPDATE update_test SET (a, b, c) = (SELECT 1, 2 FROM update_test);
+ UPDATE update_test SET (a, b, c) = (SELECT 1, 2, 3, 4 FROM update_test);
+ UPDATE update_test SET (*) = (1, 2);
+ UPDATE update_test SET (*) = (1, 2, 3, 4);
+ UPDATE update_test SET (*) = (SELECT 1, 2 FROM update_test);
+ UPDATE update_test SET (*) = (SELECT 1, 2, 3, 4 FROM update_test);
+ -- this is a semantic error, but not a syntactic one:
+ UPDATE update_test SET (*) = (DEFAULT, DEFAULT, DEFAULT), c = 'zap'
+   WHERE c = 'foo';
+
  -- if an alias for the target table is specified, don't allow references
  -- to the original table name
  UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;

Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
I wrote:
> So I'm feeling that this may not be a good idea, or at least not a good
> implementation of the idea.  I'm inclined to reject the patch rather than
> lock us into something that is not standard and doesn't really do what
> people would be likely to want.

BTW, a potentially workable fix to the problem of not wanting to lock
down column lists in stored rules is to create a syntax that represents
whole-row, record-oriented assignment directly.  Then we need not be
concerned with individual columns at parse time at all.  So imagine
something like this:
UPDATE dst SET * = new WHERE ...;
UPDATE dst SET * = (SELECT src FROM src WHERE ...);

or if you needed to construct a row value at runtime you could write
UPDATE dst SET * = ROW(x,y,z) WHERE ...;
UPDATE dst SET * = (SELECT ROW(x,y,z) FROM src WHERE ...);

The main bit of functionality that would be lost compared to the current
patch is the ability to use DEFAULT for some of the row members.  But I am
not sure there is a compelling use-case for that: seems like if you have
some DEFAULTs in there then it's unlikely that you don't know the column
list accurately, so the existing (col1,col2,...) syntax will serve fine.

This seems like it might not be unduly complex to implement, although
it would have roughly nothing in common with the current patch.

If we were to go in this direction, it would be nice to at the same time
add a similar whole-record syntax for INSERT.  I'm not sure exactly what
that should look like though.  Also, again, we ought to be paying
attention to how this would match up with UPSERT syntax.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Alvaro Herrera
Дата:
Tom Lane wrote:

> I spent a fair amount of time cleaning this patch up to get it into
> committable shape, but as I was working on the documentation I started
> to lose enthusiasm for it, because I was having a hard time coming up
> with compelling examples.  The originally proposed motivation was
> 
> >> It solves the problem of doing UPDATE from a record variable of the same
> >> type as the table e.g. update foo set (*) = (select foorec.*) where ...;
> 
> but it feels to me that this is not actually a very good solution to that
> problem --- even granting that the problem needs to be solved, a claim
> that still requires some justification IMO.

How about an UPDATE ran inside a plpgsql function, which is using a row
variable of the table type?  You could assign values to individual
columns of q row variable, and run the multicolumn UPDATE last.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Support UPDATE table SET(*)=...

От
Peter Geoghegan
Дата:
On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we were to go in this direction, it would be nice to at the same time
> add a similar whole-record syntax for INSERT.  I'm not sure exactly what
> that should look like though.  Also, again, we ought to be paying
> attention to how this would match up with UPSERT syntax.

I expressed concern about allowing this for UPSERT [1].

To be fair, VoltDB's new UPSERT statement allows something similar (or
rather mandates it, since you cannot just update some columns), and
that doesn't look wholly unreasonable. I still don't like the idea of
supporting this, though. I'm not aware of any other system allowing
something like this for either MERGE or a non-standard UPSERT.

[1] http://www.postgresql.org/message-id/CAM3SWZT=VXBJ7QKAidAmYbU40aP10udSqOOqhViX3Ykj7WBv9A@mail.gmail.com
-- 
Peter Geoghegan



Re: Support UPDATE table SET(*)=...

От
Peter Geoghegan
Дата:
On Tue, Apr 7, 2015 at 12:01 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I still don't like the idea of
> supporting this, though. I'm not aware of any other system allowing
> something like this for either MERGE or a non-standard UPSERT.

That includes MySQL, BTW. Only their REPLACE statement (which is a
disaster for various reasons) can do something like this. Their INSERT
... ON DUPLICATE KEY UPDATE statement (which is roughly comparable to
the proposed UPSERT patch) cannot update the entire row using a terse
expression that references a row excluded from insertion (following
the implementation taking the UPDATE path).

-- 
Peter Geoghegan



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Peter Geoghegan <pg@heroku.com> writes:
> On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we were to go in this direction, it would be nice to at the same time
>> add a similar whole-record syntax for INSERT.  I'm not sure exactly what
>> that should look like though.  Also, again, we ought to be paying
>> attention to how this would match up with UPSERT syntax.

> I expressed concern about allowing this for UPSERT [1].

Yeah, your analogy to "SELECT *" being considered dangerous in production
is not without merit.  However, to the extent that the syntax is used to
assign from a composite variable of the same (or compatible) data type,
it seems like it would be safe enough.  IOW, I think that analogy holds
for the syntax implemented by the current patch, but not what I suggested
in my followup.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Jim Nasby
Дата:
On 4/7/15 2:00 PM, Alvaro Herrera wrote:
> Tom Lane wrote:
>
>> I spent a fair amount of time cleaning this patch up to get it into
>> committable shape, but as I was working on the documentation I started
>> to lose enthusiasm for it, because I was having a hard time coming up
>> with compelling examples.  The originally proposed motivation was
>>
>>>> It solves the problem of doing UPDATE from a record variable of the same
>>>> type as the table e.g. update foo set (*) = (select foorec.*) where ...;
>>
>> but it feels to me that this is not actually a very good solution to that
>> problem --- even granting that the problem needs to be solved, a claim
>> that still requires some justification IMO.
>
> How about an UPDATE ran inside a plpgsql function, which is using a row
> variable of the table type?  You could assign values to individual
> columns of q row variable, and run the multicolumn UPDATE last.

Along similar lines, I've often wanted something akin to *, but allowing 
finer control over what you got. Generally when I want this it's because 
I really do want everything (as in, don't want to re-code a bunch of 
stuff if a column is added), but perhaps not the surrogate key field. Or 
I want everything, but rename some field to something else.

Basically, another way to do what Alvaro is suggesting (though, the 
ability to rename something is new...)

If we had that ability I think UPDATE * would become a lot more useful.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Support UPDATE table SET(*)=...

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> I spent a fair amount of time cleaning this patch up to get itTom> into committable shape, but as I was working on
thedocumentationTom> I started to lose enthusiasm for it, because I was having a hardTom> time coming up with
compellingexamples.  The originally proposedTom> motivation was
 
>>> It solves the problem of doing UPDATE from a record variable of the same>>> type as the table e.g. update foo set
(*)= (select foorec.*) where ...;
 

There are a number of motivating examples for this (which have nothing
to do with rules; I doubt anyone cares much about that).

The fundamental point is that currently, given a table "foo" and some
column or variable of foo's rowtype, you can do this:

insert into foo select foorec.* [from ...]

but there is no comparable way to do an update without naming every
column explicitly, the closest being

update foo set (a,b,...) = (foorec.a, foorec.b, ...) where ...

One example that comes up occasionally (and that I've had to do myself
more than once) is this: given a table "foo" and another with identical
schema "reference_foo", apply appropriate inserts, updates and deletes
to table "foo" to make the content of the two tables identical. This can
be done these days with wCTEs:

with t_diff as (select o.id as o_id, n.id as n_id, o, n              from foo o full outer join reference_foo n on
(o.id=n.id)            where (o.*) is distinct from (n.*)), ins as (insert into foo select (n).* from t_diff where o_id
isnull), del as (delete from foo          where id in (select o_id from t_diff where n_id is null)), upd as (update foo
          set (col1,col2,...) = ((n).col1,(n).col2,...)  -- XXX           from t_diff          where foo.id = n_id and
o_id= n_id)
 
select count(*) filter (where o_id is null) as num_ins,      count(*) filter (where o_id = n_id) as num_upd,
count(*)filter (where n_id is null) as num_del from t_diff;
 

(This would be preferred over simply replacing the table content if the
table is large and the changes few, you want to audit the changes, you
need to avoid interfering with concurrent selects, etc.)

The update part of that would be much improved by simply being able to
say "update all columns of foo with values from (n)". The exact syntax
isn't a big deal - though since SET (cols...) = ...  is in the spec, it
seems reasonable to at least keep some kind of consistency with it.

Other examples arise from things one might want to do in plpgsql; for
example to update a record from an hstore or json value, one can use
[json_]populate_record to construct a record variable, but then it's
back to naming all the columns in order to actually perform the update
statement.

[My connection with this patch is only that I suggested it to Atri as a
possible project for him to do, because I wanted the feature and knew
others did also, and helped explain how the existing MultiAssign worked
and some of the criticism. I did not contribute any code.]

-- 
Andrew (irc:RhodiumToad)



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> I spent a fair amount of time cleaning this patch up to get it
>  Tom> into committable shape, but as I was working on the documentation
>  Tom> I started to lose enthusiasm for it, because I was having a hard
>  Tom> time coming up with compelling examples.

> One example that comes up occasionally (and that I've had to do myself
> more than once) is this: given a table "foo" and another with identical
> schema "reference_foo", apply appropriate inserts, updates and deletes
> to table "foo" to make the content of the two tables identical. This can
> be done these days with wCTEs:

> with
>   t_diff as (select o.id as o_id, n.id as n_id, o, n
>                from foo o full outer join reference_foo n on (o.id=n.id)
>               where (o.*) is distinct from (n.*)),
>   ins as (insert into foo select (n).* from t_diff where o_id is null),
>   del as (delete from foo
>            where id in (select o_id from t_diff where n_id is null)),
>   upd as (update foo
>              set (col1,col2,...) = ((n).col1,(n).col2,...)  -- XXX
>             from t_diff
>            where foo.id = n_id and o_id = n_id)
> select count(*) filter (where o_id is null) as num_ins,
>        count(*) filter (where o_id = n_id) as num_upd,
>        count(*) filter (where n_id is null) as num_del
>   from t_diff;

While I agree that the UPDATE part of that desperately needs improvement,
I don't agree that the INSERT part is entirely fine.  You're still relying
on a parse-time expansion of the (n).* notation, which is inefficient and
not at all robust against schema changes (the same problem as with the
patch's approach to UPDATE).  So if we're taking this as a motivating
example, I'd want to see a fix that allows both INSERT and UPDATE directly
from a composite value of proper rowtype, without any expansion to
individual columns at all.

Perhaps we could adopt some syntax likeINSERT INTO table (*) values-or-select
to represent the case that the values-or-select delivers a single
composite column of the appropriate type.

> Other examples arise from things one might want to do in plpgsql; for
> example to update a record from an hstore or json value, one can use
> [json_]populate_record to construct a record variable, but then it's
> back to naming all the columns in order to actually perform the update
> statement.

Sure, but the patch as given didn't work very well for that either,
at least not if you wanted to avoid multiple evaluation of the
composite-returning function.  You'd have to adopt some obscure syntax
like "UPDATE target SET (*) = (SELECT * FROM composite_function(...))".
With what I'm thinking about now you could doUPDATE target SET * = composite_function(...)
which is a good deal less notation, and with a bit of luck it would not
require disassembling and reassembling the function's output tuple.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>> One example that comes up occasionally (and that I've had to do>> myself more than once) is this: given a table
"foo"and another with>> identical schema "reference_foo", apply appropriate inserts, updates>> and deletes to table
"foo"to make the content of the two tables>> identical. This can be done these days with wCTEs:
 
>> with>>   t_diff as (select o.id as o_id, n.id as n_id, o, n>>                from foo o full outer join
reference_foon on (o.id=n.id)>>               where (o.*) is distinct from (n.*)),>>   ins as (insert into foo select
(n).*from t_diff where o_id is null),>>   del as (delete from foo>>            where id in (select o_id from t_diff
wheren_id is null)),>>   upd as (update foo>>              set (col1,col2,...) = ((n).col1,(n).col2,...)  -- XXX>>
      from t_diff>>            where foo.id = n_id and o_id = n_id)>> select count(*) filter (where o_id is null) as
num_ins,>>       count(*) filter (where o_id = n_id) as num_upd,>>        count(*) filter (where n_id is null) as
num_del>>       from t_diff;
 
Tom> While I agree that the UPDATE part of that desperately needsTom> improvement, I don't agree that the INSERT part
isentirely fine.Tom> You're still relying on a parse-time expansion of the (n).*Tom> notation, which is inefficient
 

Not in my experience a huge deal given what else is going on...
Tom> and not at all robust against schema changes (the same problem asTom> with the patch's approach to UPDATE).

Now this I think is wrong; I think it's just as robust against schema
changes as using the composite value directly would be. Consider the
case where foo and reference_foo match with the exception of dropped
columns; the code as it is should just work, while a variant that used
the composite values would have to explicitly deal with that.

(When I've used this kind of operation in practice, reference_foo has
just been created using CREATE TEMP TABLE reference_foo (LIKE foo); and
then populated via COPY from an external data source. Even if
reference_foo were a non-temp table, the logic of the situation requires
it to have the same schema as foo as far as SQL statements are
concerned.)
Tom> So if we're taking this as a motivating example, I'd want to see aTom> fix that allows both INSERT and UPDATE
directlyfrom a compositeTom> value of proper rowtype, without any expansion to individualTom> columns at all.
 

I would argue that this is a case of the perfect being the enemy of the
good.
Tom> Perhaps we could adopt some syntax likeTom>     INSERT INTO table (*) values-or-selectTom> to represent the case
thatthe values-or-select delivers a singleTom> composite column of the appropriate type.
 

We could, but I think in all practical cases it'll be nothing more than
a small performance optimization rather than something that really
benefits people in terms of enhanced functionality.
>> Other examples arise from things one might want to do in plpgsql; for>> example to update a record from an hstore or
jsonvalue, one can use>> [json_]populate_record to construct a record variable, but then it's>> back to naming all the
columnsin order to actually perform the update>> statement.
 
Tom> Sure, but the patch as given didn't work very well for thatTom> either,

Partly that's a limitation resulting from how much can be done with the
existing SET (...) = syntax and implementation without ripping it all
out and starting over. An incremental improvement seemed to be a more
readily achievable goal.

In practice it would indeed probably look like:
 declare   new_id integer;   new_values hstore; begin   /* do stuff */   update foo      set (*) = (select * from
populate_record(foo,new_values))    where foo.id = new_id;
 

A agree that it would be nicer to do
   update foo      set (*) = populate_record(foo, new_values)    where foo.id = new_id;

but that would be a substantially larger project. The alternative of
      set * = populate_record(foo, new_values)

is less satisfactory since it introduces inconsistencies with the case
where you _do_ want to specify explicit columns, unless you also allow
      set (a,b) = row_value

which is required by the spec for T641 but which we don't currently
have.

-- 
Andrew (irc:RhodiumToad)



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> and not at all robust against schema changes (the same problem as
>  Tom> with the patch's approach to UPDATE).

> Now this I think is wrong; I think it's just as robust against schema
> changes as using the composite value directly would be. Consider the
> case where foo and reference_foo match with the exception of dropped
> columns; the code as it is should just work, while a variant that used
> the composite values would have to explicitly deal with that.

AFAICS you've got that backwards.

As I illustrated upthread, after parse-time expansion we would have a
command that explicitly tells the system to insert or update only the
enumerated columns.  That will *not* work as desired if columns are added
later, and (if it's in a rule) I would expect the system to have to fail
a DROP command trying to drop one of those named columns, the same way
you can't drop a column referenced in a view/rule today.

On the other hand, if it's a composite-type assignment, then dealing
with things like dropped columns becomes the system's responsibility,
and we already have code for that sort of thing.

> ... The alternative of
>        set * = populate_record(foo, new_values)
> is less satisfactory since it introduces inconsistencies with the case
> where you _do_ want to specify explicit columns, unless you also allow
>        set (a,b) = row_value
> which is required by the spec for T641 but which we don't currently
> have.

Right, but we should be trying to move in that direction.  I see your
point though that (*) is more notationally consistent with that case.
Maybe we should be looking at trying to implement T641 in full and then
including (*) as a special case of that.

Anyway, my core point here is that we should avoid parse-time expansion
of the column set.  The *only* reason for doing that is that it makes the
patch simpler, not that there is some functional advantage; in fact there
are considerable functional disadvantages as we've just been through.
So I do not want to commit a patch that institutionalizes those
disadvantages just for short-term implementation simplicity.
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Now this I think is wrong; I think it's just as robust against>> schema changes as using the composite value
directlywould>> be. Consider the case where foo and reference_foo match with the>> exception of dropped columns; the
codeas it is should just work,>> while a variant that used the composite values would have to>> explicitly deal with
that.
Tom> AFAICS you've got that backwards.
Tom> As I illustrated upthread, after parse-time expansion we wouldTom> have a command that explicitly tells the system
toinsert orTom> update only the enumerated columns.  That will *not* work asTom> desired if columns are added later,
 

Where "later" is between parse analysis and execution - and if this
query is not in a rule, then any such schema change will force a
re-analysis if it's a prepared statement, no? and otherwise, we have the
tables locked against schema changes anyway? Is there a failure case
here that doesn't involve rules?
Tom> and (if it's in a rule)

well, the example I gave is not something that anyone in their right
minds would try and put in a rule.
>> ... The alternative of>> set * = populate_record(foo, new_values)>> is less satisfactory since it introduces
inconsistencieswith the case>> where you _do_ want to specify explicit columns, unless you also allow>> set (a,b) =
row_value>>which is required by the spec for T641 but which we don't currently>> have.
 
Tom> Right, but we should be trying to move in that direction.  I seeTom> your point though that (*) is more
notationallyconsistent withTom> that case.  Maybe we should be looking at trying to implement T641Tom> in full and then
including(*) as a special case of that.
 

I would have liked to have done that, but that would have raised the
complexity of the project from "Atri can take a stab at this one with
negligible supervision" to "Andrew will have to spend more time than he
has conveniently available staring at the raw parser to try and make it
work".

As I said, the perfect is the enemy of the good.
Tom> Anyway, my core point here is that we should avoid parse-timeTom> expansion of the column set.

Parse-time expansion of * is pretty widespread elsewhere. Changing that
for this one specific case seems a bit marginal to me - and if the main
motivation to do so is to support cases in DML rules, which are already
a foot-bazooka, I think it's honestly not worth it.

-- 
Andrew (irc:RhodiumToad)



Re: Support UPDATE table SET(*)=...

От
Alvaro Herrera
Дата:
Andrew Gierth wrote:
> >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

>  Tom> Right, but we should be trying to move in that direction.  I see
>  Tom> your point though that (*) is more notationally consistent with
>  Tom> that case.  Maybe we should be looking at trying to implement T641
>  Tom> in full and then including (*) as a special case of that.
> 
> I would have liked to have done that, but that would have raised the
> complexity of the project from "Atri can take a stab at this one with
> negligible supervision" to "Andrew will have to spend more time than he
> has conveniently available staring at the raw parser to try and make it
> work".

Not to mention that, at this stage, we should be looking at reducing the
scope of patches in commitfest rather than enlarge it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Support UPDATE table SET(*)=...

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Andrew Gierth wrote:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>>> Tom> Right, but we should be trying to move in that direction.  I see
>>> Tom> your point though that (*) is more notationally consistent with
>>> Tom> that case.  Maybe we should be looking at trying to implement T641
>>> Tom> in full and then including (*) as a special case of that.

>> I would have liked to have done that, but that would have raised the
>> complexity of the project from "Atri can take a stab at this one with
>> negligible supervision" to "Andrew will have to spend more time than he
>> has conveniently available staring at the raw parser to try and make it
>> work".

Well, we've never considered implementation convenience to be more
important than getting it right, and this doesn't seem like a place
to start.

(FWIW, the raw-parser changes would be a negligible fraction of the work
involved to do it like that.)

> Not to mention that, at this stage, we should be looking at reducing the
> scope of patches in commitfest rather than enlarge it.

I already took it out of the current commitfest ;-).
        regards, tom lane



Re: Support UPDATE table SET(*)=...

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Well, we've never considered implementation convenience to be moreTom> important than getting it right, and this
doesn'tseem like aTom> place to start.
 

I quote your own words from less than a year ago:
   The standard actually says that the source for a multi-column assignment   could be any row-valued expression.  The
implementationused here is   tightly tied to our existing sub-SELECT support and can't handle other   cases; the Bison
grammarwould have some issues with them too.  However,   I don't feel too bad about this since other cases can be
convertedinto   sub-SELECTs.  For instance, "SET (a,b,c) = row_valued_function(x)" could   be written "SET (a,b,c) =
(SELECT* FROM row_valued_function(x))".
 

(from the commit message for 8f889b108)
Tom> (FWIW, the raw-parser changes would be a negligible fraction ofTom> the work involved to do it like that.)

/** Ideally, we'd accept any row-valued a_expr as RHS of a multiple_set_clause.* However, per SQL spec the
row-constructorcase must allow DEFAULT as a row* member, and it's pretty unclear how to do that (unless perhaps we
allow*DEFAULT in any a_expr and let parse analysis sort it out later?).  For the* moment, the planner/executor only
supporta subquery as a multiassignment* source anyhow, so we need only accept ctext_row and subqueries here.*/
 

(The "punt to parse analysis" solution looks reasonable to me, but I'm
just as in the dark as you were when you wrote that as to any other
possible solution.)

-- 
Andrew (irc:RhodiumToad)