Обсуждение: GSOC13 proposal - extend RETURNING syntax

Поиск
Список
Период
Сортировка

GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
Hello,
I'm student who want to participate in Google Summer of Code. I want to
implement feature which allows to get old values directly from update
statement. I mean there should be possibility to choose the value
immedietly before or after update in RETURNING statement. The syntax may
be realized as "aliases". That means: OLD keywordswould be alias to row
before update and NEW to row after update. The conclusion of syntax is:
UPDATE foo SET bar=bar+1 RETURNING OLD.bar AS old_bar, NEW.bar AS new_bar;
UPDATE foo SET ... RETURNING NEW.* will be equivalent to UPDATE foo SET
... RETURNING foo.*
It may be possible to add similar syntax to DELETE and INSERT statements
but I'm not sure if it makes a sense (OLD for DELETE will be alias to
row before delete, NEW for INSERT will be alias to row after insert and
all triggers - however what about NEW for delete and OLD for INSERT?).
Additionally NEW and OLD values will be reserved keywords (it might be
some capability problem since in new PostgreSQL it isn't reserved -
however standard says it is and in old PgSQL it was).
I'd like to hear (read) yours feedback about syntax and/or implement
issues related to this proposal.
Regards,
Karol Trzcionka



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Thu, May 02, 2013 at 11:04:15AM +0200, Karol Trzcionka wrote:
> Hello,
> I'm student who want to participate in Google Summer of Code. I want to
> implement feature which allows to get old values directly from update
> statement. I mean there should be possibility to choose the value
> immedietly before or after update in RETURNING statement. The syntax may
> be realized as "aliases". That means: OLD keywordswould be alias to row
> before update and NEW to row after update. The conclusion of syntax is:
> UPDATE foo SET bar=bar+1 RETURNING OLD.bar AS old_bar, NEW.bar AS new_bar;
> UPDATE foo SET ... RETURNING NEW.* will be equivalent to UPDATE foo SET
> ... RETURNING foo.*
> It may be possible to add similar syntax to DELETE and INSERT statements
> but I'm not sure if it makes a sense (OLD for DELETE will be alias to
> row before delete, NEW for INSERT will be alias to row after insert and
> all triggers - however what about NEW for delete and OLD for INSERT?).
> Additionally NEW and OLD values will be reserved keywords (it might be
> some capability problem since in new PostgreSQL it isn't reserved -
> however standard says it is and in old PgSQL it was).
> I'd like to hear (read) yours feedback about syntax and/or implement
> issues related to this proposal.
> Regards,
> Karol Trzcionka

I would like to include the proposal as we've hammered it out together
on IRC and on GSoC site below.

Cheers,
David.

1.  As the SQL standard mandates that OLD and NEW be reserved words, we'll re-reserve them.

2.  Let's make OLD and NEW have the same meaning that INSERT/UPDATE/DELETE have when returning rows from the changed
table. In particular
 
   INSERT INTO foo (...) RETURNING NEW.*

will be equivalent to
   INSERT INTO foo(...) RETURNING foo.*

Similarly for UPDATE and DELETE:
   UPDATE foo SET ... RETURNING NEW.*

will be equivalent to
   UPDATE foo SET ... RETURNING foo.*

and
   DELETE FROM foo ... RETURNING OLD.*

will be equivalent to
   DELETE FROM foo ... RETURNING foo.*

As RETURNING clauses have access to everything in the FROM/USING clause, it is important to limit the NEW/OLD rows as
beingonly those in the table being written to in the statement.
 

3. Let's add an option to UPDATE so that it can RETURN OLD with the same characteristics as above, namely that it
refersonly to constants and columns in the updated table and not to everything available from the USING clause if
included.

-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> 1.  As the SQL standard mandates that OLD and NEW be reserved words, we'll re-reserve them.

As I mentioned yesterday, I'm not exactly thrilled with re-reserving
those, and especially not NEW as it is utterly unnecessary (since the
default would already be to return the NEW column).

It should in any case be possible to do this without converting them to
reserved words; rather the implementation could be that those table
aliases are made available when parsing the UPDATE RETURNING expressions.
(This is, in fact, the way that rules use these names now.)  Probably it
should work something like "add these aliases if they don't already
exist in the query", so as to avoid breaking existing applications.

I don't really see a lot of value in hacking the behavior of either
INSERT RETURNING or DELETE RETURNING.
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 02.05.2013 17:17, Tom Lane pisze:
> It should in any case be possible to do this without converting them
> to reserved words; rather the implementation could be that those table
> aliases are made available when parsing the UPDATE RETURNING
> expressions. (This is, in fact, the way that rules use these names
> now.) Probably it should work something like "add these aliases if
> they don't already exist in the query", so as to avoid breaking
> existing applications. I don't really see a lot of value in hacking
> the behavior of either INSERT RETURNING or DELETE RETURNING. regards,
> tom lane 
I'm not sure about it. If it is not reserved keyword how can user get
old value from table named "old". The new value is not a problem
(doesn't conflict) but what should happened in statement:
UPDATE old SET foo=foo+1 RETURNING old.foo;
If it returns old value, it'll break capability. If it returns new value
(as now), there is no possibility to get old value (and user cries
because of broken feature).
Regards,
Karol Trzcionka



Re: GSOC13 proposal - extend RETURNING syntax

От
Marko Tiikkaja
Дата:
On 2013-05-02 17:32, Karol Trzcionka wrote:
> W dniu 02.05.2013 17:17, Tom Lane pisze:
>> It should in any case be possible to do this without converting them
>> to reserved words; rather the implementation could be that those table
>> aliases are made available when parsing the UPDATE RETURNING
>> expressions. (This is, in fact, the way that rules use these names
>> now.) Probably it should work something like "add these aliases if
>> they don't already exist in the query", so as to avoid breaking
>> existing applications. I don't really see a lot of value in hacking
>> the behavior of either INSERT RETURNING or DELETE RETURNING.
>
> what should happened in statement:
> UPDATE old SET foo=foo+1 RETURNING old.foo;
> If it returns old value, it'll break capability. If it returns new value
> (as now), there is no possibility to get old value (and user cries
> because of broken feature).

In Tom's proposal that would give you the "new" value.

Personally I would maybe prefer reserving NEW/OLD, but if we go through 
with Tom's idea, this should work:

UPDATE old myold SET foo=foo+1 RETURNING myold.foo, old.foo;


What I'm more interested in is: how can we make this feature work in 
PL/PgSQL where OLD means something different?


Regards,
Marko Tiikkaja



Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
Marko Tiikkaja <marko@joh.to> writes:
> What I'm more interested in is: how can we make this feature work in 
> PL/PgSQL where OLD means something different?

That's a really good point: if you follow this approach then you're
creating fundamental conflicts for use of the feature in trigger
functions or rules, which will necessarily have conflicting uses of
those names.  Yeah, we could define scoping rules such that there's
an unambiguous interpretation, but then the user is just out of luck
if he wants to reference the other definition.  (This problem is
probably actually worse if you implement with reserved words rather
than aliases.)

I'm thinking it would be better to invent some other notation for access
to old-row values.
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Andres Freund
Дата:
On 2013-05-02 12:23:19 -0400, Tom Lane wrote:
> Marko Tiikkaja <marko@joh.to> writes:
> > What I'm more interested in is: how can we make this feature work in 
> > PL/PgSQL where OLD means something different?
> 
> That's a really good point: if you follow this approach then you're
> creating fundamental conflicts for use of the feature in trigger
> functions or rules, which will necessarily have conflicting uses of
> those names.  Yeah, we could define scoping rules such that there's
> an unambiguous interpretation, but then the user is just out of luck
> if he wants to reference the other definition.  (This problem is
> probably actually worse if you implement with reserved words rather
> than aliases.)
> 
> I'm thinking it would be better to invent some other notation for access
> to old-row values.

prior/after? Both are unreserved keywords atm and it seems far less
likely to have conflicts than new/old.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote:
> On 2013-05-02 12:23:19 -0400, Tom Lane wrote:
> > Marko Tiikkaja <marko@joh.to> writes:
> > > What I'm more interested in is: how can we make this feature work in 
> > > PL/PgSQL where OLD means something different?
> > 
> > That's a really good point: if you follow this approach then you're
> > creating fundamental conflicts for use of the feature in trigger
> > functions or rules, which will necessarily have conflicting uses of
> > those names.  Yeah, we could define scoping rules such that there's
> > an unambiguous interpretation, but then the user is just out of luck
> > if he wants to reference the other definition.  (This problem is
> > probably actually worse if you implement with reserved words rather
> > than aliases.)
> > 
> > I'm thinking it would be better to invent some other notation for access
> > to old-row values.
> 
> prior/after? Both are unreserved keywords atm and it seems far less
> likely to have conflicts than new/old.

BEFORE/AFTER seems more logical to me.  Yes, those words both have
meaning in, for example, a trigger definition, but they're clearly
separable by context.

Yay, bike-shedding!

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GSOC13 proposal - extend RETURNING syntax

От
Pavel Stehule
Дата:



2013/5/2 Tom Lane <tgl@sss.pgh.pa.us>
Marko Tiikkaja <marko@joh.to> writes:
> What I'm more interested in is: how can we make this feature work in
> PL/PgSQL where OLD means something different?

That's a really good point: if you follow this approach then you're
creating fundamental conflicts for use of the feature in trigger
functions or rules, which will necessarily have conflicting uses of
those names.  Yeah, we could define scoping rules such that there's
an unambiguous interpretation, but then the user is just out of luck
if he wants to reference the other definition.  (This problem is
probably actually worse if you implement with reserved words rather
than aliases.)

I'm thinking it would be better to invent some other notation for access
to old-row values.

I am not sure, but I am thinking so NEW and OLD are used in some statements and features ANSI SQL 2012, so probably we should to do keywords from these words if we would to support modern ANSI SQL

Regards

Pavel
 

                        regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote:
>> prior/after? Both are unreserved keywords atm and it seems far less
>> likely to have conflicts than new/old.

> BEFORE/AFTER seems more logical to me.

Works for me.
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 02.05.2013 19:40, Tom Lane pisze:
>> BEFORE/AFTER seems more logical to me.
> Works for me.
>
What do you think about function- or cast-like syntax. I mean:
RETURNING OLD(foo.bar)
or RETURNING OLD(foo).bar
or RETURNING (foo::OLD).bar ?
I think none of them should conflict with any other statements.
Regards,
Karol Trzcionka



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Thu, May 02, 2013 at 01:40:59PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote:
> >> prior/after? Both are unreserved keywords atm and it seems far less
> >> likely to have conflicts than new/old.
> 
> > BEFORE/AFTER seems more logical to me.
> 
> Works for me.
> 
>             regards, tom lane

Maybe we can make BEFORE and AFTER implied aliases rather than
keywords.  What say?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GSOC13 proposal - extend RETURNING syntax

От
Atri Sharma
Дата:

Sent from my iPad

On 03-May-2013, at 0:07, David Fetter <david@fetter.org> wrote:

> On Thu, May 02, 2013 at 01:40:59PM -0400, Tom Lane wrote:
>> David Fetter <david@fetter.org> writes:
>>> On Thu, May 02, 2013 at 06:28:53PM +0200, Andres Freund wrote:
>>>> prior/after? Both are unreserved keywords atm and it seems far less
>>>> likely to have conflicts than new/old.
>> 
>>> BEFORE/AFTER seems more logical to me.
>> 
>> Works for me.
>> 
>>            regards, tom lane
> 
> Maybe we can make BEFORE and AFTER implied aliases rather than
> keywords.  What say?
> 
> 

I agree.Overall,I like the concept.

Regards,

Atri



Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
David Fetter <david@fetter.org> writes:
> Maybe we can make BEFORE and AFTER implied aliases rather than
> keywords.  What say?

Well, they still have to be unreserved keywords for their use in
trigger-related commands, but as far as this feature is concerned
I think they're best off being handled as automatically-supplied
aliases.  IOW the patch needn't touch gram.y at all.
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
Karol Trzcionka <karlikt@gmail.com> writes:
> What do you think about function- or cast-like syntax. I mean:
> RETURNING OLD(foo.bar)
> or RETURNING OLD(foo).bar
> or RETURNING (foo::OLD).bar ?
> I think none of them should conflict with any other statements.

I think you'll find those alternatives are at least as ugly to
implement as they are to look at ...
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 02.05.2013 20:45, Tom Lane pisze:
> David Fetter <david@fetter.org> writes:
>> Maybe we can make BEFORE and AFTER implied aliases rather than
>> keywords.  What say?
> Well, they still have to be unreserved keywords for their use in
> trigger-related commands, but as far as this feature is concerned
> I think they're best off being handled as automatically-supplied
> aliases.  IOW the patch needn't touch gram.y at all.
Well... the syntax which wouldn't conflict with Pl/PgSQL is (draft):
INSERT INTO foo ... RETURNING AFTER.*
UPDATE foo SET ... RETURNING AFTER.*, BEFORE.*
DELETE FROM foo ... RETURNING BEFORE.*
It will not solve the problems:
1. How to access to old rows if the table is named "BEFORE"?
2. Should AFTER for DELETE and BEFORE for INSERT be allowed? If yes what
should they return? I mean: what should be result of:
INSERT INTO foo ... RETURNING BEFORE.*
and
DELETE FROM foo ... RETURNING AFTER.*
?
The best summarize of dropping NEW/OLD keywords for this proposal was
proposed while IRC meeting:
create or replace function ft1(integer) returns integer language plpgsql
as $f$ <<x>> declare r rt1; begin select 1 as a into r; update rt1 r set
a = a + 1 returning XXX into r; raise notice 'r = %', r; return null;
end; $f$;
Regards,
Karol Trzcionka



Re: GSOC13 proposal - extend RETURNING syntax

От
Gavin Flower
Дата:
<div class="moz-cite-prefix">On 03/05/13 04:52, David Fetter wrote:<br /></div><blockquote
cite="mid:20130502165238.GB12887@fetter.org"type="cite"><pre wrap="">On Thu, May 02, 2013 at 06:28:53PM +0200, Andres
Freundwrote:
 
</pre><blockquote type="cite"><pre wrap="">On 2013-05-02 12:23:19 -0400, Tom Lane wrote:
</pre><blockquote type="cite"><pre wrap="">Marko Tiikkaja <a class="moz-txt-link-rfc2396E"
href="mailto:marko@joh.to"><marko@joh.to></a>writes:
 
</pre><blockquote type="cite"><pre wrap="">What I'm more interested in is: how can we make this feature work in 
PL/PgSQL where OLD means something different?
</pre></blockquote><pre wrap="">
That's a really good point: if you follow this approach then you're
creating fundamental conflicts for use of the feature in trigger
functions or rules, which will necessarily have conflicting uses of
those names.  Yeah, we could define scoping rules such that there's
an unambiguous interpretation, but then the user is just out of luck
if he wants to reference the other definition.  (This problem is
probably actually worse if you implement with reserved words rather
than aliases.)

I'm thinking it would be better to invent some other notation for access
to old-row values.
</pre></blockquote><pre wrap="">
prior/after? Both are unreserved keywords atm and it seems far less
likely to have conflicts than new/old.
</pre></blockquote><pre wrap="">
BEFORE/AFTER seems more logical to me.  Yes, those words both have
meaning in, for example, a trigger definition, but they're clearly
separable by context.

Yay, bike-shedding!

Cheers,
David.
</pre></blockquote><font size="-1">I prefer 'PRIOR & 'AFTER'</font> as the both have the same length <br /> - but
perhapsthat is just me!  :-)<br /><br /><br /> Cheers,<br /> Gavin<br /><br /><br /><br /> 

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
<div class="moz-cite-prefix">W dniu 02.05.2013 23:34, Gavin Flower pisze:<br /></div><blockquote
cite="mid:5182DBD9.5050501@archidevsys.co.nz"type="cite"><font size="-1">I prefer 'PRIOR & 'AFTER'</font> as the
bothhave the same length <br /> - but perhaps that is just me!  :-)<br /></blockquote> I think it doesn't matter at now
becausePRIOR has the same problems as BEFORE ;)<br /> Regards,<br /> Karol<br /> 

Re: GSOC13 proposal - extend RETURNING syntax

От
Tom Lane
Дата:
Karol Trzcionka <karlikt@gmail.com> writes:
> It will not solve the problems:
> 1. How to access to old rows if the table is named "BEFORE"?

The user can simply choose to use a different table alias, as Andres
explained upthread.  If any table's active alias is "before" (or
"after"), we just don't activate the corresponding implicit alias.

> 2. Should AFTER for DELETE and BEFORE for INSERT be allowed? If yes what
> should they return?

These cases should certainly fail.  Now, IMO there's no very good reason
to alter the behavior at all for INSERT/DELETE; only UPDATE has an issue
here.  But if we were going to support the extra aliases in those
commands, only the ones that actually make sense should be allowed.
        regards, tom lane



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
Hello,
according to my mentor's suggestion, I send first PoC patch of
"RETURNING AFTER/BEFORE" statement. Some info:
- it is early version - more hack PoC than RC patch
- AFTER in this version works as decribed before but it returns row
after update but before post-update before (to be fixed or switch with
current "*" behaviour - I don't know yet)
- patch passes all regression tests
- the code is uncommented already, to be fixed
I'm not sure what may fail so you use it on your risk ;)
Regards,
Karol

Вложения

Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
This is a proposal to implement functionalities for the handling of
National Characters.

[Introduction]

The aim of this proposal is to eventually have a way to represent
'National Characters' in a uniform way, even in non-UTF8 encoded
databases. Many of our customers in the Asian region who are now, as
part of their platform modernization, are moving away from mainframes
where they have used National Characters representation in COBOL and
other databases. Having stronger support for national characters
representation will also make it easier for these customers to look at
PostgreSQL more favourably when migrating from other well known RDBMSs
who all have varying degrees of NCHAR/NVARCHAR support.

[Specifications]

Broadly speaking, the national characters implementation ideally will
include the following
- Support for NCHAR/NVARCHAR data types
- Representing NCHAR and NVARCHAR columns in UTF-8 encoding in non-UTF8
databases
- Support for UTF16 column encoding and representing NCHAR and NVARCHAR
columns in UTF16 encoding in all databases.
- Support for NATIONAL_CHARACTER_SET GUC variable that will determine
the encoding that will be used in NCHAR/NVARCHAR columns.

The above points are at the moment a 'wishlist' only. Our aim is to
tackle them one-by-one as we progress. I will send a detailed proposal
later with more technical details.

The main aim at the moment is to get some feedback on the above to know
if this feature is something that would benefit PostgreSQL in general,
and if users maintaining DBs in non-English speaking regions will find
this beneficial.

Rgds,
Arul Shaji



P.S.: It has been quite some time since I send a correspondence to this
list. Our mail server adds a standard legal disclaimer to all outgoing
mails, which I know that this list is not a huge fan of. I used to have
an exemption for the mails I send to this list. If the disclaimer
appears, apologies in advance. I will rectify that on the next one.












Re: Proposal - Support for National Characters functionality

От
Tatsuo Ishii
Дата:
Arul Shaji,

NCHAR support is on our TODO list for some time and I would like to
welcome efforts trying to implement it. However I have a few
questions:

> This is a proposal to implement functionalities for the handling of
> National Characters. 
> 
> [Introduction]
> 
> The aim of this proposal is to eventually have a way to represent
> 'National Characters' in a uniform way, even in non-UTF8 encoded
> databases. Many of our customers in the Asian region who are now, as
> part of their platform modernization, are moving away from mainframes
> where they have used National Characters representation in COBOL and
> other databases. Having stronger support for national characters
> representation will also make it easier for these customers to look at
> PostgreSQL more favourably when migrating from other well known RDBMSs
> who all have varying degrees of NCHAR/NVARCHAR support.
> 
> [Specifications]
> 
> Broadly speaking, the national characters implementation ideally will
> include the following 
> - Support for NCHAR/NVARCHAR data types
> - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in non-UTF8
> databases

I think this is not a trivial work because we do not have framework to
allow mixed encodings in a database. I'm interested in how you are
going to solve the problem.

> - Support for UTF16 column encoding and representing NCHAR and NVARCHAR
> columns in UTF16 encoding in all databases.

Why do yo need UTF-16 as the database encoding? UTF-8 is already
supported, and any UTF-16 character can be represented in UTF-8 as far
as I know.

> - Support for NATIONAL_CHARACTER_SET GUC variable that will determine
> the encoding that will be used in NCHAR/NVARCHAR columns.

You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's
encoding is fixed to UTF-8?

> The above points are at the moment a 'wishlist' only. Our aim is to
> tackle them one-by-one as we progress. I will send a detailed proposal
> later with more technical details.
> 
> The main aim at the moment is to get some feedback on the above to know
> if this feature is something that would benefit PostgreSQL in general,
> and if users maintaining DBs in non-English speaking regions will find
> this beneficial.
> 
> Rgds,
> Arul Shaji
> 
> 
> 
> P.S.: It has been quite some time since I send a correspondence to this
> list. Our mail server adds a standard legal disclaimer to all outgoing
> mails, which I know that this list is not a huge fan of. I used to have
> an exemption for the mails I send to this list. If the disclaimer
> appears, apologies in advance. I will rectify that on the next one.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Proposal - Support for National Characters functionality

От
Claudio Freire
Дата:
On Fri, Jul 5, 2013 at 2:02 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> - Support for NATIONAL_CHARACTER_SET GUC variable that will determine
>> the encoding that will be used in NCHAR/NVARCHAR columns.
>
> You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's
> encoding is fixed to UTF-8?


Not only that, but I don't think it can be a GUC. Maybe a compile-time
switch, but if it were a GUC, how do you handle an existing database
in UTF-8 when the setting is switched to UTF-16? Re-encode everything?
Store the encoding along each value? It's a mess.

Either fix it at UTF-8, or make it a compile-time thing, I'd say.



Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
Ishii san,

Thank you for your positive and early response.

> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii@postgresql.org]
> Sent: Friday, 5 July 2013 3:02 PM
> To: Arulappan, Arul Shaji
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal - Support for National Characters
> functionality
>
> Arul Shaji,
>
> NCHAR support is on our TODO list for some time and I would like to
welcome
> efforts trying to implement it. However I have a few
> questions:
>
> > This is a proposal to implement functionalities for the handling of
> > National Characters.
> >
> > [Introduction]
> >
> > The aim of this proposal is to eventually have a way to represent
> > 'National Characters' in a uniform way, even in non-UTF8 encoded
> > databases. Many of our customers in the Asian region who are now, as
> > part of their platform modernization, are moving away from
mainframes
> > where they have used National Characters representation in COBOL and
> > other databases. Having stronger support for national characters
> > representation will also make it easier for these customers to look
at
> > PostgreSQL more favourably when migrating from other well known
RDBMSs
> > who all have varying degrees of NCHAR/NVARCHAR support.
> >
> > [Specifications]
> >
> > Broadly speaking, the national characters implementation ideally
will
> > include the following
> > - Support for NCHAR/NVARCHAR data types
> > - Representing NCHAR and NVARCHAR columns in UTF-8 encoding in
> > non-UTF8 databases
>
> I think this is not a trivial work because we do not have framework to
allow
> mixed encodings in a database. I'm interested in how you are going to
solve
> the problem.
>

I would be lying if I said I have the design already speced out. I will
be working on this in the coming weeks and hope to design a working
solution consulting with the community.

> > - Support for UTF16 column encoding and representing NCHAR and
> > NVARCHAR columns in UTF16 encoding in all databases.
>
> Why do yo need UTF-16 as the database encoding? UTF-8 is already
supported,
> and any UTF-16 character can be represented in UTF-8 as far as I know.
>

Yes, that's correct. However there are advantages in using UTF-16
encoding for those characters that are always going to take atleast
two-bytes to represent.

Having said that, my intention is to use UTF-8 for NCHAR as well.
Supporting UTF-16 will be even more complicated as it is not supported
natively in some Linux platforms. I only included it to give an option.

> > - Support for NATIONAL_CHARACTER_SET GUC variable that will
determine
> > the encoding that will be used in NCHAR/NVARCHAR columns.
>
> You said NCHAR's encoding is UTF-8. Why do you need the GUC if NCHAR's
> encoding is fixed to UTF-8?
>

If we are going to only support UTF-8 for NCHAR, then we don't need the
GUC variable obviously.

Rgds,
Arul Shaji



> > The above points are at the moment a 'wishlist' only. Our aim is to
> > tackle them one-by-one as we progress. I will send a detailed
proposal
> > later with more technical details.
> >
> > The main aim at the moment is to get some feedback on the above to
> > know if this feature is something that would benefit PostgreSQL in
> > general, and if users maintaining DBs in non-English speaking
regions
> > will find this beneficial.
> >
> > Rgds,
> > Arul Shaji
> >
> >
> >
> > P.S.: It has been quite some time since I send a correspondence to
> > this list. Our mail server adds a standard legal disclaimer to all
> > outgoing mails, which I know that this list is not a huge fan of. I
> > used to have an exemption for the mails I send to this list. If the
> > disclaimer appears, apologies in advance. I will rectify that on the
next
> one.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp





Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:

> -----Original Message-----
> From: Claudio Freire [mailto:klaussfreire@gmail.com]
> Sent: Friday, 5 July 2013 3:41 PM
> To: Tatsuo Ishii
> Cc: Arulappan, Arul Shaji; PostgreSQL-Dev
> Subject: Re: [HACKERS] Proposal - Support for National Characters
> functionality
>
> On Fri, Jul 5, 2013 at 2:02 AM, Tatsuo Ishii <ishii@postgresql.org>
wrote:
> >> - Support for NATIONAL_CHARACTER_SET GUC variable that will
determine
> >> the encoding that will be used in NCHAR/NVARCHAR columns.
> >
> > You said NCHAR's encoding is UTF-8. Why do you need the GUC if
NCHAR's
> > encoding is fixed to UTF-8?
>
>
> Not only that, but I don't think it can be a GUC. Maybe a compile-time
switch,
> but if it were a GUC, how do you handle an existing database in UTF-8
when the
> setting is switched to UTF-16? Re-encode everything?
> Store the encoding along each value? It's a mess.
>
> Either fix it at UTF-8, or make it a compile-time thing, I'd say.

Agreed, that to begin with we only support UTF-8 encoding for NCHAR
columns. If that is the case, do we still need a compile time option to
turn on/off NCHAR functionality ? ?

Rgds,
Arul Shaji





Re: Proposal - Support for National Characters functionality

От
Andrew Dunstan
Дата:
On 07/05/2013 02:12 AM, Arulappan, Arul Shaji wrote:
> - Support for UTF16 column encoding and representing NCHAR and
> NVARCHAR columns in UTF16 encoding in all databases.
>> Why do yo need UTF-16 as the database encoding? UTF-8 is already
> supported,
>> and any UTF-16 character can be represented in UTF-8 as far as I know.
>>
> Yes, that's correct. However there are advantages in using UTF-16
> encoding for those characters that are always going to take atleast
> two-bytes to represent.
>



Any suggestion to store data in utf-16 is likely to be a complete 
non-starter. I suggest you research our previously stated requirements 
for server side encodings.

cheers

andrew



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Fri, Jul 05, 2013 at 03:22:33AM +0200, Karol Trzcionka wrote:
> Hello,
> according to my mentor's suggestion, I send first PoC patch of
> "RETURNING AFTER/BEFORE" statement. Some info:
> - it is early version - more hack PoC than RC patch
> - AFTER in this version works as decribed before but it returns row
> after update but before post-update before (to be fixed or switch with
> current "*" behaviour - I don't know yet)
> - patch passes all regression tests
> - the code is uncommented already, to be fixed
> I'm not sure what may fail so you use it on your risk ;)
> Regards,
> Karol

Karol,

Per discussion in IRC, please follow up with your patch that includes
such documentation, new regression tests, etc., you've written for the
feature.

Thanks! :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Proposal - Support for National Characters functionality

От
Peter Eisentraut
Дата:
On 7/4/13 10:11 PM, Arulappan, Arul Shaji wrote:
> The main aim at the moment is to get some feedback on the above to know
> if this feature is something that would benefit PostgreSQL in general,
> and if users maintaining DBs in non-English speaking regions will find
> this beneficial.

For European languages, I think everyone has moved to using Unicode, so
the demand for supporting multiple encodings is approaching zero.  The
CJK realm might have difference requirements.




Re: Proposal - Support for National Characters functionality

От
Pavel Stehule
Дата:
<p dir="ltr">Yes, what I know almost all use utf8 without problems. Long time I didn't see any request for multi
encodingsupport. <div class="gmail_quote">Dne 5.7.2013 20:28 "Peter Eisentraut" <<a
href="mailto:peter_e@gmx.net">peter_e@gmx.net</a>>napsal(a):<br type="attribution" /><blockquote class="gmail_quote"
style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 7/4/13 10:11 PM, Arulappan, Arul Shaji
wrote:<br/> > The main aim at the moment is to get some feedback on the above to know<br /> > if this feature is
somethingthat would benefit PostgreSQL in general,<br /> > and if users maintaining DBs in non-English speaking
regionswill find<br /> > this beneficial.<br /><br /> For European languages, I think everyone has moved to using
Unicode,so<br /> the demand for supporting multiple encodings is approaching zero.  The<br /> CJK realm might have
differencerequirements.<br /><br /><br /><br /> --<br /> Sent via pgsql-hackers mailing list (<a
href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your
subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></blockquote></div> 

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
Updated patch:
- include sgml
- fix all compiler warnings
- some cleanup
- fix correctness of feature
Regards,
Karol

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
Next version:
- cleanup
- regression test
- fix issue reported by johto (invalid values in parallel transactions)
I would like more feedback and comments about the patch, as some parts
may be too hacky.
In particular, is it a problem that I update a pointer to planSlot? In
my patch, it points to tuple after all updates done between planner and
executor (in fact it is not planSlot right now). I don't know whether
the tuple could be deleted in the intervening time and if the pointer
doesn't point to "unreserved" memory (I mean - memory which may be
overwritten by something meanwhile).
Regards,
Karol

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Sat, Jul 13, 2013 at 12:49:45AM +0200, Karol Trzcionka wrote:
> Next version:
> - cleanup
> - regression test
> - fix issue reported by johto (invalid values in parallel transactions)
> I would like more feedback and comments about the patch, as some parts
> may be too hacky.
> In particular, is it a problem that I update a pointer to planSlot? In
> my patch, it points to tuple after all updates done between planner and
> executor (in fact it is not planSlot right now). I don't know whether
> the tuple could be deleted in the intervening time and if the pointer
> doesn't point to "unreserved" memory (I mean - memory which may be
> overwritten by something meanwhile).

Thanks for the updated patch!

Anybody care to look this over for vulnerabilities as described above?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Proposal - Support for National Characters functionality

От
Robert Haas
Дата:
On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Yes, what I know almost all use utf8 without problems. Long time I didn't
> see any request for multi encoding support.

Well, not *everything* can be represented as UTF-8; I think this is
particularly an issue with Asian languages.

If we chose to do it, I think that per-column encoding support would
end up looking a lot like per-column collation support: it would be
yet another per-column property along with typoid, typmod, and
typcollation.  I'm not entirely sure it's worth it, although FWIW I do
believe Oracle has something like this.  At any rate, it seems like
quite a lot of work.

Another idea would be to do something like what we do for range types
- i.e. allow a user to declare a type that is a differently-encoded
version of some base type.  But even that seems pretty hard.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
>
> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule
<pavel.stehule@gmail.com> wrote:
> > Yes, what I know almost all use utf8 without problems. Long time I
> > didn't see any request for multi encoding support.
>
> Well, not *everything* can be represented as UTF-8; I think this is
> particularly an issue with Asian languages.
>
> If we chose to do it, I think that per-column encoding support would
end up
> looking a lot like per-column collation support: it would be yet
another per-
> column property along with typoid, typmod, and typcollation.  I'm not
entirely
> sure it's worth it, although FWIW I do believe Oracle has something
like this.

Yes, the idea is that users will be able to declare columns of type
NCHAR or NVARCHAR which will use the pre-determined encoding type. If we
say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding
irrespective of the database encoding. It will be up to us to restrict
what Unicode encodings we want to support for NCHAR/NVARCHAR columns.
This is based on my interpretation of the SQL standard. As you allude to
above, Oracle has a similar behaviour (they support UTF-16 as well).

Support for UTF-16 will be difficult without linking with some external
libraries such as ICU.


> At any rate, it seems like quite a lot of work.

Thanks for putting my mind at ease ;-)

Rgds,
Arul Shaji


>
> Another idea would be to do something like what we do for range types
> - i.e. allow a user to declare a type that is a differently-encoded
version of
> some base type.  But even that seems pretty hard.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
Company





Re: Proposal - Support for National Characters functionality

От
Peter Geoghegan
Дата:
On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Yes, what I know almost all use utf8 without problems. Long time I didn't
>> see any request for multi encoding support.
>
> Well, not *everything* can be represented as UTF-8; I think this is
> particularly an issue with Asian languages.

What cannot be represented as UTF-8? UTF-8 can represent every
character in the Unicode character set, whereas UTF-16 can encode
characters 0 to 0x10FFFF.

Does support for alternative multi-byte encodings have something to do
with the Han unification controversy? I don't know terribly much about
this, so apologies if that's just wrong.

-- 
Peter Geoghegan



Re: Proposal - Support for National Characters functionality

От
Tatsuo Ishii
Дата:
>> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule
> <pavel.stehule@gmail.com> wrote:
>> > Yes, what I know almost all use utf8 without problems. Long time I
>> > didn't see any request for multi encoding support.
>> 
>> Well, not *everything* can be represented as UTF-8; I think this is
>> particularly an issue with Asian languages.
>> 
>> If we chose to do it, I think that per-column encoding support would
> end up
>> looking a lot like per-column collation support: it would be yet
> another per-
>> column property along with typoid, typmod, and typcollation.  I'm not
> entirely
>> sure it's worth it, although FWIW I do believe Oracle has something
> like this.
> 
> Yes, the idea is that users will be able to declare columns of type
> NCHAR or NVARCHAR which will use the pre-determined encoding type. If we
> say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding
> irrespective of the database encoding. It will be up to us to restrict
> what Unicode encodings we want to support for NCHAR/NVARCHAR columns.
> This is based on my interpretation of the SQL standard. As you allude to
> above, Oracle has a similar behaviour (they support UTF-16 as well).  
> 
> Support for UTF-16 will be difficult without linking with some external
> libraries such as ICU. 

Can you please elaborate more on this? Why do you exactly need ICU?

Also I don't understand why you need UTF-16 support as a database
encoding because UTF-8 and UTF-16 are logically equivalent, they are
just different represention (encoding) of Unicode. That means if we
already support UTF-8 (I'm sure we already do), there's no particular
reason we need to add UTF-16 support.

Maybe you just want to support UTF-16 as a client encoding?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Proposal - Support for National Characters functionality

От
Peter Geoghegan
Дата:
On Mon, Jul 15, 2013 at 8:58 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> Also I don't understand why you need UTF-16 support as a database
> encoding because UTF-8 and UTF-16 are logically equivalent, they are
> just different represention (encoding) of Unicode. That means if we
> already support UTF-8 (I'm sure we already do), there's no particular
> reason we need to add UTF-16 support.

To be fair, there is a small reason to support UTF-16 even with UTF-8
available. I personally do not find it compelling, but perhaps I am
not best placed to judge such things. As Wikipedia says on the the
English UTF-8 article:

"Characters U+0800 through U+FFFF use three bytes in UTF-8, but only
two in UTF-16. As a result, text in (for example) Chinese, Japanese or
Hindi could take more space in UTF-8 if there are more of these
characters than there are ASCII characters. This happens for pure text
but rarely for HTML documents. For example, both the Japanese UTF-8
and the Hindi Unicode articles on Wikipedia take more space in UTF-16
than in UTF-8."

This is the only advantage of UTF-16 over UTF-8 as a server encoding.
I'm inclined to take the fact that there has been so few (no?)
complaints from PostgreSQL's large Japanese user-base about the lack
of UTF-16 support as suggesting that that isn't considered to be a
compelling feature in the CJK realm.

-- 
Peter Geoghegan



Re: Proposal - Support for National Characters functionality

От
Tatsuo Ishii
Дата:
> On Mon, Jul 15, 2013 at 4:37 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Jul 5, 2013 at 2:35 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Yes, what I know almost all use utf8 without problems. Long time I didn't
>>> see any request for multi encoding support.
>>
>> Well, not *everything* can be represented as UTF-8; I think this is
>> particularly an issue with Asian languages.
> 
> What cannot be represented as UTF-8? UTF-8 can represent every
> character in the Unicode character set, whereas UTF-16 can encode
> characters 0 to 0x10FFFF.
> 
> Does support for alternative multi-byte encodings have something to do
> with the Han unification controversy? I don't know terribly much about
> this, so apologies if that's just wrong.

There's a famous problem regarding conversion between Unicode and other
encodings, such as Shift Jis.

There are lots of discussion on this. Here is the one from Microsoft:

http://support.microsoft.com/kb/170559/EN-US
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Proposal - Support for National Characters functionality

От
Peter Eisentraut
Дата:
On 7/15/13 1:26 AM, Arulappan, Arul Shaji wrote:
> Yes, the idea is that users will be able to declare columns of type
> NCHAR or NVARCHAR which will use the pre-determined encoding type. If we
> say that NCHAR is UTF-8 then the NCHAR column will be of UTF-8 encoding
> irrespective of the database encoding. It will be up to us to restrict
> what Unicode encodings we want to support for NCHAR/NVARCHAR columns.

I would try implementing this as an extension at first, with a new data
type that is internally encoded differently.  We have citext as
precedent for successfully implementing text-like data types in user space.




Re: Proposal - Support for National Characters functionality

От
Martijn van Oosterhout
Дата:
On Mon, Jul 15, 2013 at 05:11:40PM +0900, Tatsuo Ishii wrote:
> > Does support for alternative multi-byte encodings have something to do
> > with the Han unification controversy? I don't know terribly much about
> > this, so apologies if that's just wrong.
>
> There's a famous problem regarding conversion between Unicode and other
> encodings, such as Shift Jis.
>
> There are lots of discussion on this. Here is the one from Microsoft:
>
> http://support.microsoft.com/kb/170559/EN-US

Apart from Shift-JIS not being a well defined (it's more a family of
encodings) it has the unusual feature of providing multiple ways to
encode the same character.  This is not even a Han unification issue,
they have largely been addressed.  For example, the square-root symbol
exists twice (0x8795 and 0x81E3) and many other mathmatical symbols
also.

Here's the code page which you can browse online:

http://msdn.microsoft.com/en-us/goglobal/cc305152

Which means to be round-trippable Unicode would have to double those
characters, but this would make it hard/impossible to round-trip with
any other character set that had those characters.  No easy solution
here.

Something that has been done before [1] is to map the doubles to the
custom area of the unicode space (0xe000-0xffff).  It gives you
round-trip support at the expense of having to handle those characters
yourself.  But since postgres doesn't do anything meaningful with
unicode characters this might be acceptable.

[1] Python does a similar trick to handle filenames coming from disk in
an unknown encoding:
http://docs.python.org/3/howto/unicode.html#files-in-an-unknown-encoding

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
New version:
- fix returning "after" values if there are not "before"
- add more regression tests
I'd like to hear/read any feedback ;)
Regards,
Karol

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
I've noticed problem with "UPDATE ... FROM" statement. Fix in new version.
Regards,
Karol

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Mon, Jul 22, 2013 at 09:52:14PM +0200, Karol Trzcionka wrote:
> I've noticed problem with "UPDATE ... FROM" statement. Fix in new version.
> Regards,
> Karol

What problem or problems did you notice, and what did you change to
fix them?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 23.07.2013 06:22, David Fetter pisze:
> What problem or problems did you notice, and what did you change to
> fix them?
"UPDATE ... FROM" generated "ERROR:  variable not found in subplan
target lists". I've added some workaround in add_vars_to_targetlist:
- if it is an "after" - simple change var->varno to base RTE (it should
always exists, the value is temporary, it will change to OUTER_VAR)
- if it is a "before" - add to targetlist new var independently from
rel->attr_needed[attno].
Additionally I've change build_joinrel_tlist to ignore any "BEFORE"
RTEs. The regression tests are updated.
Regards,
Karol



Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii@postgresql.org]
>
>
> Also I don't understand why you need UTF-16 support as a database
encoding
> because UTF-8 and UTF-16 are logically equivalent, they are just
different
> represention (encoding) of Unicode. That means if we already support
UTF-8
> (I'm sure we already do), there's no particular reason we need to add
UTF-16
> support.
>
> Maybe you just want to support UTF-16 as a client encoding?

Given below is a design draft for this functionality:

Core new functionality (new code):
1)Create and register independent NCHAR/NVARCHAR/NTEXT data types.

2)Provide support for the new GUC nchar_collation to provide the
database with information about the default collation that needs to be
used for the new data types.

3)Create encoding conversion subroutines to convert strings between the
database encoding and UTF8 (from national strings to regular strings and
back).
PostgreSQL already have all required support (used for conversion
between the database encoding and client_encoding), so amount of the new
code will be minimal there.

4)Because all symbols from non-UTF8 encodings could be represented as
UTF8 (but the reverse is not true) comparison between N* types and the
regular string types inside database will be performed in UTF8 form. To
achieve this feature the new IMPLICIT casts may need to be created:
NCHAR -> CHAR
NVARCHAR -> VARCHAR
NTEXT -> TEXT.

Casting in the reverse direction will be available too but only as
EXPLICIT.
However, these casts could fail if national strings could not be
represented in the used database encoding.

All these casts will use subroutines created in 3).

Casting/conversion between N* types will follow the same rules/mechanics
as used for casting/conversion between usual (CHAR(N)/VARCHAR(N)/TEXT)
string types.


5)Comparison between NATIONAL string values will be performed via
specialized UTF8 optimized functions (with respect of the
nchar_collation setting).


6)Client input/output of NATIONAL strings - NATIONAL strings will
respect the client_encoding setting, and their values will be
transparently converted to the requested client_encoding before
sending(receiving) to client (the same mechanics as used for usual
string types).
So no mixed encoding in client input/output will be supported/available.

7)Create set of the regression tests for these new data types.


Additional changes:
1)ECPG support for these new types
2) Support in the database drivers for the data types.

Rgds,
Arul Shaji




> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp





Re: Proposal - Support for National Characters functionality

От
Tom Lane
Дата:
"Arulappan, Arul Shaji" <arul@fast.au.fujitsu.com> writes:
> Given below is a design draft for this functionality:

> Core new functionality (new code):
> 1)Create and register independent NCHAR/NVARCHAR/NTEXT data types.

> 2)Provide support for the new GUC nchar_collation to provide the
> database with information about the default collation that needs to be
> used for the new data types.

A GUC seems like completely the wrong tack to be taking.  In the first
place, that would mandate just one value (at a time anyway) of
collation, which is surely not much of an advance over what's already
possible.  In the second place, what happens if you change the value?
All your indexes on nchar columns are corrupt, that's what.  Actually
the data itself would be corrupt, if you intend that this setting
determines the encoding and not just the collation.  If you really are
speaking only of collation, it's not clear to me exactly what this
proposal offers that can't be achieved today (with greater security,
functionality and spec compliance) by using COLLATE clauses on plain
text columns.

Actually, you really haven't answered at all what it is you want to do
that COLLATE can't do.

> 4)Because all symbols from non-UTF8 encodings could be represented as
> UTF8 (but the reverse is not true) comparison between N* types and the
> regular string types inside database will be performed in UTF8 form.

I believe that in some Far Eastern character sets there are some
characters that map to the same Unicode glyph, but that some people
would prefer to keep separate.  So transcoding to UTF8 isn't necessarily
lossless.  This is one of the reasons why we've resisted adopting ICU or
standardizing on UTF8 as the One True Database Encoding.  Now this may
or may not matter for comparison to strings that were in some other
encoding to start with --- but as soon as you base your design on the
premise that UTF8 is a universal encoding, you are sliding down a
slippery slope to a design that will meet resistance.

> 6)Client input/output of NATIONAL strings - NATIONAL strings will
> respect the client_encoding setting, and their values will be
> transparently converted to the requested client_encoding before
> sending(receiving) to client (the same mechanics as used for usual
> string types).
> So no mixed encoding in client input/output will be supported/available.

If you have this restriction, then I'm really failing to see what
benefit there is over what can be done today with COLLATE.
        regards, tom lane



Re: Proposal - Support for National Characters functionality

От
"Boguk, Maksym"
Дата:
Hi everyone,

I will try answer on all questions related to proposed National
Characters support.

>> 2)Provide support for the new GUC nchar_collation to provide the
>> database with information about the default collation that needs to
be
>> used for the new data types.

>A GUC seems like completely the wrong tack to be taking.  In the first
place, that would mandate just one value (at a time anyway) of
collation, which is surely not much of an advance over what's already
possible.  In the second place, what happens if you change the value?
>All your indexes on nchar columns are corrupt, that's what.  Actually
the data itself would be corrupt, if you intend that this setting
determines the encoding and not just the collation.  If you really are
speaking only of collation, it's not clear to me exactly what this
proposal offers that can't be achieved today (with greater security,
>functionality and spec compliance) by using COLLATE clauses on plain
text columns.
>Actually, you really haven't answered at all what it is you want to do
that COLLATE can't do.

I think I give a wrong description there... it will be not GUC but
GUC-type value which will be initialized during CREATE DATABASE and will
be read only after, very similar to the lc_collate.
So I think name  national_lc_collate will be better.
Function of this value - provide information about the default collation
for the NATIONAL CHARACTERS inside the database.
That's not limits user ability of use an alternative collation for
NATIONAL CHARACTERS during create table via COLLATE keyword.

E.g. if we have second encoding inside the database - we should have
information about used collation somewhere.

>> 4)Because all symbols from non-UTF8 encodings could be represented as
>> UTF8 (but the reverse is not true) comparison between N* types and
the
>> regular string types inside database will be performed in UTF8 form.

>I believe that in some Far Eastern character sets there are some
characters that map to the same Unicode glyph, but that some people
would prefer to keep separate.  So transcoding to UTF8 isn't necessarily
lossless.  This is one of the reasons why we've resisted adopting ICU or
standardizing on UTF8 as the One True Database Encoding.  >Now this may
or may not matter for comparison to strings that were in some other
encoding to start with --- but as soon as you base your design on the
premise that UTF8 is a universal encoding, you are sliding down a
slippery slope to a design that will meet resistance.

Will the conversion of both sides to the pg_wchar before comparison fix
this problem?
Anyway, if the database going to use more than one encoding, a some
universal encoding should be used to allow comparison between them.
After some analyse I think pg_wchar is better candidate to this role
than UTF8.

>> 6)Client input/output of NATIONAL strings - NATIONAL strings will
>> respect the client_encoding setting, and their values will be
>> transparently converted to the requested client_encoding before
>> sending(receiving) to client (the same mechanics as used for usual
>> string types).
>> So no mixed encoding in client input/output will be
supported/available.

>If you have this restriction, then I'm really failing to see what
benefit there is over what can be done today with COLLATE.

There are two targets for this project:

1. Legacy database with non-utf8 encoding, which should support old
non-utf8 applications and new UTF8 applications.
In that case the old applications will use the legacy database encoding
(and because these applications are legacy they doesn't work with new
NATIONAL CHARACTERS data/tables).
And the new applications will use client-side UTF8 encoding and they
will be able store international texts in NATIONAL CHARACTER columns.
Dump/restore of the whole database to change the database encoding to
UTF8 not always possible, so there necessity of the some easy to use
workaround.

2.Better compatibility with the ANSI SQL standard.


Kind Regards,
Maksym




Re: Proposal - Support for National Characters functionality

От
Alvaro Herrera
Дата:
Boguk, Maksym escribió:

> I think I give a wrong description there... it will be not GUC but
> GUC-type value which will be initialized during CREATE DATABASE and will
> be read only after, very similar to the lc_collate.
> So I think name  national_lc_collate will be better.
> Function of this value - provide information about the default collation
> for the NATIONAL CHARACTERS inside the database.
> That's not limits user ability of use an alternative collation for
> NATIONAL CHARACTERS during create table via COLLATE keyword.

This seems a bit odd.  I mean, if I want the option for differing
encodings, surely I need to be able to set them for each column, not at
the database level.

Also, as far as I understand what we want to control here is the
encoding that the strings are in (the mapping of bytes to characters),
not the collation (the way a set of strings are ordered).  So it doesn't
make sense to set the NATIONAL CHARACTER option using the COLLATE
keyword.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Proposal - Support for National Characters functionality

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Also, as far as I understand what we want to control here is the
> encoding that the strings are in (the mapping of bytes to characters),
> not the collation (the way a set of strings are ordered).  So it doesn't
> make sense to set the NATIONAL CHARACTER option using the COLLATE
> keyword.

My thought is that we should simply ignore the NATIONAL CHARACTER syntax,
which is not the first nor the last brain-damaged feature design in the
SQL standard.  It's basically useless for what we want because there's
noplace to specify which encoding you mean.  Instead, let's consider that
COLLATE can define not only the collation but also the encoding of a
string datum.  Contrary to what I think you meant above, that seems
perfectly sensible to me, because after all a collation is necessarily a
bunch of rules about how to order a particular set of characters.  If the
data representation you use is unable to represent that set of characters,
it's not a very meaningful combination is it?

There's still the problem of how do you get a string of a nondefault
encoding into the database in the first place.  If you have to convert
to DB encoding to get it in there, then what's the use of a further
conversion?  This consideration may well kill the whole concept.
(It certainly kills NATIONAL CHARACTER syntax just as much.)
        regards, tom lane



Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
> From: Alvaro Herrera [mailto:alvherre@2ndquadrant.com]
>
> Boguk, Maksym escribió:
>
> > I think I give a wrong description there... it will be not GUC but
> > GUC-type value which will be initialized during CREATE DATABASE and
> > will be read only after, very similar to the lc_collate.
> > So I think name  national_lc_collate will be better.
> > Function of this value - provide information about the default
> > collation for the NATIONAL CHARACTERS inside the database.
> > That's not limits user ability of use an alternative collation for
> > NATIONAL CHARACTERS during create table via COLLATE keyword.
>
> This seems a bit odd.  I mean, if I want the option for differing encodings,
> surely I need to be able to set them for each column, not at the database
> level.
>
> Also, as far as I understand what we want to control here is the encoding that
> the strings are in (the mapping of bytes to characters), not the collation

Yes, that is our idea too. For the sql syntax

Create table tbl1 (col1 nchar);

What should be the encoding and collation for col1? Because the idea is to have them in separate encoding and collation
(ifneeded) from that of the rest of the table.  We have options of  

a) Having guc variables that will determine the default encoding and collation for nchar/nvarchar columns. Note that
thecollate variable is default only. Users can still override them per column. 
b) Having the encoding name and collation as part of the syntax. For ex., (col1 nchar encoding UTF-8 COLLATE "C").
Ugly,but..... 
c) Be rigid and say nchar/nvarchar columns are by default UTF-8 (or something else). One cannot change the default. But
theycan override it when declaring the column by having a syntax similar to (b) 


Rgds,
Arul Shaji



> (the way a set of strings are ordered).  So it doesn't make sense to set the
> NATIONAL CHARACTER option using the COLLATE keyword.




>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services





Re: Proposal - Support for National Characters functionality

От
"Arulappan, Arul Shaji"
Дата:
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Also, as far as I understand what we want to control here is the
> > encoding that the strings are in (the mapping of bytes to
characters),
> > not the collation (the way a set of strings are ordered).  So it
> > doesn't make sense to set the NATIONAL CHARACTER option using the
> > COLLATE keyword.
>
> My thought is that we should simply ignore the NATIONAL CHARACTER
syntax,
> which is not the first nor the last brain-damaged feature design in
the SQL
> standard.  It's basically useless for what we want because there's
noplace to
> specify which encoding you mean.  Instead, let's consider that COLLATE
can
> define not only the collation but also the encoding of a string datum.

Yes, don't have a problem with this. If I understand you correctly, this
will be simpler syntax wise, but still get nchar/nvarchar data types
into a table, in different encoding from the rest of the table.

>
> There's still the problem of how do you get a string of a nondefault
encoding
> into the database in the first place.

Yes, that is the bulk of the work. Will need change in a whole lot of
places.

Is a step-by-step approach worth exploring ? Something similar to:

Step 1: Support nchar/nvarchar data types. Restrict them only to UTF-8
databases to begin with.
Step 2: Support multiple encodings in a database. Remove the restriction
imposed in step1.


Rgds,
Arul Shaji





Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
Hi,

mini-review follows.


2013-07-22 21:52 keltezéssel, Karol Trzcionka írta:
> I've noticed problem with "UPDATE ... FROM" statement. Fix in new version.
> Regards,
> Karol

* Does it apply cleanly to the current git master?

No. There's a reject in src/backend/optimizer/plan/initsplan.c

* Does it include reasonable tests?

Yes but the test fails after trying to fix the rejected chunk of the patch.

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze:
> * Does it apply cleanly to the current git master?
>
> No. There's a reject in src/backend/optimizer/plan/initsplan.c
Thank you, merged in attached version.
>
> * Does it include reasonable tests?
>
> Yes but the test fails after trying to fix the rejected chunk of the
> patch.
It fails because the "HINT" was changed, fixed.
That version merges some nested "ifs" left over from earlier work.
Regards,
Karol Trzcionka

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
2013-08-19 21:21 keltezéssel, Karol Trzcionka írta:
> W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze:
>> * Does it apply cleanly to the current git master?
>>
>> No. There's a reject in src/backend/optimizer/plan/initsplan.c
> Thank you, merged in attached version.
>> * Does it include reasonable tests?
>>
>> Yes but the test fails after trying to fix the rejected chunk of the
>> patch.
> It fails because the "HINT" was changed, fixed.
> That version merges some nested "ifs" left over from earlier work.

I tried to compile your v5 patch and I got:

initsplan.c: In function ‘add_vars_to_targetlist’:
initsplan.c:216:26: warning: ‘rel’ may be used uninitialized in this function 
[-Wmaybe-uninitialized]
rel->reltargetlist = lappend(rel->reltargetlist,
^

You shouldn't change the assignment at declaration:

- RelOptInfo *rel = find_base_rel(root, var->varno);
+ RelOptInfo *rel;
...
+ if (root->parse->commandType == CMD_UPDATE)
+ {
... (code using rel)
+ }
+ rel = find_base_rel(root, varno);

Best regards,
Zoltán Böszörményi

> Regards,
> Karol Trzcionka


-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
Hi,

2013-08-19 21:52 keltezéssel, Boszormenyi Zoltan írta:
> 2013-08-19 21:21 keltezéssel, Karol Trzcionka írta:
>> W dniu 19.08.2013 19:56, Boszormenyi Zoltan pisze:
>>> * Does it apply cleanly to the current git master?
>>>
>>> No. There's a reject in src/backend/optimizer/plan/initsplan.c
>> Thank you, merged in attached version.
>>> * Does it include reasonable tests?
>>>
>>> Yes but the test fails after trying to fix the rejected chunk of the
>>> patch.
>> It fails because the "HINT" was changed, fixed.
>> That version merges some nested "ifs" left over from earlier work.
>
> I tried to compile your v5 patch and I got:
>
> initsplan.c: In function ‘add_vars_to_targetlist’:
> initsplan.c:216:26: warning: ‘rel’ may be used uninitialized in this function
> [-Wmaybe-uninitialized]
> rel->reltargetlist = lappend(rel->reltargetlist,
> ^
>
> You shouldn't change the assignment at declaration:
>
> - RelOptInfo *rel = find_base_rel(root, var->varno);
> + RelOptInfo *rel;
> ...
> + if (root->parse->commandType == CMD_UPDATE)
> + {
> ... (code using rel)
> + }
> + rel = find_base_rel(root, varno);

Let me say it again: the new code in initsplan.c::add_vars_to_targetlist() is fishy.
The compiler says that "rel" used on line 216 may be uninitialized.

Keeping it that way passes "make check", perhaps "rel" was initialized
in a previous iteration of "foreach(temp, vars)", possibly in the
     else if (IsA(node, PlaceHolderVar))
branch, which means that "PlaceHolderInfo *phinfo" may be interpreted
as RelOptInfo *, stomping on memory.

Moving the assignment back to the declaration makes "make check"
fail with the attached regression.diffs file.

Initializing it as "RelOptInfo *rel = NULL;" makes the regression check
die with a segfault, obviously.

Change the code to avoid the warning and still produce the wanted effect.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
Thank you for the review and tests. New version introduce a lot of
improvements:
- Fix regression test for view (wrong table_name)
- Add regression test for inheritance
- Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the
uninitialized issue
- Revert changing varno in add_vars_to_targetlist
- Add all "before" variables to targetlist
- Avoid adding variables to slot for AFTER.
- Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE
- All before/after are now set on OUTER_VAR
- Rename fix_varno_varattno to bind_returning_variables
- Add comment about bind_returning_variables
- Remove unneeded code in fix_join_expr_mutator (it was changing varno
        of RTE_BEFORE - now there is not any var with varno assigned to it)
Regards,
Karol Trzcionka

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 20.08.2013 16:47, Karol Trzcionka pisze:
> Thank you for the review and tests. New version introduce a lot of
> improvements:
> - Fix regression test for view (wrong table_name)
> - Add regression test for inheritance
> - Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the
> uninitialized issue
> - Revert changing varno in add_vars_to_targetlist
> - Add all "before" variables to targetlist
> - Avoid adding variables to slot for AFTER.
> - Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE
> - All before/after are now set on OUTER_VAR
> - Rename fix_varno_varattno to bind_returning_variables
> - Add comment about bind_returning_variables
> - Remove unneeded code in fix_join_expr_mutator (it was changing varno
>         of RTE_BEFORE - now there is not any var with varno assigned to it)
I've just realized the prepare_returning_before() is unneeded right now
so I've removed it. Version 7, hopefully the last. ;)
Regards,
Karol Trzcionka

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
<div class="moz-cite-prefix">2013-08-20 17:30 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote
cite="mid:52138BA4.1060600@gmail.com"type="cite"><pre wrap="">W dniu 20.08.2013 16:47, Karol Trzcionka pisze:
 
</pre><blockquote type="cite"><pre wrap="">Thank you for the review and tests. New version introduce a lot of
improvements:
- Fix regression test for view (wrong table_name)
- Add regression test for inheritance
- Delete hack in initsplan.c (now we ignore all RTE_BEFORE) - the
uninitialized issue
- Revert changing varno in add_vars_to_targetlist
- Add all "before" variables to targetlist
- Avoid adding variables to slot for AFTER.
- Treat varnoold like a flag - prevent from adjustment if RTE_BEFORE
- All before/after are now set on OUTER_VAR
- Rename fix_varno_varattno to bind_returning_variables
- Add comment about bind_returning_variables
- Remove unneeded code in fix_join_expr_mutator (it was changing varno       of RTE_BEFORE - now there is not any var
withvarno assigned to it)
 
</pre></blockquote><pre wrap="">I've just realized the prepare_returning_before() is unneeded right now
so I've removed it. Version 7, hopefully the last. ;)</pre></blockquote><br /> Here's a new one, for v7:<br /><br />
setrefs.c:In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in
thisfunction [-Wmaybe-uninitialized]<br />   bind_returning_variables(rlist, before_index, after_index);<br />
                         ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br />   int after_index=0,
before_index;<br/>                      ^<br /><br /> Best regards,<br /> Zoltán Böszörményi<br /><br /><blockquote
cite="mid:52138BA4.1060600@gmail.com"type="cite"><pre wrap="">
 
Regards,
Karol Trzcionka
</pre><br /><fieldset class="mimeAttachmentHeader"></fieldset><br /><pre wrap="">
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:
Here's a new one, for v7:

setrefs.c: In function ‘set_plan_refs’:
setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  bind_returning_variables(rlist, before_index, after_index);
                          ^
setrefs.c:1957:21: note: ‘before_index’ was declared here
  int after_index=0, before_index;
                     ^
Right, my mistake. Sorry and thanks. Fixed.
Regards,
Karol Trzcionka
Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
<div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote
cite="mid:5213BE1D.1000705@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 20.08.2013 20:55, Boszormenyi
Zoltanpisze:<br /></div><blockquote cite="mid:5213BB9D.4020204@cybertec.at" type="cite"> Here's a new one, for v7:<br
/><br/> setrefs.c: In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used
uninitializedin this function [-Wmaybe-uninitialized]<br />   bind_returning_variables(rlist, before_index,
after_index);<br/>                           ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br />  
intafter_index=0, before_index;<br />                      ^<br /></blockquote> Right, my mistake. Sorry and thanks.
Fixed.<br/> Regards,<br /> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br
/>* Is the patch in a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it
applycleanly to the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc
patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature.
However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in
runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch
fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br />
doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what
it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have
it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a
PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the
community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump
support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> *
Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br />
differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4,
t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT
01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1
(i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br />  id | i1 | i2 | t1 | t2
<br/> ----+----+----+----+----<br />   1 |  1 |  1 | a  | a<br />   2 |  2 |  2 | b  | b<br />   3 |  3 |  3 | c  |
c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id =
2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br />  i1 | i1 | i2 | i2
|t1 | t1 | t2 | t2  <br /> ----+----+----+----+----+----+----+-----<br />   2 |  2 |  2 |  4 | b  | b  | b  | bx2<br />
(1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where
id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 |
t1| t2 | t1  | t2  <br /> ----+----+----+----+----+----+-----+-----<br />   3 |  3 |  9 |  6 | c  | c  | cx3 | cx2<br
/>(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br
/>* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any
assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br
/><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things?
<br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the
src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In
b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int
aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist,
intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration
isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br
/>+extern void addAliases(ParseState *pstate);<br />  <br /> +void addAliases(ParseState *pstate)<br /><br /> This
externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In
setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are
spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> +                      
var=(Var*)tle;<br/> and<br /> +       int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()"
statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br />
+      char    *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed
here,too:<br /> +       for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you
reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead,
itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> +       Var *var = NULL;<br
/>+       foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there
portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are
thecomments sufficient and accurate?<br /><br /><br /><br /> * Does it do what it says, correctly?<br /><br /> Yes.<br
/><br/> * Does it produce compiler warnings?<br /><br /> No.<br /><br /> * Can you make it crash?<br /><br /> No.<br
/><br/> * Is everything done in a way that fits together coherently with other features/modules?<br /><br /> I think
so,mostly. Comments below.<br /><br /> * Are there interdependencies that can cause problems?<br /><br /> I don't think
so.<br/><br /> Other comments:<br /><br /> This #define in pg_class:<br /><br /> diff --git
a/src/include/catalog/pg_class.hb/src/include/catalog/pg_class.h<br /> index 49c4f6f..1b09994 100644<br /> ---
a/src/include/catalog/pg_class.h<br/> +++ b/src/include/catalog/pg_class.h<br /> @@ -154,6 +154,7 @@ DESCR("");<br />
 #define                 RELKIND_COMPOSITE_TYPE  'c'           /* composite type */<br />  #define                 
RELKIND_FOREIGN_TABLE  'f'           /* foreign table */<br />  #define                 
RELKIND_MATVIEW                'm'           /* materialized view */<br /> +#define                 
RELKIND_BEFORE                 'b'           /* virtual table for before/after statements */<br />  <br />
 #define                 RELPERSISTENCE_PERMANENT      'p'             /* regular table */<br />
 #define                 RELPERSISTENCE_UNLOGGED       'u'             /* unlogged permanent table */<br /><br /> The
"RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be
modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br
/><br/> +               if (rte->relkind == RELKIND_BEFORE)<br /> +                       continue;<br /><br />
shouldbecome<br /><br /> +               if (rte->relkind == RELKIND_RELATION && rte->rtekind ==
RTE_BEFORE)<br/> +                       continue;<br /><br /> Speaking of which, RTE_BEFORE is more properly named<br
/>RTE_RETURNING_ALIAS or something like that because it<br /> covers both "before" and "after". Someone may have a
better<br/> idea for naming this symbol.<br /><br /> I feel like I understand what the code does and it looks sane to
me.<br/><br /> One question, though, about this part:<br /><br /> ----------------------------------------<br /> @@
-1900,7+1954,27 @@ set_returning_clause_references(PlannerInfo *root,<br />
                                                               int rtoffset)<br />  {<br />         indexed_tlist
*itlist;<br/> +       int after_index=0, before_index=0;<br /> +       Query      *parse = root->parse;<br />  <br
/>+       ListCell   *rt;<br /> +       RangeTblEntry *bef;<br /> +<br /> +       int index_rel=1;<br /> +<br />
+      foreach(rt,parse->rtable)<br /> +       {<br /> +               bef = (RangeTblEntry *)lfirst(rt);<br />
+              if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind == RTE_BEFORE )<br />
+              {<br /> +                       after_index = index_rel;<br /> +               }<br /> +              
if(strcmp(bef->eref->aliasname,"before")== 0 && bef->rtekind == RTE_BEFORE )<br /> +              
{<br/> +                       before_index = index_rel;<br /> +               }<br /> +               index_rel++;<br
/>+       }<br />         /*<br />          * We can perform the desired Var fixup by abusing the fix_join_expr<br />
        * machinery that formerly handled inner indexscan fixup.  We search the<br /> @@ -1924,6 +1998,7 @@
set_returning_clause_references(PlannerInfo*root,<br />                                                  
resultRelation,<br/>                                                   rtoffset);<br />  <br /> +      
bind_returning_variables(rlist,before_index, after_index);<br />         pfree(itlist);<br />  <br />         return
rlist;<br/> ----------------------------------------<br /><br /> Why is it enough to keep the last before_index and
after_indexvalues?<br /> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"?
Isit an error condition or of them should be fixed?<br /><br /> I think that's all for now.<br /><br /> Best
regards,<br/> Zoltán Böszörményi<br /><br /><pre class="moz-signature" cols="90">-- 
 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
<div class="moz-cite-prefix">2013-08-21 19:17 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote
cite="mid:5214F63F.5050608@cybertec.at"type="cite"><div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-20 21:06
keltezéssel,Karol Trzcionka írta:<br /></div><blockquote cite="mid:5213BE1D.1000705@gmail.com" type="cite"><div
class="moz-cite-prefix">Wdniu 20.08.2013 20:55, Boszormenyi Zoltan pisze:<br /></div><blockquote
cite="mid:5213BB9D.4020204@cybertec.at"type="cite"> Here's a new one, for v7:<br /><br /> setrefs.c: In function
‘set_plan_refs’:<br/> setrefs.c:2001:26: warning: ‘before_index’ may be used uninitialized in this function
[-Wmaybe-uninitialized]<br/>   bind_returning_variables(rlist, before_index, after_index);<br />
                         ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br />   int after_index=0,
before_index;<br/>                      ^<br /></blockquote> Right, my mistake. Sorry and thanks. Fixed.<br />
Regards,<br/> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br /> * Is the
patchin a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it apply
cleanlyto the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc
patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature.
However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in
runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch
fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br />
doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what
it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have
it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a
PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the
community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump
support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> *
Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br />
differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4,
t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT
01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1
(i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br />  id | i1 | i2 | t1 | t2
<br/> ----+----+----+----+----<br />   1 |  1 |  1 | a  | a<br />   2 |  2 |  2 | b  | b<br />   3 |  3 |  3 | c  |
c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id =
2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br />  i1 | i1 | i2 | i2
|t1 | t1 | t2 | t2  <br /> ----+----+----+----+----+----+----+-----<br />   2 |  2 |  2 |  4 | b  | b  | b  | bx2<br />
(1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where
id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 |
t1| t2 | t1  | t2  <br /> ----+----+----+----+----+----+-----+-----<br />   3 |  3 |  9 |  6 | c  | c  | cx3 | cx2<br
/>(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br
/>* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any
assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br
/><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things?
<br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the
src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In
b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int
aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist,
intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration
isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br
/>+extern void addAliases(ParseState *pstate);<br />  <br /> +void addAliases(ParseState *pstate)<br /><br /> This
externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In
setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are
spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> +                      
var=(Var*)tle;<br/> and<br /> +       int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()"
statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br />
+      char    *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed
here,too:<br /> +       for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you
reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead,
itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> +       Var *var = NULL;<br
/>+       foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there
portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are
thecomments sufficient and accurate?<br /></blockquote><br /> There should be more comments, especially regarding<br />
myquestion at the end.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /><br /><br /> *
Doesit do what it says, correctly?<br /><br /> Yes.<br /><br /> * Does it produce compiler warnings?<br /><br /> No.<br
/><br/> * Can you make it crash?<br /><br /> No.<br /><br /> * Is everything done in a way that fits together
coherentlywith other features/modules?<br /><br /> I think so, mostly. Comments below.<br /><br /> * Are there
interdependenciesthat can cause problems?<br /><br /> I don't think so.<br /><br /> Other comments:<br /><br /> This
#definein pg_class:<br /><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br />
index49c4f6f..1b09994 100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br
/>@@ -154,6 +154,7 @@ DESCR("");<br />  #define                  RELKIND_COMPOSITE_TYPE  'c'           /* composite
type*/<br />  #define                  RELKIND_FOREIGN_TABLE   'f'           /* foreign table */<br />
 #define                 RELKIND_MATVIEW                 'm'           /* materialized view */<br />
+#define                 RELKIND_BEFORE                  'b'           /* virtual table for before/after statements
*/<br/>  <br />  #define                  RELPERSISTENCE_PERMANENT      'p'             /* regular table */<br />
 #define                 RELPERSISTENCE_UNLOGGED       'u'             /* unlogged permanent table */<br /><br /> The
"RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be
modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br
/><br/> +               if (rte->relkind == RELKIND_BEFORE)<br /> +                       continue;<br /><br />
shouldbecome<br /><br /> +               if (rte->relkind == RELKIND_RELATION && rte->rtekind ==
RTE_BEFORE)<br/> +                       continue;<br /></blockquote><br /> Thinking about it more,<br />
                   if (rte->rtekind == RTE_BEFORE)<br /> would be enough, as no other kinds of rte's can have
rtekind== RTE_BEFORE.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> Speaking of
which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both
"before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /><br /> I feel like I understand
whatthe code does and it looks sane to me.<br /><br /> One question, though, about this part:<br /><br />
----------------------------------------<br/> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo
*root,<br/>                                                                 int rtoffset)<br />  {<br />        
indexed_tlist*itlist;<br /> +       int after_index=0, before_index=0;<br /> +       Query      *parse =
root->parse;<br/>  <br /> +       ListCell   *rt;<br /> +       RangeTblEntry *bef;<br /> +<br /> +       int
index_rel=1;<br/> +<br /> +       foreach(rt,parse->rtable)<br /> +       {<br /> +               bef =
(RangeTblEntry*)lfirst(rt);<br /> +               if(strcmp(bef->eref->aliasname,"after") == 0 &&
bef->rtekind== RTE_BEFORE )<br /> +               {<br /> +                       after_index = index_rel;<br />
+              }<br /> +               if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind
==RTE_BEFORE )<br /> +               {<br /> +                       before_index = index_rel;<br /> +              
}<br/> +               index_rel++;<br /> +       }<br />         /*<br />          * We can perform the desired Var
fixupby abusing the fix_join_expr<br />          * machinery that formerly handled inner indexscan fixup.  We search
the<br/> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br />
                                                 resultRelation,<br />
                                                 rtoffset);<br />  <br /> +       bind_returning_variables(rlist,
before_index,after_index);<br />         pfree(itlist);<br />  <br />         return rlist;<br />
----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index
values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an
errorcondition or of them should be fixed?<br /><br /> I think that's all for now.<br /><br /> Best regards,<br />
ZoltánBöszörményi<br /><br /><pre class="moz-signature" cols="90">-- 
 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de"
moz-do-not-send="true">http://www.postgresql-support.de</a>   <a class="moz-txt-link-freetext"
href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a>
 
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
<div class="moz-cite-prefix">2013-08-21 20:00 keltezéssel, Boszormenyi Zoltan írta:<br /></div><blockquote
cite="mid:52150052.3010104@cybertec.at"type="cite"><div class="moz-cite-prefix">2013-08-21 19:17 keltezéssel,
BoszormenyiZoltan írta:<br /></div><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><div
class="moz-cite-prefix">Hi,<br/><br /> 2013-08-20 21:06 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote
cite="mid:5213BE1D.1000705@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 20.08.2013 20:55, Boszormenyi
Zoltanpisze:<br /></div><blockquote cite="mid:5213BB9D.4020204@cybertec.at" type="cite"> Here's a new one, for v7:<br
/><br/> setrefs.c: In function ‘set_plan_refs’:<br /> setrefs.c:2001:26: warning: ‘before_index’ may be used
uninitializedin this function [-Wmaybe-uninitialized]<br />   bind_returning_variables(rlist, before_index,
after_index);<br/>                           ^<br /> setrefs.c:1957:21: note: ‘before_index’ was declared here<br />  
intafter_index=0, before_index;<br />                      ^<br /></blockquote> Right, my mistake. Sorry and thanks.
Fixed.<br/> Regards,<br /> Karol Trzcionka<br /></blockquote><br /> With this fixed, a more complete review:<br /><br
/>* Is the patch in a patch format which has context? (eg: context diff format)<br /><br /> Yes.<br /><br /> * Does it
applycleanly to the current git master?<br /><br /> Yes.<br /><br /> * Does it include reasonable tests, necessary doc
patches,etc? <br /><br /> There is a new regression test (returning_before_after.sql) covering<br /> this feature.
However,I think it should be added to the group<br /> where "returning.sql" resides currently. There is a value in
runningit<br /> in parallel to other tests. Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch
fixedsuch a bug already.<br /><br /> doc/src/sgml/ref/update.sgml describes this feature.<br /><br />
doc/src/sgml/dml.sgmlshould also be touched to cover this feature.<br /><br /> * Does the patch actually implement what
it'ssupposed to do?<br /><br /> Yes.<br /><br /> * Do we want that?<br /><br /> Yes.<br /><br /> * Do we already have
it?<br/><br /> No.<br /><br /> * Does it follow SQL spec, or the community-agreed behavior?<br /><br /> RETURNING is a
PostgreSQLextension, so the SQL-spec part<br /> of the question isn't applicable.<br /><br /> It implements the
community-agreedbehaviour, according to<br /> the new regression test coverage.<br /><br /> * Does it include pg_dump
support(if applicable)?<br /><br /> n/a<br /><br /> * Are there dangers?<br /><br /> I don't think so.<br /><br /> *
Haveall the bases been covered?<br /><br /> It seems so. I have also tried mixing before/after columns in<br />
differentorders and the query didn't fail:<br /><br /> zozo=# create table t1 (id serial primary key, i1 int4, i2 int4,
t1text, t2 text);<br /> CREATE TABLE<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (1, 1, 'a', 'a');<br /> INSERT
01<br /> zozo=# insert into t1 (i1, i2, t1, t2) values (2, 2, 'b', 'b');<br /> INSERT 0 1<br /> zozo=# insert into t1
(i1,i2, t1, t2) values (3, 3, 'c', 'c');<br /> INSERT 0 1<br /> zozo=# select * from t1;<br />  id | i1 | i2 | t1 | t2
<br/> ----+----+----+----+----<br />   1 |  1 |  1 | a  | a<br />   2 |  2 |  2 | b  | b<br />   3 |  3 |  3 | c  |
c<br/> (3 rows)<br /><br /> zozo=# begin;<br /> BEGIN<br /> zozo=# update t1 set i2 = i2*2, t2 = t2 || 'x2' where id =
2returning before.i1, after.i1, before.i2, after.i2, before.t1, after.t1, before.t2, after.t2;<br />  i1 | i1 | i2 | i2
|t1 | t1 | t2 | t2  <br /> ----+----+----+----+----+----+----+-----<br />   2 |  2 |  2 |  4 | b  | b  | b  | bx2<br />
(1row)<br /><br /> UPDATE 1<br /> zozo=# update t1 set i1 = i1 * 3, i2 = i2*2, t1 = t1 || 'x3', t2 = t2 || 'x2' where
id= 3 returning before.i1, before.i2, after.i1, after.i2, before.t1, before.t2, after.t1, after.t2; i1 | i2 | i1 | i2 |
t1| t2 | t1  | t2  <br /> ----+----+----+----+----+----+-----+-----<br />   3 |  3 |  9 |  6 | c  | c  | cx3 | cx2<br
/>(1 row)<br /><br /> UPDATE 1<br /><br /><br /><br /> * Does the feature work as advertised?<br /><br /> Yes.<br /><br
/>* Are there corner cases the author has failed to consider?<br /><br /> I don't know.<br /><br /> * Are there any
assertionfailures or crashes?<br /><br /> No.<br /><br /> * Does the patch slow down simple tests?<br /><br /> No.<br
/><br/> * If it claims to improve performance, does it?<br /><br /> n/a<br /><br /> * Does it slow down other things?
<br/><br /> No.<br /><br /> * Does it follow the project coding guidelines?<br /><br /> Mostly.<br /><br /> In the
src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /><br /> In
b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int
aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist,
intbef, int aft)<br /> +{<br /><br /> Strange, but GCC 4.8.1 -Wall doesn't catch it. But the forward<br /> declaration
isnot needed, the function is called only from<br /> later functions.<br /><br /> Similar for parse_clause.c:<br /><br
/>+extern void addAliases(ParseState *pstate);<br />  <br /> +void addAliases(ParseState *pstate)<br /><br /> This
externaldeclaration is not needed since it is already<br /> in src/include/parser/parse_clause.h<br /><br /> In
setrefs.c,bind_returning_variables() I would also rename<br /> the function arguments, so "before" and "after" are
spelledout.<br /> These are not C keywords.<br /><br /> Some assignments, like:<br /><br /> +                      
var=(Var*)tle;<br/> and<br /> +       int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()"
statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br />
+      char    *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed
here,too:<br /> +       for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you
reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead,
itspurpose is obvious enough.<br /><br /> In setrefs.c, bind_returning_variables():<br /> +       Var *var = NULL;<br
/>+       foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /><br /><br /> * Are there
portabilityissues?<br /><br /> No.<br /><br /> * Will it work on Windows/BSD etc?<br /><br /> Yes.<br /><br /> * Are
thecomments sufficient and accurate?<br /></blockquote><br /> There should be more comments, especially regarding<br />
myquestion at the end.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /><br /><br /> *
Doesit do what it says, correctly?<br /><br /> Yes.<br /><br /> * Does it produce compiler warnings?<br /><br /> No.<br
/><br/> * Can you make it crash?<br /><br /> No.<br /><br /> * Is everything done in a way that fits together
coherentlywith other features/modules?<br /><br /> I think so, mostly. Comments below.<br /><br /> * Are there
interdependenciesthat can cause problems?<br /><br /> I don't think so.<br /><br /> Other comments:<br /><br /> This
#definein pg_class:<br /><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br />
index49c4f6f..1b09994 100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br
/>@@ -154,6 +154,7 @@ DESCR("");<br />  #define                  RELKIND_COMPOSITE_TYPE  'c'           /* composite
type*/<br />  #define                  RELKIND_FOREIGN_TABLE   'f'           /* foreign table */<br />
 #define                 RELKIND_MATVIEW                 'm'           /* materialized view */<br />
+#define                 RELKIND_BEFORE                  'b'           /* virtual table for before/after statements
*/<br/>  <br />  #define                  RELPERSISTENCE_PERMANENT      'p'             /* regular table */<br />
 #define                 RELPERSISTENCE_UNLOGGED       'u'             /* unlogged permanent table */<br /><br /> The
"RELKIND_*"values all show up in the pg_class table except<br /> this new one. I don't think pg_class.h should be
modifiedat all.<br /> addAliases() should use RELKIND_RELATION together with<br /> RTE_BEFORE. Then checks like:<br
/><br/> +               if (rte->relkind == RELKIND_BEFORE)<br /> +                       continue;<br /><br />
shouldbecome<br /><br /> +               if (rte->relkind == RELKIND_RELATION && rte->rtekind ==
RTE_BEFORE)<br/> +                       continue;<br /></blockquote><br /> Thinking about it more,<br />
                   if (rte->rtekind == RTE_BEFORE)<br /> would be enough, as no other kinds of rte's can have
rtekind== RTE_BEFORE.<br /><br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> Speaking of
which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both
"before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /><br /> I feel like I understand
whatthe code does and it looks sane to me.<br /><br /> One question, though, about this part:<br /><br />
----------------------------------------<br/> @@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo
*root,<br/>                                                                 int rtoffset)<br />  {<br />        
indexed_tlist*itlist;<br /> +       int after_index=0, before_index=0;<br /> +       Query      *parse =
root->parse;<br/>  <br /> +       ListCell   *rt;<br /> +       RangeTblEntry *bef;<br /> +<br /> +       int
index_rel=1;<br/> +<br /> +       foreach(rt,parse->rtable)<br /> +       {<br /> +               bef =
(RangeTblEntry*)lfirst(rt);<br /> +               if(strcmp(bef->eref->aliasname,"after") == 0 &&
bef->rtekind== RTE_BEFORE )<br /> +               {<br /> +                       after_index = index_rel;<br />
+              }<br /> +               if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind
==RTE_BEFORE )<br /> +               {<br /> +                       before_index = index_rel;<br /> +              
}<br/> +               index_rel++;<br /> +       }<br />         /*<br />          * We can perform the desired Var
fixupby abusing the fix_join_expr<br />          * machinery that formerly handled inner indexscan fixup.  We search
the<br/> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br />
                                                 resultRelation,<br />
                                                 rtoffset);<br />  <br /> +       bind_returning_variables(rlist,
before_index,after_index);<br />         pfree(itlist);<br />  <br />         return rlist;<br />
----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index
values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an
errorcondition or of them should be fixed?<br /></blockquote></blockquote><br /> Since addAliases() only adds a single
onefor each and only for an<br /> UPDATE ... RETURNING query, it is okay. Also, because<br />
set_returning_clause_references()is called separately for each<br /> query with RETURNING. Anyway, a comment before the
new<br/> foreach() loop in set_returning_clause_references() should explain<br /> the fact that only one of each
("before"and "after") can occur for<br /> such a query.<br /><br /> I have just tried this:<br /><br /> before as
(updatet1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 = t2 || 'x3'<br />         where id = 1 returning
before.i1,after.i1, before.i2, after.i2),<br /> after as (update t1 set i1 = i1 * 2, i2 = i2 * 3, t1 = t1 || 'x2', t2 =
t2|| 'x3'<br />         where id = 2 returning before.i1, after.i1, before.i2, after.i2)<br /> select * from (select *
frombefore union all select * from after) as x;<br /><br /> and it gave me the proper results, no crash.<br /><br />
AboutaddAliases():<br /> - it can be moved to parser/analyze.c so it can be static.<br /> - addReturningAliases() may
bea better name for the function.<br /><br /><blockquote cite="mid:52150052.3010104@cybertec.at"
type="cite"><blockquotecite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> I think that's all for now.<br /><br
/>Best regards,<br /> Zoltán Böszörményi<br /><br /><pre class="moz-signature" cols="90">-- 
 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de"
moz-do-not-send="true">http://www.postgresql-support.de</a>   <a class="moz-txt-link-freetext"
href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a>
 
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de"
moz-do-not-send="true">http://www.postgresql-support.de</a>   <a class="moz-txt-link-freetext"
href="http://www.postgresql.at/"moz-do-not-send="true">http://www.postgresql.at/</a>
 
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze:
With this fixed, a more complete review:
Thanks.
There is a new regression test (returning_before_after.sql) covering
this feature. However, I think it should be added to the group
where "returning.sql" resides currently. There is a value in running it
in parallel to other tests. Sometimes a subtle bug is uncovered
because of this and your v2 patch fixed such a bug already.
Modified to work correct in parallel testing
doc/src/sgml/ref/update.sgml describes this feature.

doc/src/sgml/dml.sgml should also be touched to cover this feature.
I don't think so, there is not any info about returning feature, I think it shouldn't be part of my patch.
In the src/test/regress/parallel_schedule contains an extra
new line at the end, it shouldn't.
Fixed

In b/src/backend/optimizer/plan/setrefs.c:

+static void bind_returning_variables(List *rlist, int bef, int aft);

but later it becomes not public:

+ */
+void bind_returning_variables(List *rlist, int bef, int aft)
+{
I've change to static in the definition.

+extern void addAliases(ParseState *pstate);
 
+void addAliases(ParseState *pstate)

This external declaration is not needed since it is already
in src/include/parser/parse_clause.h
Removed.

In setrefs.c, bind_returning_variables() I would also rename
the function arguments, so "before" and "after" are spelled out.
These are not C keywords.
Changed.
Some assignments, like:

+                       var=(Var*)tle;
and
+       int index_rel=1;

in setrefs.c need spaces.

"if()" statements need a space before the "(" and not after.

Add spaces in the {} list in addAliases():
+       char    *aliases[] = {"before","after"};
like this: { "before", "after" }

Spaces are needed here, too:
+       for(i=0 ; i<noal; i++)

This "noal" should be "naliases" or "n_aliases" if you really want
a variable. I would simply use the constant "2" for the two for()
loops in addAliases() instead, its purpose is obvious enough.
Added some whitespaces.
In setrefs.c, bind_returning_variables():
+       Var *var = NULL;
+       foreach(temp, rlist){
Add an empty line after the declaration block.
Added.
Other comments:

This #define in pg_class:

diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index 49c4f6f..1b09994 100644
--- a/src/include/catalog/pg_class.h
+++ b/src/include/catalog/pg_class.h
@@ -154,6 +154,7 @@ DESCR("");
 #define                  RELKIND_COMPOSITE_TYPE  'c'           /* composite type */
 #define                  RELKIND_FOREIGN_TABLE   'f'           /* foreign table */
 #define                  RELKIND_MATVIEW                 'm'           /* materialized view */
+#define                  RELKIND_BEFORE                  'b'           /* virtual table for before/after statements */
 
 #define                  RELPERSISTENCE_PERMANENT      'p'             /* regular table */
 #define                  RELPERSISTENCE_UNLOGGED       'u'             /* unlogged permanent table */

RELKIND_BEFORE removed - it probably left over previous work and/or I needed it because RTE_BEFORE wasn't available (not included?).
Speaking of which, RTE_BEFORE is more properly named
RTE_RETURNING_ALIAS or something like that because it
covers both "before" and "after". Someone may have a better
idea for naming this symbol.
Renamed to RTE_ALIAS - similar to addAliases (not addReturningAliases)
One question, though, about this part:

----------------------------------------
@@ -1900,7 +1954,27 @@ set_returning_clause_references(PlannerInfo *root,
                                                                int rtoffset)
 {
        indexed_tlist *itlist;
+       int after_index=0, before_index=0;
+       Query      *parse = root->parse;
 
+       ListCell   *rt;
+       RangeTblEntry *bef;
+
+       int index_rel=1;
+
+       foreach(rt,parse->rtable)
+       {
+               bef = (RangeTblEntry *)lfirst(rt);
+               if(strcmp(bef->eref->aliasname,"after") == 0 && bef->rtekind == RTE_BEFORE )
+               {
+                       after_index = index_rel;
+               }
+               if(strcmp(bef->eref->aliasname,"before") == 0 && bef->rtekind == RTE_BEFORE )
+               {
+                       before_index = index_rel;
+               }
+               index_rel++;
+       }
        /*
         * We can perform the desired Var fixup by abusing the fix_join_expr
         * machinery that formerly handled inner indexscan fixup.  We search the
@@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,
                                                  resultRelation,
                                                  rtoffset);
 
+       bind_returning_variables(rlist, before_index, after_index);
        pfree(itlist);
 
        return rlist;
----------------------------------------

Why is it enough to keep the last before_index and after_index values?
What if there are more than one matching RangeTblEntry for "before"
and/or for "after"? Is it an error condition or of them should be fixed?
I think it is safe, it is the first and the last index. On each level of statement there can be (at most) the only one "before" and one "after" alias.
Regards,
Karol Trzcionka
Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Boszormenyi Zoltan
Дата:
<div class="moz-cite-prefix">Hi,<br /><br /> 2013-08-21 20:52 keltezéssel, Karol Trzcionka írta:<br /></div><blockquote
cite="mid:52150C69.8030608@gmail.com"type="cite"><div class="moz-cite-prefix">W dniu 21.08.2013 19:17, Boszormenyi
Zoltanpisze:<br /></div><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> With this fixed, a more
completereview:<br /></blockquote> Thanks.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> There
isa new regression test (returning_before_after.sql) covering<br /> this feature. However, I think it should be added
tothe group<br /> where "returning.sql" resides currently. There is a value in running it<br /> in parallel to other
tests.Sometimes a subtle bug is uncovered<br /> because of this and your v2 patch fixed such a bug already.<br
/></blockquote>Modified to work correct in parallel testing<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at"
type="cite">doc/src/sgml/ref/update.sgml describes this feature.<br /><br /> doc/src/sgml/dml.sgml should also be
touchedto cover this feature.<br /></blockquote> I don't think so, there is not any info about returning feature, I
thinkit shouldn't be part of my patch.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> In the
src/test/regress/parallel_schedulecontains an extra<br /> new line at the end, it shouldn't.<br /></blockquote>
Fixed<br/><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"><br /> In
b/src/backend/optimizer/plan/setrefs.c:<br/><br /> +static void bind_returning_variables(List *rlist, int bef, int
aft);<br/><br /> but later it becomes not public:<br /><br /> + */<br /> +void bind_returning_variables(List *rlist,
intbef, int aft)<br /> +{<br /></blockquote> I've change to static in the definition.<br /><blockquote
cite="mid:5214F63F.5050608@cybertec.at"type="cite"><br /> +extern void addAliases(ParseState *pstate);<br />  <br />
+voidaddAliases(ParseState *pstate)<br /><br /> This external declaration is not needed since it is already<br /> in
src/include/parser/parse_clause.h<br/></blockquote> Removed.<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at"
type="cite"><br/> In setrefs.c, bind_returning_variables() I would also rename<br /> the function arguments, so
"before"and "after" are spelled out.<br /> These are not C keywords.<br /></blockquote> Changed.<br /><blockquote
cite="mid:5214F63F.5050608@cybertec.at"type="cite"> Some assignments, like:<br /><br /> +                      
var=(Var*)tle;<br/> and<br /> +       int index_rel=1;<br /><br /> in setrefs.c need spaces.<br /><br /> "if()"
statementsneed a space before the "(" and not after.<br /><br /> Add spaces in the {} list in addAliases():<br />
+      char    *aliases[] = {"before","after"};<br /> like this: { "before", "after" }<br /><br /> Spaces are needed
here,too:<br /> +       for(i=0 ; i<noal; i++)<br /><br /> This "noal" should be "naliases" or "n_aliases" if you
reallywant<br /> a variable. I would simply use the constant "2" for the two for()<br /> loops in addAliases() instead,
itspurpose is obvious enough.<br /></blockquote> Added some whitespaces.<br /><blockquote
cite="mid:5214F63F.5050608@cybertec.at"type="cite"> In setrefs.c, bind_returning_variables():<br /> +       Var *var =
NULL;<br/> +       foreach(temp, rlist){<br /> Add an empty line after the declaration block.<br /></blockquote>
Added.<br/><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> Other comments:<br /><br /> This #define in
pg_class:<br/><br /> diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h<br /> index
49c4f6f..1b09994100644<br /> --- a/src/include/catalog/pg_class.h<br /> +++ b/src/include/catalog/pg_class.h<br /> @@
-154,6+154,7 @@ DESCR("");<br />  #define                  RELKIND_COMPOSITE_TYPE  'c'           /* composite type
*/<br/>  #define                  RELKIND_FOREIGN_TABLE   'f'           /* foreign table */<br />
 #define                 RELKIND_MATVIEW                 'm'           /* materialized view */<br />
+#define                 RELKIND_BEFORE                  'b'           /* virtual table for before/after statements
*/<br/>  <br />  #define                  RELPERSISTENCE_PERMANENT      'p'             /* regular table */<br />
 #define                 RELPERSISTENCE_UNLOGGED       'u'             /* unlogged permanent table */<br /><br
/></blockquote>RELKIND_BEFORE removed - it probably left over previous work and/or I needed it because RTE_BEFORE
wasn'tavailable (not included?).<br /><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> Speaking of
which,RTE_BEFORE is more properly named<br /> RTE_RETURNING_ALIAS or something like that because it<br /> covers both
"before"and "after". Someone may have a better<br /> idea for naming this symbol.<br /></blockquote> Renamed to
RTE_ALIAS- similar to addAliases (not addReturningAliases)<br /></blockquote><br /> Others may have also a word in this
topic,but consider that<br /> this is *not* a regular alias and for those, RTE_ALIAS is not used,<br /> like in<br
/><br/>     UPDATE table AS alias SET ...<br /><br /> Maybe RTE_RETURNING_ALIAS takes a little more typing, but<br />
itbecomes obvious when reading and new code won't confuse<br /> it with regular aliases.<br /><br /><blockquote
cite="mid:52150C69.8030608@gmail.com"type="cite"><blockquote cite="mid:5214F63F.5050608@cybertec.at" type="cite"> One
question,though, about this part:<br /><br /> ----------------------------------------<br /> @@ -1900,7 +1954,27 @@
set_returning_clause_references(PlannerInfo*root,<br />                                                                
intrtoffset)<br />  {<br />         indexed_tlist *itlist;<br /> +       int after_index=0, before_index=0;<br />
+      Query      *parse = root->parse;<br />  <br /> +       ListCell   *rt;<br /> +       RangeTblEntry *bef;<br
/>+<br /> +       int index_rel=1;<br /> +<br /> +       foreach(rt,parse->rtable)<br /> +       {<br />
+              bef = (RangeTblEntry *)lfirst(rt);<br /> +               if(strcmp(bef->eref->aliasname,"after")
==0 && bef->rtekind == RTE_BEFORE )<br /> +               {<br /> +                       after_index =
index_rel;<br/> +               }<br /> +               if(strcmp(bef->eref->aliasname,"before") == 0 &&
bef->rtekind== RTE_BEFORE )<br /> +               {<br /> +                       before_index = index_rel;<br />
+              }<br /> +               index_rel++;<br /> +       }<br />         /*<br />          * We can perform
thedesired Var fixup by abusing the fix_join_expr<br />          * machinery that formerly handled inner indexscan
fixup. We search the<br /> @@ -1924,6 +1998,7 @@ set_returning_clause_references(PlannerInfo *root,<br />
                                                 resultRelation,<br />
                                                 rtoffset);<br />  <br /> +       bind_returning_variables(rlist,
before_index,after_index);<br />         pfree(itlist);<br />  <br />         return rlist;<br />
----------------------------------------<br/><br /> Why is it enough to keep the last before_index and after_index
values?<br/> What if there are more than one matching RangeTblEntry for "before"<br /> and/or for "after"? Is it an
errorcondition or of them should be fixed?<br /></blockquote> I think it is safe, it is the first and the last index.
Oneach level of statement there can be (at most) the only one "before" and one "after" alias.<br /></blockquote><br />
Ideduced it in the meantime. I still think it worth a comment<br /> in setrefs.c.<br /><br /> I think your v9 patch can
belooked at by a more seasoned reviewer<br /> or a committer.<br /><br /> Best regards,<br /> Zoltán Böszörményi<br
/><br/><blockquote cite="mid:52150C69.8030608@gmail.com" type="cite"> Regards,<br /> Karol Trzcionka<br /><br
/><fieldsetclass="mimeAttachmentHeader"></fieldset><br /><pre wrap="">
 
</pre></blockquote><br /><br /><pre class="moz-signature" cols="90">-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: <a class="moz-txt-link-freetext" href="http://www.postgresql-support.de">http://www.postgresql-support.de</a>
<aclass="moz-txt-link-freetext" href="http://www.postgresql.at/">http://www.postgresql.at/</a>
 
</pre>

Re: GSOC13 proposal - extend RETURNING syntax

От
Robert Haas
Дата:
I took a look at this patch today, and I'm pretty skeptical about
whether it's on the right track.  It adds a new kind of RTE called
RTE_ALIAS, which doesn't seem particularly descriptive and alias is
used elsewhere to mean something fairly different.  More generally,
I'm not convinced that adding a new type of RTE is the right way to
handle this.  The changes in pull_up_subqueries_recurse,
pullup_replace_vars_callback, preprocess_targetlist, and
build_joinrel_tlist seem like weird hacks.  Those functions aren't
directly related to this feature; why do they need to know about it?

I wonder if we shouldn't be trying to handle resolution of these names
at an earlier processing stage, closer to the processor.  I notice
that set_returning_clause_references() seems to have already solved
the hard part of this problem, which is frobbing target list entries
to return values from the new row rather than, as they naturally
would, the old row.  In fact, we can already get approximately the
desired effect already:

rhaas=# update foo as after set a = before.a + 1 from foo as before
where before.a = after.a returning before.a, after.a;a | a
---+---1 | 2
(1 row)

Now this is a hack, because we don't really want to add an extra
scan/join just to get the behavior we want.  But it seems to me
significant that this processing makes Vars that refer to the target
table refer to the new values, and if we got rid of it, they'd refer
to the old values.  Can't we contrive to make AFTER.x parse into the
same Var node that x currently does?  Then we don't need an RTE for
it.  And maybe BEFORE.x ought to parse to the same node that just
plain x does but with some marking, or some other node wrapped around
it (like a TargetEntry with some flag set?) that suppresses this
processing.  I'm just shooting from the hip here; that might be wrong
in detail, or even in broad strokes, but it just looks to me like the
additional RTE kind is going to bleed into a lot of places.

This patch also has significant style issues.  Conforming to
PostgreSQL coding style is essential; if neither the author nor the
reviewer fixes problems in this area, then that is essentially making
it the committer's job, and the committer may not feel like taking
time to do that.  Here's a selection of issues that I noticed while
reading this through: we use spaces around operators; the patch adds
two blank lines that shouldn't be there to the middle of the variable
declarations section; variables should be declared in the innermost
possible scope; single-statement blocks shouldn't have curly braces;
there shouldn't be whitespace before a closing parenthesis; there
should be a space after if and before the subsequent parenthesis;
braces should be uncuddled; code that does non-obvious things isn't
commented.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 04.10.2013 00:28, Robert Haas pisze:
> I wonder if we shouldn't be trying to handle resolution of these names
> at an earlier processing stage, closer to the processor.
Maybe it can be done in parser (in flex?) but at now it seems to be more
isolated feature.
> In fact, we can already get approximately the
> desired effect already:
>
> rhaas=# update foo as after set a = before.a + 1 from foo as before
> where before.a = after.a returning before.a, after.a;
>  a | a
> ---+---
>  1 | 2
> (1 row)
Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched"
workflow. I can see significant difference. And your "after" returns the
value after whole the work (after trigger fired) as I know (I don't know
if it is needed or not, I only point at the difference).
> Now this is a hack, because we don't really want to add an extra
> scan/join just to get the behavior we want.  But it seems to me
> significant that this processing makes Vars that refer to the target
> table refer to the new values, and if we got rid of it, they'd refer
> to the old values.  Can't we contrive to make AFTER.x parse into the
> same Var node that x currently does?  Then we don't need an RTE for
> it.  And maybe BEFORE.x ought to parse to the same node that just
> plain x does but with some marking, or some other node wrapped around
> it (like a TargetEntry with some flag set?) that suppresses this
> processing.  I'm just shooting from the hip here; that might be wrong
> in detail, or even in broad strokes, but it just looks to me like the
> additional RTE kind is going to bleed into a lot of places.
While planning/analyzing the problem there were many ideas about hot to
solve it. I was trying to avoid adding new RTE and referencing to "core"
table. However it makes more and more issues. You can see some PoC on
the
https://github.com/davidfetter/postgresql_projects/compare/returning_before_after
(other ideas I revert w/o commit because I couldn't get expected
result). The other major reason was that we can avoid touching executor
and/or parser's core (flex) this way. One observation: why shouldn't we
use the values computed at the moment (it would be computed again if we
want to do it later, in executor)?
I think we can do it by modify the Var structure (add some kind of flag
while generating the vars in parser?) but I'm not sure if it is good
idea. The major issue is to know if the Var/TargetEntry references to
the real alias "BEFORE" (named with "AS" syntax or even the real
table-name - I can see there is no difference in code) or the virtual
(from feature patch) "BEFORE". Doing it in parser (more "low-level")
would be very awful - we'd need to check in which part of statement
BEFORE/AFTER is placed (it is not allowed to use it in the other places
than in "RETURNING"). We don't want to make "BEFORE" and "AFTER"
restricted keywords.
Now most of the code means "don't touch these because they are not real" :)
If anyone has the fresh idea to it better, please write it by mail, I
don't have more ideas how to solve it.
> This patch also has significant style issues.
I'll try to fix it soon.
Regards,
Karol Trzcionka




Re: GSOC13 proposal - extend RETURNING syntax

От
Robert Haas
Дата:
On Thu, Oct 3, 2013 at 7:54 PM, Karol Trzcionka <karlikt@gmail.com> wrote:
> Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched"
> workflow. I can see significant difference. And your "after" returns the
> value after whole the work (after trigger fired) as I know (I don't know
> if it is needed or not, I only point at the difference).

Sure, I'm not saying we should implement it that way.  I'm just
pointing out that the ability already exists, at the executor level,
to return either tuple.  So I think the executor itself shouldn't need
to be changed; it's just a matter of getting the correct plan tree to
pop out.

> While planning/analyzing the problem there were many ideas about hot to
> solve it.

Do you have a link to previous discussion on the mailing list?

> I think we can do it by modify the Var structure (add some kind of flag
> while generating the vars in parser?) but I'm not sure if it is good
> idea. The major issue is to know if the Var/TargetEntry references to
> the real alias "BEFORE" (named with "AS" syntax or even the real
> table-name - I can see there is no difference in code) or the virtual
> (from feature patch) "BEFORE". Doing it in parser (more "low-level")
> would be very awful - we'd need to check in which part of statement
> BEFORE/AFTER is placed (it is not allowed to use it in the other places
> than in "RETURNING"). We don't want to make "BEFORE" and "AFTER"
> restricted keywords.

You're right, it can't happen actually in the parser.  But maybe it
can happen during parse analysis.  I'd spend some time looking at
transformColumnRef(), because that's where we translate things x.y
into Var nodes.  I'm not positive there's enough information available
at that stage, but if p_target_rangetblentry is populated at that
point, you should be able to make AFTER.x translate to a Var
referencing that range table entry.  It's a bit less clear how we know
that we're inside the returning-list at that point; I'm not sure how
much work it would be to pass that information down.  But I think it's
worth looking at.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSOC13 proposal - extend RETURNING syntax

От
Karol Trzcionka
Дата:
W dniu 04.10.2013 02:51, Robert Haas pisze:
> Do you have a link to previous discussion on the mailing list?
Sorry, most of discussion was at IRC channel.
> I'm not positive there's enough information available
> at that stage, but if p_target_rangetblentry is populated at that
> point, you should be able to make AFTER.x translate to a Var
> referencing that range table entry.
It's not enough. Even if we know "where we are", there are more issues.
The main question is: how should we pass information about "hello, I'm
specific Var, don't evaluate me like others"? We can add two fields to
Var structure (flag - normal/before/after and no. column) - however it
needs to modify copyObject for Vars (at now it's done e.g. in
flatten_join_alias_vars_mutator for varoattno and varnoold). If
copyObject is modified, sure code in
flatten_join_alias_vars_mutator/pullup_replace_vars_callback will be
useless. I don't know if modifying pg at the low-level (changing
structure of Var and behaviour of copyObject) is good idea. Yes if the
community really want it but it needs more "votes". There is "medium"
solution: changing Var structure and do the "copy" like now (in mutator
and callback) but w/o the condition statement (for the new fields). I
think it might need to modify more places in code because of "comparing"
vars (maybe we'd need to include new fields while comparision).
Regards,
Karol Trzcionka



Re: GSOC13 proposal - extend RETURNING syntax

От
Marko Tiikkaja
Дата:
On 10/4/13 12:28 AM, Robert Haas wrote:
> In fact, we can already get approximately the
> desired effect already:
>
> rhaas=# update foo as after set a = before.a + 1 from foo as before
> where before.a = after.a returning before.a, after.a;
>   a | a
> ---+---
>   1 | 2
> (1 row)
>
> Now this is a hack, because we don't really want to add an extra
> scan/join just to get the behavior we want.  But it seems to me
> significant that this processing makes Vars that refer to the target
> table refer to the new values, and if we got rid of it, they'd refer
> to the old values.  Can't we contrive to make AFTER.x parse into the
> same Var node that x currently does?  Then we don't need an RTE for
> it.

This part sounds reasonable from here.

> And maybe BEFORE.x ought to parse to the same node that just
> plain x does but with some marking, or some other node wrapped around
> it (like a TargetEntry with some flag set?) that suppresses this
> processing.  I'm just shooting from the hip here; that might be wrong
> in detail, or even in broad strokes, but it just looks to me like the
> additional RTE kind is going to bleed into a lot of places.

I might be completely in the woods here, but I believe something like 
this was attempted by Karol earlier, and it failed if two concurrent 
transactions did something similar to:
  UPDATE foo SET a = a + 1 RETURNING BEFORE.a;

Both of them would see BEFORE.a = 0, because that's what the "a" 
evaluated to from the tuple we got before EvalPlanQual.

But maybe what you're suggesting doesn't have this problem?


Regards,
Marko Tiikkaja



Re: GSOC13 proposal - extend RETURNING syntax

От
Andres Freund
Дата:
On 2013-10-03 20:51:08 -0400, Robert Haas wrote:
> On Thu, Oct 3, 2013 at 7:54 PM, Karol Trzcionka <karlikt@gmail.com> wrote:
> > Compare EXPLAIN ANALYZE VERBOSE on your statement and on "patched"
> > workflow. I can see significant difference. And your "after" returns the
> > value after whole the work (after trigger fired) as I know (I don't know
> > if it is needed or not, I only point at the difference).
> 
> Sure, I'm not saying we should implement it that way.  I'm just
> pointing out that the ability already exists, at the executor level,
> to return either tuple.  So I think the executor itself shouldn't need
> to be changed; it's just a matter of getting the correct plan tree to
> pop out.

Note what pullups ExecDelete is doing to return the old tuple
though... So, based on precedent special executor support is not an
unlikely thing to be required for a proper implemenation. As Marko
mentions, any trivial implementation not doing playing dirty like that
will refer to the wrong version of the tuple.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GSOC13 proposal - extend RETURNING syntax

От
Robert Haas
Дата:
On Fri, Oct 4, 2013 at 4:42 AM, Karol Trzcionka <karlikt@gmail.com> wrote:
> W dniu 04.10.2013 02:51, Robert Haas pisze:
>> Do you have a link to previous discussion on the mailing list?
> Sorry, most of discussion was at IRC channel.
>> I'm not positive there's enough information available
>> at that stage, but if p_target_rangetblentry is populated at that
>> point, you should be able to make AFTER.x translate to a Var
>> referencing that range table entry.
> It's not enough. Even if we know "where we are", there are more issues.
> The main question is: how should we pass information about "hello, I'm
> specific Var, don't evaluate me like others"?

My point is that AFTER.x doesn't appear to need any special marking;
it means the same thing as target_table.x.  BEFORE.x *does* need some
kind of special marking, and I admit I'm not sure what that should
look like.  Maybe an RTE is OK, but letting that RTE get into the join
planning machinery does not seem good; that's going to result in funky
special cases all over the place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSOC13 proposal - extend RETURNING syntax

От
Robert Haas
Дата:
On Fri, Oct 4, 2013 at 5:04 AM, Marko Tiikkaja <marko@joh.to> wrote:
> I might be completely in the woods here, but I believe something like this
> was attempted by Karol earlier, and it failed if two concurrent transactions
> did something similar to:
>
>   UPDATE foo SET a = a + 1 RETURNING BEFORE.a;
>
> Both of them would see BEFORE.a = 0, because that's what the "a" evaluated
> to from the tuple we got before EvalPlanQual.
>
> But maybe what you're suggesting doesn't have this problem?

Hmm, it probably does.  That explains why there are executor changes
here; I guess they need some comments to explain their purpose.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Fri, Oct 04, 2013 at 10:42:32AM +0200, Karol Trzcionka wrote:
> W dniu 04.10.2013 02:51, Robert Haas pisze:
> > Do you have a link to previous discussion on the mailing list?
> Sorry, most of discussion was at IRC channel.
> > I'm not positive there's enough information available
> > at that stage, but if p_target_rangetblentry is populated at that
> > point, you should be able to make AFTER.x translate to a Var
> > referencing that range table entry.
> It's not enough. Even if we know "where we are", there are more issues.
> The main question is: how should we pass information about "hello, I'm
> specific Var, don't evaluate me like others"? We can add two fields to
> Var structure (flag - normal/before/after and no. column) - however it
> needs to modify copyObject for Vars (at now it's done e.g. in
> flatten_join_alias_vars_mutator for varoattno and varnoold). If
> copyObject is modified, sure code in
> flatten_join_alias_vars_mutator/pullup_replace_vars_callback will be
> useless. I don't know if modifying pg at the low-level (changing
> structure of Var and behaviour of copyObject) is good idea. Yes if the
> community really want it but it needs more "votes". There is "medium"
> solution: changing Var structure and do the "copy" like now (in mutator
> and callback) but w/o the condition statement (for the new fields). I
> think it might need to modify more places in code because of "comparing"
> vars (maybe we'd need to include new fields while comparision).
> Regards,
> Karol Trzcionka

Karol,

Do you plan to continue this work for the current commitfest?  A lot
of people really want the feature :)

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Wed, Aug 21, 2013 at 08:52:25PM +0200, Karol Trzcionka wrote:
> W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze:
> > With this fixed, a more complete review:
> Thanks.

I've done some syntactic and white space cleanup, here attached.

Karol, would you care to help with commenting the sections that need
same?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
David Fetter
Дата:
On Sun, Feb 02, 2014 at 02:52:42PM -0800, David Fetter wrote:
> On Wed, Aug 21, 2013 at 08:52:25PM +0200, Karol Trzcionka wrote:
> > W dniu 21.08.2013 19:17, Boszormenyi Zoltan pisze:
> > > With this fixed, a more complete review:
> > Thanks.
>
> I've done some syntactic and white space cleanup, here attached.
>
> Karol, would you care to help with commenting the sections that need
> same?

Karol,

Thanks for the updates :)

Other folks,

Next version attached.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

Re: GSOC13 proposal - extend RETURNING syntax

От
Andres Freund
Дата:
Hi,

Some comments about the patch:
* Coding Style: * multiline comments have both /* and */ on their own lines. * I think several places indent by two
tabs.* Spaces around operators * ...
 
* Many of the new comments would enjoy a bit TLC by a native speaker.

* The way RTE_ALIAS creeps in many place where it doesn't seem to belong seems to indicate that the design isn't yet
ready.I share Robert's suspicion that this would be better solved by referring to a special range table entry.
 

Based on the lack of activity around this and the fact that this needs a
*significant* chunk of work before being committable, I am going to mark
this as returned with feedback.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: GSOC13 proposal - extend RETURNING syntax

От
Arthur Axel 'fREW' Schmidt
Дата:
Andres Freund <andres <at> anarazel.de> writes:

> 
> Hi,
> 
> Some comments about the patch:
> * Coding Style:
>   * multiline comments have both /* and */ on their own lines.
>   * I think several places indent by two tabs.
>   * Spaces around operators
>   * ...
> * Many of the new comments would enjoy a bit TLC by a native speaker.
> 
> * The way RTE_ALIAS creeps in many place where it doesn't seem to belong
>   seems to indicate that the design isn't yet ready. I share Robert's
>   suspicion that this would be better solved by referring to a special
>   range table entry.
> 
> Based on the lack of activity around this and the fact that this needs a
> *significant* chunk of work before being committable, I am going to mark
> this as returned with feedback.

I'm actively working towards converting our software at work to use Pg
instead of SQL Server and before we switch we'll need this feature to be
merged.  I'll do what I can to get the verbage and style whipped into shape,
though I doubt I can do much with the actual code.  Hopefully I can get
something in soon.