Обсуждение: COPY IN as SELECT target
Recently there was discussion about allowing a COPY statement to be a SELECT target, returning a text array, although the syntax wasn't really nailed down that I recall. I was thinking that we might have COPY RETURNING ARRAY FROM ... instead of COPY tablename opt_column_list FROM ... the we possibly could do things like: SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY FROM STDIN CSV) as t; Thoughts? cheers andrew
Andrew Dunstan wrote: > > Recently there was discussion about allowing a COPY statement to be a > SELECT target, returning a text array, although the syntax wasn't really > nailed down that I recall. I was thinking that we might have > > COPY RETURNING ARRAY FROM ... > > instead of > > COPY tablename opt_column_list FROM ... It's not really returning an array, is it? It's returning a bag of rows like a (sub)query. > the we possibly could do things like: > > SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY > FROM STDIN CSV) as t; How about just COPY FROM? As in SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN CSV) as t -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > How about just COPY FROM? As in > > SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN CSV) as t I had the same thought. Though it would also be nice to allow something like: COPY (type1, type2, type3, type4) FROM STDIN CSV ...which is obviously going to create a horrible parser problem if you actually tried to use that syntax. ...Robert
On Thu, Dec 17, 2009 at 12:28:50PM -0500, Robert Haas wrote: > On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: > > How about just COPY FROM? As in > > > > SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN > > CSV) as t > > I had the same thought. Though it would also be nice to allow > something like: > > COPY (type1, type2, type3, type4) FROM STDIN CSV > > ...which is obviously going to create a horrible parser problem if > you actually tried to use that syntax. How about using the CTE syntax? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Dec 17, 2009 at 12:38 PM, David Fetter <david@fetter.org> wrote: > On Thu, Dec 17, 2009 at 12:28:50PM -0500, Robert Haas wrote: >> On Thu, Dec 17, 2009 at 12:23 PM, Heikki Linnakangas >> <heikki.linnakangas@enterprisedb.com> wrote: >> > How about just COPY FROM? As in >> > >> > SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY FROM STDIN >> > CSV) as t >> >> I had the same thought. Though it would also be nice to allow >> something like: >> >> COPY (type1, type2, type3, type4) FROM STDIN CSV >> >> ...which is obviously going to create a horrible parser problem if >> you actually tried to use that syntax. > > How about using the CTE syntax? I'm not sure what you're suggesting exactly, but the problem with the syntax I suggested is that COPY (...) TO <whatever> expects the "..." part to be a subselect. You can't make COPY (...) FROM have something in there other than a subselect, because the parser can't fast-forward and look at the word FROM and then go back and decide how to parse the parenthesized stuff. That's almost magic in the general case. You'd have to stick a keyword in there before the opening parentheses. ...Robert
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Andrew Dunstan wrote: >> COPY RETURNING ARRAY FROM ... > It's not really returning an array, is it? It's returning a bag of rows > like a (sub)query. > How about just COPY FROM? The problem with COPY FROM is that it hard-wires a decision that there is one and only one possible result format, which I think we pretty much proved already is the wrong thing. I'm not thrilled with "RETURNING ARRAY" either, but we need to leave ourselves wiggle room to have more than one result format from the same source file. regards, tom lane
Tom Lane wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > >> Andrew Dunstan wrote: >> >>> COPY RETURNING ARRAY FROM ... >>> > > >> It's not really returning an array, is it? It's returning a bag of rows >> like a (sub)query. >> > > >> How about just COPY FROM? >> > > The problem with COPY FROM is that it hard-wires a decision that there > is one and only one possible result format, which I think we pretty > much proved already is the wrong thing. I'm not thrilled with "RETURNING > ARRAY" either, but we need to leave ourselves wiggle room to have more > than one result format from the same source file. > > > Well, we could have "RETURNING type-expression" with "text[]" supported for the first iteration. In answer to Heiki's argument, what I wanted was exactly to return an array of text for each row. Whatever we have needs to be able to handle to possibility of ragged input (see previous discussion) so we can't tie it down too tightly. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> The problem with COPY FROM is that it hard-wires a decision that there >> is one and only one possible result format, which I think we pretty >> much proved already is the wrong thing. I'm not thrilled with "RETURNING >> ARRAY" either, but we need to leave ourselves wiggle room to have more >> than one result format from the same source file. > Well, we could have "RETURNING type-expression" with "text[]" supported > for the first iteration. > In answer to Heiki's argument, what I wanted was exactly to return an > array of text for each row. Whatever we have needs to be able to handle > to possibility of ragged input (see previous discussion) so we can't tie > it down too tightly. I think that there are two likely possibilities for the result format: * "Raw" data after just the de-escaping and column separation steps. Array of text is probably the right thing here, at least for a text COPY (doesn't seem to cover the binary case though). * The data converted to some specified row type. "RETURNING type-expression" is probably not good since it looks more like the second case than the first --- and in fact it could be outright ambiguous, what if your data actually is one column that is a text array? If we're willing to assume these are the *only* possibilities then we could use "COPY FROM ..." for the first and "COPY RETURNING type-list FROM ..." for the second. I'm a bit uncomfortable with that assumption though; it seems likely that we'll want to shoehorn in some more alternatives later. (Like, what about the binary case?) regards, tom lane
> In answer to Heiki's argument, what I wanted was exactly to return an > array of text for each row. Whatever we have needs to be able to handle > to possibility of ragged input (see previous discussion) so we can't tie > it down too tightly. I would have *lots* of use for this feature. Mind you, returning (arbitrary expression) would be even better, but if we can get returning TEXT[] for 8.5, I think it's worth doing on its own. --Josh Berkus
On Thu, Dec 17, 2009 at 1:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Tom Lane wrote: >>> The problem with COPY FROM is that it hard-wires a decision that there >>> is one and only one possible result format, which I think we pretty >>> much proved already is the wrong thing. I'm not thrilled with "RETURNING >>> ARRAY" either, but we need to leave ourselves wiggle room to have more >>> than one result format from the same source file. > >> Well, we could have "RETURNING type-expression" with "text[]" supported >> for the first iteration. > >> In answer to Heiki's argument, what I wanted was exactly to return an >> array of text for each row. Whatever we have needs to be able to handle >> to possibility of ragged input (see previous discussion) so we can't tie >> it down too tightly. > > I think that there are two likely possibilities for the result format: > > * "Raw" data after just the de-escaping and column separation steps. > Array of text is probably the right thing here, at least for a text COPY > (doesn't seem to cover the binary case though). > > * The data converted to some specified row type. Agreed. > "RETURNING type-expression" is probably not good since it looks more > like the second case than the first --- and in fact it could be outright > ambiguous, what if your data actually is one column that is a text > array? > > If we're willing to assume these are the *only* possibilities then we > could use "COPY FROM ..." for the first and "COPY RETURNING type-list > FROM ..." for the second. I'm a bit uncomfortable with that assumption > though; it seems likely that we'll want to shoehorn in some more > alternatives later. (Like, what about the binary case?) You might want to specify column names as well as well as types, in this second case. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > You might want to specify column names as well as well as types, in > this second case. Well, we could do it like VALUES: arbitrarily name the columns column1 ... columnN and tell people to use an alias if they want other names. If it's convenient to fit column names into the syntax, good, but we don't absolutely have to. [ thinks... ] Although actually the obvious SQL-ish syntax for a rowtype specification is ( colname typename [ , ... ] ) so that's probably what we'd want to do in the processed-data case. Not sure about the raw-data case --- maybe a predetermined name is okay there. regards, tom lane
On Thu, Dec 17, 2009 at 1:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> You might want to specify column names as well as well as types, in >> this second case. > > Well, we could do it like VALUES: arbitrarily name the columns column1 > ... columnN and tell people to use an alias if they want other names. > If it's convenient to fit column names into the syntax, good, but we > don't absolutely have to. > > [ thinks... ] Although actually the obvious SQL-ish syntax for a rowtype > specification is > > ( colname typename [ , ... ] ) > > so that's probably what we'd want to do in the processed-data case. Yeah, I think that's good. > Not sure about the raw-data case --- maybe a predetermined name is > okay there. I would expect so. ...Robert
Hi, Le 17 déc. 2009 à 19:39, Josh Berkus a écrit : > Mind you, returning (arbitrary expression) would be even better, but if > we can get returning TEXT[] for 8.5, I think it's worth doing on its own. Well, you already have it as soon as you have text[]: INSERT INTO destinationSELECT row[0], row[1], myfunction(row[0], row[1]), row[2]::int + 1 FROM (COPY RETURNING text[] FROM'/path/to/file.cvs' CVS HEADER) as file(row); Of course as Andrew said already what it needs that the syntax here does not cover is ragged file processing, that is acceptingfile content when all the rows will not have the same number of columns. But if you have ragged input reading and COPY as a relation in a query, then you're able to apply any expression you wantto in the query itself. Such as transforming the input slightly in order to conform to PostgreSQL datatype input syntaxes,e.g. Regards, -- dim Let's deprecate pgloader.
2009/12/17 Andrew Dunstan <andrew@dunslane.net>: > > Recently there was discussion about allowing a COPY statement to be a SELECT > target, returning a text array, although the syntax wasn't really nailed > down that I recall. I was thinking that we might have > > COPY RETURNING ARRAY FROM ... > > instead of > > COPY tablename opt_column_list FROM ... > > > the we possibly could do things like: > > SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY FROM > STDIN CSV) as t; > > Thoughts? In this case copy doesn't return array - so RETURNING ARRAY is little bit strange. what SELECT .. FROM (COPY VALUES [(colums)] FROM ....) Regards Pavel > > cheers > > andrew > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
Pavel Stehule wrote: > 2009/12/17 Andrew Dunstan <andrew@dunslane.net>: > >> Recently there was discussion about allowing a COPY statement to be a SELECT >> target, returning a text array, although the syntax wasn't really nailed >> down that I recall. I was thinking that we might have >> >> COPY RETURNING ARRAY FROM ... >> >> instead of >> >> COPY tablename opt_column_list FROM ... >> >> >> the we possibly could do things like: >> >> SELECT t[5] as a, 3*(t[3]::numeric) as b FROM (COPY RETURNING ARRAY FROM >> STDIN CSV) as t; >> >> Thoughts? >> > > In this case copy doesn't return array - so RETURNING ARRAY is little > bit strange. > > what > > SELECT .. FROM (COPY VALUES [(colums)] FROM ....) > > You are misunderstanding what I want to provide, which is that it *does* return an array of text for each line. That was what the previous discussion arrived at, and is illustrated in the example I showed above. cheers andrew
Tom Lane wrote: > I think that there are two likely possibilities for the result format: > > * "Raw" data after just the de-escaping and column separation steps. > Array of text is probably the right thing here, at least for a text COPY > (doesn't seem to cover the binary case though). > > * The data converted to some specified row type. > > "RETURNING type-expression" is probably not good since it looks more > like the second case than the first --- and in fact it could be outright > ambiguous, what if your data actually is one column that is a text > array? > > If we're willing to assume these are the *only* possibilities then we > could use "COPY FROM ..." for the first and "COPY RETURNING type-list > FROM ..." for the second. I'm a bit uncomfortable with that assumption > though; it seems likely that we'll want to shoehorn in some more > alternatives later. (Like, what about the binary case?) > > > Yeah. I think we need an explicit marker. The first of these cases is the one I'm particularly interested in. I think you could actually get the second from the first with a little more work anyway, but the "raw" input as an array lets me get the things I can't easily get another way. I think we're going to need some marker such as parentheses to distinguish the second case. In that case, "RETURNING text[]" could be the first case and "RETURNING (text[])" could be the second, but maybe that's a bit too subtle. How about "RETURNING TYPE (type_list)" for the second case? cheers andrew