Обсуждение: COPY as a set returning function
# select a,c,e from copy_srf('echo 12,345,67,89,2016-01-01',true) as t(a integer, b text, c text, d text, e date);a | c | e----+----+------------12 | 67 | 2016-01-01(1 row)
- a function that accepts an options string and parse that
- we could alter the grammar to make COPY RETURNING col1, col3, col5 FROM 'filename' a legit CTE.
Regardless of the path forward, I'm going to need help in getting there, hence this email. Thank you for your consideration.
Вложения
Corey Huinker <corey.huinker@gmail.com> writes:
> Attached is a _very_ rough patch implementing a proof-of-concept function
> copy_srf();
> ...
> As for that future direction, we could either have:
> - a robust function named something like copy_srf(), with parameters for
> all of the relevant options found in the COPY command
> - a function that accepts an options string and parse that
> - we could alter the grammar to make COPY RETURNING col1, col3, col5 FROM
> 'filename' a legit CTE.
I think the last of those suggestions has come up before. It has the
large advantage that you don't have to remember a different syntax for
copy-as-a-function. Once you had the framework for that, other
rows-returning utility commands such as EXPLAIN might plug in as well,
whenever somebody got enough of an itch for it.
regards, tom lane
<p dir="ltr"><p dir="ltr">On 1 Oct. 2016 05:20, "Tom Lane" <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>>wrote:<br /> ><br /> > Corey Huinker <<a href="mailto:corey.huinker@gmail.com">corey.huinker@gmail.com</a>>writes:<br /> > > Attached is a _very_ rough patchimplementing a proof-of-concept function<br /> > > copy_srf();<br /> > > ...<br /> > > As for thatfuture direction, we could either have:<br /> > > - a robust function named something like copy_srf(), with parametersfor<br /> > > all of the relevant options found in the COPY command<br /> > > - a function that acceptsan options string and parse that<br /> > > - we could alter the grammar to make COPY RETURNING col1, col3, col5FROM<br /> > > 'filename' a legit CTE.<br /> ><br /> > I think the last of those suggestions has come upbefore. It has the<br /> > large advantage that you don't have to remember a different syntax for<br /> > copy-as-a-function. Once you had the framework for that, other<br /> > rows-returning utility commands such as EXPLAINmight plug in as well,<br /> > whenever somebody got enough of an itch for it.<br /><p dir="ltr">That sounds fantastic.It'd help this copy variant retain festure parity with normal copy. And it'd bring us closer to being able to FETCHin non queries.<p dir="ltr">> regards, tom lane<br /> ><br /> ><br /> > --<br />> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> > To make changes to your subscription:<br/> > <a href="http://www.postgresql.org/mailpref/pgsql-hackers">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/>
Craig Ringer <craig.ringer@2ndquadrant.com> writes:
> On 1 Oct. 2016 05:20, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> I think the last of those suggestions has come up before. It has the
>> large advantage that you don't have to remember a different syntax for
>> copy-as-a-function.
> That sounds fantastic. It'd help this copy variant retain festure parity
> with normal copy. And it'd bring us closer to being able to FETCH in non
> queries.
On second thought, though, this couldn't exactly duplicate the existing
COPY syntax, because COPY relies heavily on the rowtype of the named
target table to tell it what it's copying. You'd need some new syntax
to provide the list of column names and types, which puts a bit of
a hole in the "syntax we already know" argument. A SRF-returning-record
would have a leg up on that, because we do have existing syntax for
defining the concrete rowtype that any particular call returns.
regards, tom lane
> That sounds fantastic. It'd help this copy variant retain festure parity
> with normal copy. And it'd bring us closer to being able to FETCH in non
> queries.
On second thought, though, this couldn't exactly duplicate the existing
COPY syntax, because COPY relies heavily on the rowtype of the named
target table to tell it what it's copying. You'd need some new syntax
to provide the list of column names and types, which puts a bit of
a hole in the "syntax we already know" argument. A SRF-returning-record
would have a leg up on that, because we do have existing syntax for
defining the concrete rowtype that any particular call returns.
regards, tom lane
WITH my_copy AS (COPY FROM 'example.csv' TO RESULT SET(c1 text, c2 integer, dummy1 text, dummy2 text, c5 date) WITH (FORMAT CSV)RETURNING c1, c2, c3)SELECT ...FROM my_copyLEFT OUTER JOIN ref_table ...
<p dir="ltr">On 15 Oct. 2016 04:56, "Corey Huinker" <<a href="mailto:corey.huinker@gmail.com">corey.huinker@gmail.com</a>>wrote:<p dir="ltr">> I would like to make COPY itselfa SRF. That's a bit beyond my capabilities, so if that is the route we want to go, I will need help.<br /> ><br/> > The syntax would probably look like this (new bits in bold):<br /> ><br /> >> WITH my_copy AS (<br/> >> COPY FROM 'example.csv' TO RESULT SET(c1 text, c2 integer, dummy1 text, dummy2 text, c5 date) WITH (FORMATCSV)<br /> >> RETURNING c1, c2, c3<br /> >> )<p dir="ltr">Strong -1 from me on this approach. OurCTE implementation materializes everything so this is no better than COPYing to a temp table.<p dir="ltr">Not unless youplan to fix that (and figure out the backward compatibility issues since the bug is documented as a feature) or implementRETURNING in subqueries... I'd go for the function.
On 15 Oct. 2016 04:56, "Corey Huinker" <corey.huinker@gmail.com> wrote:
> I would like to make COPY itself a SRF. That's a bit beyond my capabilities, so if that is the route we want to go, I will need help.
>
> The syntax would probably look like this (new bits in bold):
>
>> WITH my_copy AS (
>> COPY FROM 'example.csv' TO RESULT SET(c1 text, c2 integer, dummy1 text, dummy2 text, c5 date) WITH (FORMAT CSV)
>> RETURNING c1, c2, c3
>> )Strong -1 from me on this approach. Our CTE implementation materializes everything so this is no better than COPYing to a temp table.
Not unless you plan to fix that (and figure out the backward compatibility issues since the bug is documented as a feature) or implement RETURNING in subqueries... I'd go for the function.
If it does stay a function, we only need to implement 8 of the 12 options as parameters (FREEZE and FORCE* options don't apply). My guess is that future options added to COPY will be more about handling output or optimizing table inserts, neither of which mean more options for this proposed function.
Would the best approach be to build in a core srf-returning function that might be deprecated once COPY is set-returning AND CTEs don't have to materialize, or to refactor what's in copy.c such that a contrib module can easily plug into it, and have copy_srf live there?
On Fri, Sep 30, 2016 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Craig Ringer <craig.ringer@2ndquadrant.com> writes: >> On 1 Oct. 2016 05:20, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >>> I think the last of those suggestions has come up before. It has the >>> large advantage that you don't have to remember a different syntax for >>> copy-as-a-function. > >> That sounds fantastic. It'd help this copy variant retain festure parity >> with normal copy. And it'd bring us closer to being able to FETCH in non >> queries. > > On second thought, though, this couldn't exactly duplicate the existing > COPY syntax, because COPY relies heavily on the rowtype of the named > target table to tell it what it's copying. You'd need some new syntax > to provide the list of column names and types, which puts a bit of > a hole in the "syntax we already know" argument. A SRF-returning-record > would have a leg up on that, because we do have existing syntax for > defining the concrete rowtype that any particular call returns. One big disadvantage of SRF-returning-record syntax is that functions are basically unwrappable with generic wrappers sans major gymnastics such as dynamically generating the query and executing it. This is a major disadvantage relative to the null::type hack we use in the populate_record style functions and perhaps ought to make this (SRF-returning-record syntax) style of use discouraged for useful library functions. If there were a way to handle wrapping I'd withdraw this minor objection -- this has come up in dblink discussions a few times). merlin
The function signature reflects cstate more than it reflects the COPY options (filename+is_program instead of FILENAME or PROGRAM, etc)
On Fri, Sep 30, 2016 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig.ringer@2ndquadrant.com> writes:
>> On 1 Oct. 2016 05:20, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> I think the last of those suggestions has come up before. It has the
>>> large advantage that you don't have to remember a different syntax for
>>> copy-as-a-function.
>
>> That sounds fantastic. It'd help this copy variant retain festure parity
>> with normal copy. And it'd bring us closer to being able to FETCH in non
>> queries.
>
> On second thought, though, this couldn't exactly duplicate the existing
> COPY syntax, because COPY relies heavily on the rowtype of the named
> target table to tell it what it's copying. You'd need some new syntax
> to provide the list of column names and types, which puts a bit of
> a hole in the "syntax we already know" argument. A SRF-returning-record
> would have a leg up on that, because we do have existing syntax for
> defining the concrete rowtype that any particular call returns.
One big disadvantage of SRF-returning-record syntax is that functions
are basically unwrappable with generic wrappers sans major gymnastics
such as dynamically generating the query and executing it. This is a
major disadvantage relative to the null::type hack we use in the
populate_record style functions and perhaps ought to make this
(SRF-returning-record syntax) style of use discouraged for useful
library functions. If there were a way to handle wrapping I'd
withdraw this minor objection -- this has come up in dblink
discussions a few times).
merlin
Вложения
Attached is a patch that implements copy_srf().
Moved to next CF with "needs review" status.
On Mon, Dec 5, 2016 at 2:10 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > On Tue, Nov 1, 2016 at 7:45 AM, Corey Huinker <corey.huinker@gmail.com> > wrote: >> >> Attached is a patch that implements copy_srf(). > > Moved to next CF with "needs review" status. This patch is still waiting for review. David, are you planning to look at it by the end of the CF? -- Michael
On Wed, Jan 25, 2017 at 02:37:57PM +0900, Michael Paquier wrote: > On Mon, Dec 5, 2016 at 2:10 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > On Tue, Nov 1, 2016 at 7:45 AM, Corey Huinker <corey.huinker@gmail.com> > > wrote: > >> > >> Attached is a patch that implements copy_srf(). > > > > Moved to next CF with "needs review" status. > > This patch is still waiting for review. David, are you planning to > look at it by the end of the CF? I'll be doing this today. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, Oct 31, 2016 at 04:45:40PM -0400, Corey Huinker wrote: > Attached is a patch that implements copy_srf(). > > The function signature reflects cstate more than it reflects the COPY > options (filename+is_program instead of FILENAME or PROGRAM, etc) The patch as it stands needs a rebase. I'll see what I can do today. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Jan 25, 2017 at 06:16:16AM -0800, David Fetter wrote:
> On Mon, Oct 31, 2016 at 04:45:40PM -0400, Corey Huinker wrote:
> > Attached is a patch that implements copy_srf().
> >
> > The function signature reflects cstate more than it reflects the COPY
> > options (filename+is_program instead of FILENAME or PROGRAM, etc)
>
> The patch as it stands needs a rebase. I'll see what I can do today.
Please find attached a rebase, which fixes an OID collision that crept in.
- The patch builds against master and passes "make check".
- The patch does not contain user-visible documentation or examples.
- The patch does not contain tests.
I got the following when I did a brief test.
SELECT * FROM copy_srf(filename => 'ls', is_program => true) AS t(l text);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
David Fetter wrote:
> @@ -562,7 +563,6 @@ CopyGetData(CopyState cstate, void *databuf, int minread, int maxread)
> errmsg("could not read from COPY file: %m")));
> break;
> case COPY_OLD_FE:
> -
> /*
> * We cannot read more than minread bytes (which in practice is 1)
> * because old protocol doesn't have any clear way of separating
This change is pointless as it'd be undone by pgindent.
> + /*
> + * Function signature is:
> + * copy_srf( filename text default null,
> + * is_program boolean default false,
> + * format text default 'text',
> + * delimiter text default E'\t' in text mode, ',' in csv mode,
> + * null_string text default '\N',
> + * header boolean default false,
> + * quote text default '"' in csv mode only,
> + * escape text default null, -- defaults to whatever quote is
> + * encoding text default null
> + */
This comment would be mangled by pgindent -- please add an /*--- marker
to prevent it.
> + /* param 7: escape text default null, -- defaults to whatever quote is */
> + if (PG_ARGISNULL(7))
> + {
> + copy_state.escape = copy_state.quote;
> + }
> + else
> + {
> + if (copy_state.csv_mode)
> + {
> + copy_state.escape = TextDatumGetCString(PG_GETARG_TEXT_P(7));
> + if (strlen(copy_state.escape) != 1)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape must be a single one-byte character")));
> + }
> + }
> + else
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape available only in CSV mode")));
> + }
> + }
I don't understand why do we have all these checks. Can't we just pass
the values to COPY and let it apply the checks? That way, when COPY is
updated to support multibyte escape chars (for example) we don't need to
touch this code. Together with removing the unneeded braces that would
make these stanzas about six lines long instead of fifteen.
> + tuple = heap_form_tuple(tupdesc,values,nulls);
> + //tuple = BuildTupleFromCStrings(attinmeta, field_strings);
> + tuplestore_puttuple(tupstore, tuple);
No need to form a tuple; use tuplestore_putvalues here.
> + }
> +
> + /* close "file" */
> + if (copy_state.is_program)
> + {
> + int pclose_rc;
> +
> + pclose_rc = ClosePipeStream(copy_state.copy_file);
> + if (pclose_rc == -1)
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close pipe to external command: %m")));
> + else if (pclose_rc != 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
> + errmsg("program \"%s\" failed",
> + copy_state.filename),
> + errdetail_internal("%s", wait_result_to_str(pclose_rc))));
> + }
> + else
> + {
> + if (copy_state.filename != NULL && FreeFile(copy_state.copy_file))
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close file \"%s\": %m",
> + copy_state.filename)));
> + }
I wonder if these should be an auxiliary function in copy.c to do this.
Surely copy.c itself does pretty much the same thing ...
> +DATA(insert OID = 3353 ( copy_srf PGNSP PGUID 12 1 0 0 0 f f f f f t v u 9 0 2249 "25 16 25 25 25 16 25 25 25"
_null__null_ _null_ _null_ _null_ copy_srf _null_ _null_ _null_ ));
> +DESCR("set-returning COPY proof of concept");
Why is this marked "proof of concept"? If this is a PoC only, why are
you submitting it as a patch?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
David Fetter wrote:
> @@ -562,7 +563,6 @@ CopyGetData(CopyState cstate, void *databuf, int minread, int maxread)
> errmsg("could not read from COPY file: %m")));
> break;
> case COPY_OLD_FE:
> -
> /*
> * We cannot read more than minread bytes (which in practice is 1)
> * because old protocol doesn't have any clear way of separating
This change is pointless as it'd be undone by pgindent.
> + /*
> + * Function signature is:
> + * copy_srf( filename text default null,
> + * is_program boolean default false,
> + * format text default 'text',
> + * delimiter text default E'\t' in text mode, ',' in csv mode,
> + * null_string text default '\N',
> + * header boolean default false,
> + * quote text default '"' in csv mode only,
> + * escape text default null, -- defaults to whatever quote is
> + * encoding text default null
> + */
This comment would be mangled by pgindent -- please add an /*--- marker
to prevent it.
> + /* param 7: escape text default null, -- defaults to whatever quote is */
> + if (PG_ARGISNULL(7))
> + {
> + copy_state.escape = copy_state.quote;
> + }
> + else
> + {
> + if (copy_state.csv_mode)
> + {
> + copy_state.escape = TextDatumGetCString(PG_GETARG_TEXT_P(7));
> + if (strlen(copy_state.escape) != 1)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape must be a single one-byte character")));
> + }
> + }
> + else
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape available only in CSV mode")));
> + }
> + }
I don't understand why do we have all these checks. Can't we just pass
the values to COPY and let it apply the checks? That way, when COPY is
updated to support multibyte escape chars (for example) we don't need to
touch this code. Together with removing the unneeded braces that would
make these stanzas about six lines long instead of fifteen.
> + tuple = heap_form_tuple(tupdesc,values,nulls);
> + //tuple = BuildTupleFromCStrings(attinmeta, field_strings);
> + tuplestore_puttuple(tupstore, tuple);
No need to form a tuple; use tuplestore_putvalues here.
> + }
> +
> + /* close "file" */
> + if (copy_state.is_program)
> + {
> + int pclose_rc;
> +
> + pclose_rc = ClosePipeStream(copy_state.copy_file);
> + if (pclose_rc == -1)
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close pipe to external command: %m")));
> + else if (pclose_rc != 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
> + errmsg("program \"%s\" failed",
> + copy_state.filename),
> + errdetail_internal("%s", wait_result_to_str(pclose_rc))));
> + }
> + else
> + {
> + if (copy_state.filename != NULL && FreeFile(copy_state.copy_file))
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close file \"%s\": %m",
> + copy_state.filename)));
> + }
I wonder if these should be an auxiliary function in copy.c to do this.
Surely copy.c itself does pretty much the same thing ...
> +DATA(insert OID = 3353 ( copy_srf PGNSP PGUID 12 1 0 0 0 f f f f f t v u 9 0 2249 "25 16 25 25 25 16 25 25 25" _null_ _null_ _null_ _null_ _null_ copy_srf _null_ _null_ _null_ ));
> +DESCR("set-returning COPY proof of concept");
Why is this marked "proof of concept"? If this is a PoC only, why are
you submitting it as a patch?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 25, 2017 at 12:23:23PM -0500, Corey Huinker wrote: > > Feel free to mark it returned as feedback. The concept didn't > generate as much enthusiasm as I had hoped, so I think the right > thing to do now is scale it back to a patch that makes > CopyFromRawFields() externally visible so that extensions can use > them. You're getting enthusiasm in the form of these reviews and suggestions for improvement. That it doesn't always happen immediately is a byproduct of the scarcity of developer time and the sheer volume of things to which people need to pay attention. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Jan 25, 2017 at 12:23:23PM -0500, Corey Huinker wrote:
>
> Feel free to mark it returned as feedback. The concept didn't
> generate as much enthusiasm as I had hoped, so I think the right
> thing to do now is scale it back to a patch that makes
> CopyFromRawFields() externally visible so that extensions can use
> them.
You're getting enthusiasm in the form of these reviews and suggestions
for improvement. That it doesn't always happen immediately is a
byproduct of the scarcity of developer time and the sheer volume of
things to which people need to pay attention.
> + /* param 7: escape text default null, -- defaults to whatever quote is */
> + if (PG_ARGISNULL(7))
> + {
> + copy_state.escape = copy_state.quote;
> + }
> + else
> + {
> + if (copy_state.csv_mode)
> + {
> + copy_state.escape = TextDatumGetCString(PG_GETARG_TEXT_P(7));
> + if (strlen(copy_state.escape) != 1)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape must be a single one-byte character")));
> + }
> + }
> + else
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY escape available only in CSV mode")));
> + }
> + }
I don't understand why do we have all these checks. Can't we just pass
the values to COPY and let it apply the checks? That way, when COPY is
updated to support multibyte escape chars (for example) we don't need to
touch this code. Together with removing the unneeded braces that would
make these stanzas about six lines long instead of fifteen.
> + tuple = heap_form_tuple(tupdesc,values,nulls);
> + //tuple = BuildTupleFromCStrings(attinmeta, field_strings);
> + tuplestore_puttuple(tupstore, tuple);
No need to form a tuple; use tuplestore_putvalues here.
> + }
> +
> + /* close "file" */
> + if (copy_state.is_program)
> + {
> + int pclose_rc;
> +
> + pclose_rc = ClosePipeStream(copy_state.copy_file);
> + if (pclose_rc == -1)
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close pipe to external command: %m")));
> + else if (pclose_rc != 0)
> + ereport(ERROR,
> + (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
> + errmsg("program \"%s\" failed",
> + copy_state.filename),
> + errdetail_internal("%s", wait_result_to_str(pclose_rc))));
> + }
> + else
> + {
> + if (copy_state.filename != NULL && FreeFile(copy_state.copy_file))
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not close file \"%s\": %m",
> + copy_state.filename)));
> + }
I wonder if these should be an auxiliary function in copy.c to do this.
Surely copy.c itself does pretty much the same thing ...
I don't understand why do we have all these checks. Can't we just pass
the values to COPY and let it apply the checks? That way, when COPY is
updated to support multibyte escape chars (for example) we don't need to
touch this code. Together with removing the unneeded braces that would
make these stanzas about six lines long instead of fifteen.If I understand you correctly, COPY (via BeginCopyFrom) itself relies on having a relation in pg_class to reference for attributes.In this case, there is no such relation. So I'd have to fake a relcache entry, or refactor BeginCopyFrom() to extract a ReturnSetInfo from the Relation and pass that along to a new function BeginCopyFromReturnSet. I'm happy to go that route if you think it's a good idea.
> + tuple = heap_form_tuple(tupdesc,values,nulls);
> + //tuple = BuildTupleFromCStrings(attinmeta, field_strings);
> + tuplestore_puttuple(tupstore, tuple);
No need to form a tuple; use tuplestore_putvalues here.Good to know!
I wonder if these should be an auxiliary function in copy.c to do this.
Surely copy.c itself does pretty much the same thing ...Yes. This got started as a patch to core because not all of the parts of COPY are externally callable, and aren't broken down in a way that allowed for use in a SRF.I'll get to work on these suggestions.
Some issues:
- This function will have the same difficulties as adding the program option did to file_fdw: there's very little we can reference that isn't os/environment specific
- Inline (STDIN) prompts the user for input, but gives the error: server sent data ("D" message) without prior row description ("T" message). I looked for a place where the Relation was consulted for the row description, but I'm not finding it.
Вложения
On Wed, Jan 25, 2017 at 08:51:38PM -0500, Corey Huinker wrote: > I've put in some more work on this patch, mostly just taking Alvaro's > suggestions, which resulted in big code savings. The patch applies atop master. The test (ls) which previously crashed the backend now doesn't, and works in a reasonable way. The description of the function still talks about its being a proof of concept. There are still neither regression tests nor SGML documentation. Are we at a point where we should add these things? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 1/25/17 8:51 PM, Corey Huinker wrote:
> # select * from copy_srf('echo "x\ty"',true) as t(x text, y text);
I find these parameters weird. Just looking at this, one has no idea
what the "true" means. Why not have a "filename" and a "program"
parameter and make them mutually exclusive?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/27/17 8:07 PM, David Fetter wrote: > There are still neither regression tests nor SGML documentation. > > Are we at a point where we should add these things? One could argue about the documentation at this point, since the function is somewhat self-documenting for the time being. But surely there should be tests. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/25/17 8:51 PM, Corey Huinker wrote:
> # select * from copy_srf('echo "x\ty"',true) as t(x text, y text);
I find these parameters weird. Just looking at this, one has no idea
what the "true" means. Why not have a "filename" and a "program"
parameter and make them mutually exclusive?
I suppose I could have written it as:
select * from copy_srf(filename => 'echo "x\ty"', is_program => true) as t(x text, y text);
Options at this point are:
4. Someone else's better idea.
On Tue, Jan 31, 2017 at 3:05 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
> On Fri, Jan 27, 2017 at 9:42 PM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>>
>> On 1/25/17 8:51 PM, Corey Huinker wrote:
>> > # select * from copy_srf('echo "x\ty"',true) as t(x text, y text);
>>
>> I find these parameters weird. Just looking at this, one has no idea
>> what the "true" means. Why not have a "filename" and a "program"
>> parameter and make them mutually exclusive?
>
>
> It was done that way to match the parameters of BeginCopyFrom() and it could
> easily be changed.
>
> I suppose I could have written it as:
>
> select * from copy_srf(filename => 'echo "x\ty"', is_program => true) as t(x
> text, y text);
>
>
> But this goes to the core of this patch/poc: I don't know where we want to
> take it next.
>
> Options at this point are:
> 1. Continue with a SRF, in which case discussions about parameters are
> completely valid.
> 2. Add a RETURNING clause to COPY. This would dispense with the parameters
> problem, but potentially create others.
> 3. Add the TupDesc parameter to BeginCopyFrom, make sure all data structures
> are visible to an extension, and call it a day. If someone wants to write an
> extension that makes their own copy_srf(), they can.
> 4. Someone else's better idea.
Here is a 4: Refactoring BeginCopyFrom so as instead of a Relation are
used a TupleDesc, a default expression list, and a relation name. You
could as well make NextCopyFrom() smarter so as it does nothing if no
expression contexts are given by the caller, which is the case of your
function here. To be honest, I find a bit weird to use
NextCopyFromRawFields when there is already a bunch of sanity checks
happening in NextCopyFrom(), where you surely want to have them
checked even for your function.
Looking at the last patch proposed, I find the current patch proposed
as immature, as rsTupDesc basically overlaps with the relation a
caller can define in BeginCopyFrom(), so marking this patch as
"returned with feedback".
--
Michael
Here is a 4: Refactoring BeginCopyFrom so as instead of a Relation are
used a TupleDesc, a default expression list, and a relation name. You
could as well make NextCopyFrom() smarter so as it does nothing if no
expression contexts are given by the caller, which is the case of your
function here. To be honest, I find a bit weird to use
NextCopyFromRawFields when there is already a bunch of sanity checks
happening in NextCopyFrom(), where you surely want to have them
checked even for your function.
Looking at the last patch proposed, I find the current patch proposed
as immature, as rsTupDesc basically overlaps with the relation a
caller can define in BeginCopyFrom(), so marking this patch as
"returned with feedback".
--
Michael
I like that suggestion and will move forward on it. I wasn't sure there would be support for such a refactoring.
As for the copy from stdin case, Andrew Gierth laid out why that's nearly impossible to unwind in the network protocol (the act of starting the copy causes the query result to end before any rows were returned), and that might make STDIN input a dead issue.