Обсуждение: COPY FROM STDIN

Поиск
Список
Период
Сортировка

COPY FROM STDIN

От
Luke Coldiron
Дата:
Is there a way to achieve the performance of the COPY FROM STDIN command within a C extension function connected to the db connection that called the C function? I have text that I would like to receive as input to a C function that contains many COPY command statements in the file that would be parsed similar to how psql would handle the file but I don't want to shell out to psql as I need to do all of this work on the db connection that the function was called from as there are other commands that I need to perform as well after before and after handling the COPY commands on this same connection. I would like the unit of work to be all or nothing and have the performance of the COPY FROM STDIN command and not break things out into SELECT INTO or INSERT statements for performance.

Ideally I would like to be able to attach to the calling db connection via SPI_connect() and then use the libpq library to issue the copy commands via PQputCopyData, PQputCopyEnd.

Re: COPY FROM STDIN

От
Jim Nasby
Дата:
On 1/4/16 12:18 PM, Luke Coldiron wrote:
> Is there a way to achieve the performance of the COPY FROM STDIN command
> within a C extension function connected to the db connection that called
> the C function? I have text that I would like to receive as input to a C
> function that contains many COPY command statements in the file that
> would be parsed similar to how psql would handle the file but I don't
> want to shell out to psql as I need to do all of this work on the db
> connection that the function was called from as there are other commands
> that I need to perform as well after before and after handling the COPY
> commands on this same connection. I would like the unit of work to be
> all or nothing and have the performance of the COPY FROM STDIN command
> and not break things out into SELECT INTO or INSERT statements for
> performance.
>
> Ideally I would like to be able to attach to the calling db connection
> via SPI_connect() and then use the libpq library to issue the copy
> commands via PQputCopyData, PQputCopyEnd.

C functions can use SPI, so I'm not sure what the issue is?

http://www.postgresql.org/docs/9.5/static/spi.html

(BTW, you'll want to scroll to the bottom of that page...)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: COPY FROM STDIN

От
Luke Coldiron
Дата:
> On 1/4/16 12:18 PM, Luke Coldiron wrote:
> > Is there a way to achieve the performance of the COPY FROM STDIN command
> > within a C extension function connected to the db connection that called
> > the C function? I have text that I would like to receive as input to a C
> > function that contains many COPY command statements in the file that
> > would be parsed similar to how psql would handle the file but I don't
> > want to shell out to psql as I need to do all of this work on the db
> > connection that the function was called from as there are other commands
> > that I need to perform as well after before and after handling the COPY
> > commands on this same connection. I would like the unit of work to be
> > all or nothing and have the performance of the COPY FROM STDIN command
> > and not break things out into SELECT INTO or INSERT statements for
> > performance.
> >
> > Ideally I would like to be able to attach to the calling db connection
> > via SPI_connect() and then use the libpq library to issue the copy
> > commands via PQputCopyData, PQputCopyEnd.
>
> C functions can use SPI, so I'm not sure what the issue is?
>
> http://www.postgresql.org/docs/9.5/static/spi.html
>
> (BTW, you'll want to scroll to the bottom of that page...)
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://bluetreble.com/

I should probably back the boat up. I'm not too familiar with the
inter-workings of postgres. I have a general understanding of the SPI API
and realize that other c functions can be called. Part of the problem is I'm
not sure which ones. The first thing I am trying to figure out is if I can
perform a COPY FROM STDIN command via a C extension function using the SPI.
From what I read in the documentation it seemed to indicate that this may
not be possible
(http://www.postgresql.org/docs/9.3/static/spi-spi-execute.html).

Here is what I have tried thus far.

#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"
#include "commands/copy.h"

// Attempt #1
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   int spi_status = SPI_execute("COPY public.test_table(val1, val2) FROM
stdin;" /* command */,
                                false, /* read_only */
                                0     /* count */);
   if (spi_status != SPI_OK_SELECT)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
errmsg_internal("Failed: %s.", SPI_result_code_string(spi_status))));
   }

   SPI_finish();

   PG_RETURN_VOID();
}

When I run the function above I get this error:  ERROR:  Failed:
SPI_ERROR_COPY. Which is what I would expect from the documentation.

However if I try something like this:

// Attempt #2
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   SPIPlanPtr plan = SPI_prepare("COPY public.test_table(val1, val2) FROM
stdin;" /* command */, 0 /* nargs */, NULL /* argtypes*/);
   if (plan == NULL)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
errmsg_internal("Failed: %s.", SPI_result_code_string(SPI_result))));
   }

   SPI_finish();

   PG_RETURN_VOID();
}

It works, although I really haven't got to the point of doing anything. It
looked like the "commands/copy.h" has the interface that I want to use but
I'm not sure if I am even going about it the correct way. Here roughly what
I am thinking I need to do.

// Attempt #3
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   SPIPlanPtr plan = SPI_prepare("COPY public.test_table(val1, val2) FROM
stdin;" /* command */, 0 /* nargs */, NULL /* argtypes*/);
   if (plan == NULL)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
errmsg_internal("Failed: %s.", SPI_result_code_string(SPI_result))));
   }

   uint64 processed;
   Oid table_oid = DoCopy(const CopyStmt *stmt,
                          "COPY public.test_table(val1, val2) FROM stdin;"
/* queryString */,
                          &processed /* processed */);

   //CopyState cstate = BeginCopyFrom(table_oid /* rel */,
                                                                      NULL
/* filename */,
                                                  false /* is_program */,
                                                                      List
*attnamelist,
                                                                      List
*options);

  // Somehow

   // End the copy command
   // EndCopyFrom(cstate);

    // TODO: Make use of the callback
   // extern void CopyFromErrorCallback(void *arg);
   // extern DestReceiver *CreateCopyDestReceiver(void);

   SPI_finish();

   PG_RETURN_VOID();
}

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I
need to go about this entirely different. Any advice on the matter would be
much appreciated.




Re: COPY FROM STDIN

От
Jim Nasby
Дата:
On 1/6/16 9:45 PM, Luke Coldiron wrote:
> In the example above I'm not sure if I can use some sub struct of the
> SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I
> need to go about this entirely different. Any advice on the matter would be
> much appreciated.

I don't know off-hand. I suggest you look at what psql does to implement
\copy (note the \).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: COPY FROM STDIN

От
Luke Coldiron
Дата:
> On 1/6/16 9:45 PM, Luke Coldiron wrote:
> > In the example above I'm not sure if I can use some sub struct of the
> > SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or
> > if I need to go about this entirely different. Any advice on the
> > matter would be much appreciated.
>
> I don't know off-hand. I suggest you look at what psql does to implement
\copy (note the \).
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in
Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in
Treble! http://BlueTreble.com

I took a look at the \copy but as far as I can tell this is using the
frontend (client side) of postgresql and I need to be able to do this on the
backend (server side). I don't see a way with this either to attach to the
connection that called the c function and execute the copy statement. The
DoCopy in commands/copy.h appears to me to be the server side copy command.
And I think I can get it to work if I can figure out how to attach to the
connection of the calling function and pass in the CopyStmt variable.


Re: COPY FROM STDIN

От
Jim Nasby
Дата:
On 1/8/16 10:37 AM, Luke Coldiron wrote:
>> On 1/6/16 9:45 PM, Luke Coldiron wrote:
>>> In the example above I'm not sure if I can use some sub struct of the
>>> SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or
>>> if I need to go about this entirely different. Any advice on the
>>> matter would be much appreciated.
>>
>> I don't know off-hand. I suggest you look at what psql does to implement
> \copy (note the \).
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in
> Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in
> Treble! http://BlueTreble.com
>
> I took a look at the \copy but as far as I can tell this is using the
> frontend (client side) of postgresql and I need to be able to do this on the

Oops.

> backend (server side). I don't see a way with this either to attach to the
> connection that called the c function and execute the copy statement. The
> DoCopy in commands/copy.h appears to me to be the server side copy command.
> And I think I can get it to work if I can figure out how to attach to the
> connection of the calling function and pass in the CopyStmt variable.

That sounds... bad.

I think the way to handle this is to add a new type to CopyDest and
modify CopyGetData() and CopySendEndOfRow() accordingly.

It might be tempting to use CopyState->filename as the pointer to a
StringInfoData (StringInfo), but I'm not sure that's a great idea. I
think it'd be better to add a new field to CopyStateData.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com