Обсуждение: PQputCopyData dont signal error

Поиск
Список
Период
Сортировка

PQputCopyData dont signal error

От
Pavel Stehule
Дата:
Hello

I have a problem with PQputCopyData function. It doesn't signal some error.
    while ((row = mysql_fetch_row(res)) != NULL)    {        snprintf(buffer, sizeof(buffer), "%s%s\n", row[0],
row[1]);       copy_result = PQputCopyData(pconn, buffer, strlen(buffer));        printf(">>%s<<\n",
PQerrorMessage(pconn));       printf("%d\n", copy_result);        if (copy_result != 1)        {
fprintf(stderr,"Copy to target table failed: %s",                    PQerrorMessage(pconn));            EXIT;        }
 }
 

it returns 1 for broken values too :(

Is necessary some special check?

Regards

Pavel Stehule


Re: PQputCopyData dont signal error

От
Pavel Stehule
Дата:
Hello


>
> The way COPY works is that PQputCopyData just sends the data to the server,
> and the server will buffer it in its internal buffer and processes it when
> it feels like it. The PQputCopyData() calls don't even need to match line
> boundaries.
>

Yes, it is current behave - then documentation is obsolete

> I think you'll need to send all the data and finish the COPY until you get
> an error. If you have a lot of data to send, you might want to slice it into
> multiple COPY statements of say 50MB each, so that you can catch errors in
> between.

:( I wold to import table in one statement

Regards

Pavel Stehule



>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>


Re: PQputCopyData dont signal error

От
Heikki Linnakangas
Дата:
On 14.04.2011 10:15, Pavel Stehule wrote:
> Hello
>
> I have a problem with PQputCopyData function. It doesn't signal some error.
>
>         while ((row = mysql_fetch_row(res)) != NULL)
>         {
>             snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
>             copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
>             printf(">>%s<<\n", PQerrorMessage(pconn));
>             printf("%d\n", copy_result);
>             if (copy_result != 1)
>             {
>                 fprintf(stderr, "Copy to target table failed: %s",
>                         PQerrorMessage(pconn));
>                 EXIT;
>             }
>         }
>
> it returns 1 for broken values too :(
>
> Is necessary some special check?

The way COPY works is that PQputCopyData just sends the data to the 
server, and the server will buffer it in its internal buffer and 
processes it when it feels like it. The PQputCopyData() calls don't even 
need to match line boundaries.

I think you'll need to send all the data and finish the COPY until you 
get an error. If you have a lot of data to send, you might want to slice 
it into multiple COPY statements of say 50MB each, so that you can catch 
errors in between.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: PQputCopyData dont signal error

От
Jeff Davis
Дата:
On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
> On 14.04.2011 10:15, Pavel Stehule wrote:
> > Hello
> >
> > I have a problem with PQputCopyData function. It doesn't signal some error.
> >
> >         while ((row = mysql_fetch_row(res)) != NULL)
> >         {
> >             snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
> >             copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
> >             printf(">>%s<<\n", PQerrorMessage(pconn));
> >             printf("%d\n", copy_result);
> >             if (copy_result != 1)
> >             {
> >                 fprintf(stderr, "Copy to target table failed: %s",
> >                         PQerrorMessage(pconn));
> >                 EXIT;
> >             }
> >         }
> >
> > it returns 1 for broken values too :(
> >
> > Is necessary some special check?
> 
> The way COPY works is that PQputCopyData just sends the data to the 
> server, and the server will buffer it in its internal buffer and 
> processes it when it feels like it. The PQputCopyData() calls don't even 
> need to match line boundaries.
> 
> I think you'll need to send all the data and finish the COPY until you 
> get an error. If you have a lot of data to send, you might want to slice 
> it into multiple COPY statements of say 50MB each, so that you can catch 
> errors in between.

[ replying to old thread ]

According to the protocol docs[1]:

"In the event of a backend-detected error during copy-in mode (including
receipt of a CopyFail message), the backend will issue an ErrorResponse
message. If the COPY command was issued via an extended-query message,
the backend will now discard frontend messages until a Sync message is
received, then it will issue ReadyForQuery and return to normal
processing. If the COPY command was issued in a simple Query message,
the rest of that message is discarded and ReadyForQuery is issued. In
either case, any subsequent CopyData, CopyDone, or CopyFail messages
issued by the frontend will simply be dropped."

If the remaining CopyData messages are dropped, I don't see why
PQputCopyData can't return some kind of error indicating that further
CopyData messages are useless so that it can stop sending them.

Asking the client to break the copy into multiple COPY commands is bad,
because then the client needs to figure out the line breaks, which is a
burden in many cases.

Certainly we don't want to *guarantee* that the backend will issue an
error at any particular point, because of the buffering on the server
side. But from a practical standpoint, the server will let the client
know fairly quickly and it will avoid a lot of client-side work and
network traffic.

Would a change to PQputCopyData be welcome?

Regards,Jeff Davis

[1] http://www.postgresql.org/docs/9.3/static/protocol-flow.html






Re: PQputCopyData dont signal error

От
Tom Lane
Дата:
Jeff Davis <pgsql@j-davis.com> writes:
> On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
>> I think you'll need to send all the data and finish the COPY until you 
>> get an error. If you have a lot of data to send, you might want to slice 
>> it into multiple COPY statements of say 50MB each, so that you can catch 
>> errors in between.

> If the remaining CopyData messages are dropped, I don't see why
> PQputCopyData can't return some kind of error indicating that further
> CopyData messages are useless so that it can stop sending them.

An error from PQputCopyData indicates a connection-level problem,
typically; that is it means that it couldn't send data not that the server
had detected a problem with some earlier data.  I'd not be surprised
if applications would respond to such an error indication by just
pre-emptively dropping the connection.  (A quick trawl through our
own code shows a precedent in pg_restore, in fact.)  So the idea
of having PQputCopyData start failing as soon as an error has arrived
from the server doesn't sound that attractive.

What'd be safer is to provide a way of detecting whether an error
has arrived (without actually consuming it, of course) so that the
case could be handled by adding something like
if (PQerrorIsPending(conn))    break;

to the send-data loop.  This would allow the application code to know
what is happening.  It would also not impose the penalty of checking
for errors on apps that prefer to optimize for the no-error case.

A different approach would be for PQputCopyData to just internally
suppress sending of further CopyData messages, *without* returning
an error; this would be transparent and it would save the network
traffic at least.  However, my reading of the original example is
that the OP was at least as interested in suppressing the creation
of further data as in suppressing the sending, so this wouldn't
really solve his problem completely.  And it's also optimizing
for the error case not the no-error case, in a way that doesn't give
the app any say in the matter.  So I'm not for this particularly,
just mentioning it for completeness.
        regards, tom lane



Re: PQputCopyData dont signal error

От
steve k
Дата:
I realize this is an old thread, but seems to be the only discussion I can
find on this topic "I have a problem with PQputCopyData function. It doesn't
signal some error. "  

I am using from within a c++ program:      PQexec(m_pConn, "COPY... ...FROM stdin"), 
    followed by PQputCopyData(m_pConn, ssQuery.str().c_str(),
ssQuery.str().length()), 
    finished with PQputCopyEnd(m_pConn, NULL).  

Everything that the gentleman that started this thread discussed is still in
force in PostGres 9.3.  Specifically that if some data anomaly within the
data being copied causes the copy to fail, there is no notice, no return
code change, and no way to know if this has happened until you examine the
actual table being copied to itself and see that in fact the copy didn't
work.  

Does anyone know of a way to find out if a copy went wrong so an error can
be posted and the condition noted without having to check the table and
count the records and compare the actual count against how many were
expected to present after the copy.  

When one uses copy from the command line you get very useful feedback about
the quality of your data such as: 
    postgres=# COPY dr_cpdlc(id_pk, fk_guid_machine_run_info_lk, sim_time,
wall_time,...    Enter data to be copied followed by a newline.    End with a backslash and a period on a line by
itself.   >>
 
988|4789|1394686027.050000|1394686027.049000|ASTOR|RECV|ATOS|NASA02|4|47a|7...    >> \.    ERROR:  invalid input syntax
forinteger: "47a"    CONTEXT:  COPY dr_cpdlc, line 1, column minute_of_the_time: "47a"    postgres=# 
 
Does such error reporting functionality exist for using copy from within a
c++ program?  

Does anyone have any ideas or know if this is being worked on?  Am I
completely missing something obvious here?  If not I feel this is a feature
shortcoming that could hopefully be addressed if it hasn't already.  It
prevents PostGres copy from being reliable for use when writing data during
a live simulation run.  

Imagine getting approximately 100 or so people together including Air
Traffic Controllers, Pilots, Researchers, Radio operators, and various other
support staff.  They spend all day (or days) running through various
scenarios which generate say 20 frames a second of data for say a few
hundred simulated aircraft plus all the simulated radio traffic and other
com related data.  After all these people have flown back to where they came
from you notice some missing data.  ---OOPs! :) Now what?  What do you say
to these people that put this experiment together??  Sorry?  

During these simulation runs, things break since each new scenario is
testing a new hypothesis, and people expect to have to stop, recalibrate,
tune/tweak code/settings, etc before the final run and actual data
gathering.  A data anomaly suddenly occurring would be a situation that all
parties would want to resolve before going further.  If you could capture
this event (copy failing) and write it to a log, the source of the bad data
could be addressed before going further.  As things stand currently the vast
reams of data might easily hide the fact that a few thousand or 10,000
records were missing until it was too late to do a rerun.  

Effectively, this precludes the use of PostGres in support of harvesting
data during a simulation run unless someone knows a quicker way to jam data
into a set of relational tables without falling out of realtime.  Sorry to
be so negative but I find it amazing that an rdbms engine as robust as
PostGres seems to have this gaping hole in its capabilities and potential
utilization.  Coming from an Oracle world I was delighted to see the full
functionality offered by PostGres that seems just as good and perhaps better
(no listeners to troubleshoot, no pfile vs. spfile interactions from unusual
shutdowns etc...)

I haven't tried using inserts yet with multiple "values" clauses but have
read this is possible.  However, once the planner gets involved, speed drops
noticeably.  

Thanks to everyone for their time or any feedback.  

Regards, 
Steve K. 





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5797826.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
Tom Lane
Дата:
steve k <steven.c.kohler@nasa.gov> writes:
> I realize this is an old thread, but seems to be the only discussion I can
> find on this topic "I have a problem with PQputCopyData function. It doesn't
> signal some error. "  

PQputCopyData/PQputCopyEnd are only concerned with transferring data.
After you're done with that, you should call PQgetResult to find out the
COPY command's completion status.  You should get PGRES_COMMAND_OK,
otherwise it'll be an error result.

> Does anyone have any ideas or know if this is being worked on?  Am I
> completely missing something obvious here?

Um ... you could have read the manual:
http://www.postgresql.org/docs/9.3/static/libpq-copy.html
Or looked at the source code for psql, which as you noted has no problem
displaying errors for COPY.
        regards, tom lane



Re: PQputCopyData dont signal error

От
David Johnston
Дата:
steve k wrote
> I realize this is an old thread, but seems to be the only discussion I can
> find on this topic "I have a problem with PQputCopyData function. It
> doesn't signal some error. "  
> 
> I am using from within a c++ program:  
>      PQexec(m_pConn, "COPY... ...FROM stdin"), 
> 
>      followed by PQputCopyData(m_pConn, ssQuery.str().c_str(),
> ssQuery.str().length()), 
> 
>      finished with PQputCopyEnd(m_pConn, NULL).  

This may be over simplified but...

To summarize here the PQexec needs a matching PQgetResult.  The PQputCopyEnd
only "closes" the preceding PQputCopyData.  The server is not compelled to
process the copy data until the client says that no more data is coming. 
Once the PQputCopyEnd has returned for practical purposes you back to the
generic command handling API and need to proceed as you would for any other
command - including being prepared to wait for long-running commands and
handle errors.

The request in this thread is for some means for the client to send partial
data and if the server has chosen to process that data (or maybe the client
can compel it to start...) AND has encountered an error then the client can
simply abort the rest of the copy and return an error.  The current API
return values deal with the results of the actual call just performed and
not with any pre-existing state on the server.  Tom's suggestion is to add a
polling method to query the current server state for those who care to
expend the overhead instead of imposing that overhead on all callers.

The C API seems strange to me, a non-C programmer, but at a cursory glance
it is at least internally consistent and does provide lots of flexibility
(especially for sync/async options) at the cost of complexity and having to
make sure that you code in the appropriate PQgetResult checks or risk
ignoring errors and reporting success incorrectly.

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5797912.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
steve k
Дата:
<http://postgresql.1045698.n5.nabble.com/file/n5798002/PG_man_excerpt.png> 


These were my results:  

<http://postgresql.1045698.n5.nabble.com/file/n5798002/PG_embedded_copy_log_excerpt.png> 


I'd advise anyone contemplating using this feature to seriously seriously
seriously test this and examine your logs after each test run before you
move this feature into your baseline.  Maybe you'll have better luck than I
did.  

For what it's worth I got very good performance results from using INSERT
with multiple values clauses that inserted 1000 records at a time.  

For example on one error test (of many) purposefully attempting to insert
alphabetic data into a numeric field yielded explicit, correct information
about the exact line of data causing the error within the 1000 records
attempting to be inserted.  With this information in hand it would be
eminently feasible to go back to the baseline and examine any recent source 
code updates that might have altered the generation of the data that caused
an error like this.  

Hopefully this helps anyone trying to handle large amounts of data quickly
and wondering what a viable solution might be.  

Best regards to everyone and thank you all for your time,  
Steve K.  



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798002.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
Andrew Dunstan
Дата:
On 03/31/2014 10:18 AM, steve k wrote:
> <http://postgresql.1045698.n5.nabble.com/file/n5798002/PG_man_excerpt.png>
>
>
> These were my results:
>
> <http://postgresql.1045698.n5.nabble.com/file/n5798002/PG_embedded_copy_log_excerpt.png>
>
>
> I'd advise anyone contemplating using this feature to seriously seriously
> seriously test this and examine your logs after each test run before you
> move this feature into your baseline.  Maybe you'll have better luck than I
> did.


You appear not to have understood what you have been told. Specifically 
Tom Lane's remarks:


> PQputCopyData/PQputCopyEnd are only concerned with transferring data.
> After you're done with that, you should call PQgetResult to find out the
> COPY command's completion status.  You should get PGRES_COMMAND_OK,
> otherwise it'll be an error result.



Neither of these is supposed to return -1 if there's a COPY error, only 
if there's a data transfer error.

Plenty of people have made this work just fine.

cheers

andrew



Re: PQputCopyData dont signal error

От
steve k
Дата:
Am I to understand then that I should expect no error feedback if copy fails
because of something like attempting to insert alphabetic into a numeric?  

I apologize for my ignorance, but all my return codes were always successful
(PGRES_COMMAND_OK) even if nothing was copied due to garbage data.  Also,
calling PQgetResult never returned any information either because everything
was always PGRES_COMMAND_OK.  

If that's what is supposed to happen then I have completely missed the boat
and apologize for wasting everyone's time.  

The exact same garbage data test returned specific error related information
if the copy is done with the same sql query from the command line.  This is
what I was trying to get to happen with the embedded sql so that in the
event of bad data, you could stop and determine the source of the bad data,
and not go on assuming you were successfully copying only to find later that
1000 recs here and a 1000 records there were never inserted/copied and to
have no idea or log messages as to why.  

In a sense, I have made this work just fine.  It compiled, ran, and copied
data - as long as all the data was perfect.  I was trying set up error
handling to notify in case there was invalid data contained within the
records being copied.  

Again, I apologize if I was unclear and further apologize for irritating
everyone that has replied thus far.  Thank you all for your time and best
regards.  I am examining other ways to do mass inserts/writes that allow for
notification if some of the data contained within for some reason fails to
copy/insert so that the cause of the bad data can be examined and remedied
as soon as it occurs as well as writing the offending data to a log so that
not all of it is lost.  

Regards, 
Steve K.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798032.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
Alvaro Herrera
Дата:
steve k wrote:

> I am examining other ways to do mass inserts/writes that allow for
> notification if some of the data contained within for some reason fails to
> copy/insert so that the cause of the bad data can be examined and remedied
> as soon as it occurs as well as writing the offending data to a log so that
> not all of it is lost.  

Have you looked at PGLoader?
https://github.com/dimitri/pgloader

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: PQputCopyData dont signal error

От
David Johnston
Дата:
steve k wrote
> Am I to understand then that I should expect no error feedback if copy
> fails because of something like attempting to insert alphabetic into a
> numeric?  
> 
> I apologize for my ignorance, but all my return codes were always
> successful (PGRES_COMMAND_OK) even if nothing was copied due to garbage
> data.  Also, calling PQgetResult never returned any information either
> because everything was always PGRES_COMMAND_OK.  
> 
> If that's what is supposed to happen then I have completely missed the
> boat and apologize for wasting everyone's time.  

In your example you successfully sent an error message to the server and so
PQputCopyEnd does not report an error since it did what it was asked to do. 
Later, when you finish the copy and ask for the error message, you probably
will get the same message that you sent here but you may get a different one
depending on whether the server encountered any other errors before you sent
the explicit error.  Regardless of the message you are guaranteed to get
back an error after calling PQgetResult.

It seems to me that you need to supply a simple C program - along with a
test file that you expect to fail - that never reports an error so that
others may evaluate actual code.  The likelihood of this NOT being human
error is small so we really have to see your actual code since that is most
probably the culprit.

Note that, as Tom mentioned, psql is open source.  If you are trying to
duplicate its behavior in your own code you should probably look to see what
it does.  The fact that you get the proper errors in some cases means that
the server is quite obviously capable of working in the manner you desire
and thus the client - again your specific software - it where any issue
likely resides.

Again, from what I'm reading PQputCopy(Data|End) will not report on data
parsing issues.  You will only see those after issuing PQgetResult - which
is noted in the last paragraph of the PQputCopyEnd documentation excerpt you
provided.  The "put" commands only report whether the sending of the data
was successful.



David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798036.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
steve k
Дата:
Hi Alvaro, 

Thanks for the prompt response.  PGLoader looks like an awesome project and
I especially liked this part:  

/"Handling PostgreSQL errors

Some data will get rejected by PostgreSQL, even after being carefully
prepared by the transformation functions you can attach to pgloader. Then
pgloader parses the PostgreSQL CONTEXT error message that contains the line
number in the batch of where the error did happen.

It's then easy enough to *resend the all the rows from the batch that are
located before the error, skip and log as rejected the faulty row, and
continue*, handling eventual next errors the same way. /"  

I didn't see anything in the documentation about binary files and that is
unfortunately the only thing I have for input currently.  They used binary
files because that was the fastest way to write the data for each of the
frames in the sim without falling out of realtime.  

We're trying to bring some parts of this project more up to date with the
ultimate goal of being able to write directly to the database itself without
falling out of realtime and developing a dashboard for researchers to
monitor during experiments and simulation runs.  

Thanks again for the suggestion and I'll definitely keep PGLoader in mind as
things unfold here on this project.  

Best Regards, 
Steve K.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798038.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
steve k
Дата:
I started with this:  

DBInsert_excerpts6_test_cpdlc.cpp
<http://postgresql.1045698.n5.nabble.com/file/n5798049/DBInsert_excerpts6_test_cpdlc.cpp>  

Due to a cut and paste error I was originally querying the wrong sequence
which led to me wondering why data wasn't in the table that the log reported
as having successfully been written.  After I fixed the sequence query error
and got done laughing at myself for being so ignorant I realized I better
see what happened if there was actually bad data and not just user
stupidity.  

The data is in this log file excerpt.    raptor_efb2_excerpts.out
<http://postgresql.1045698.n5.nabble.com/file/n5798049/raptor_efb2_excerpts.out>  
This excerpt is from a run that copied as expected with perfect data.  

The test was to modify line 36 in the code excerpt from: 
<<                               "|" << dataRec.getMinute()
to:              
<<                               "|" << dataRec.getMinute() << "a"

which put alphabetic data into the data which should have been rejected as
nonnumeric.  No error was raised and the log indicated that there should be
20 rows of data in the cpdlc table except there weren't.  

By the time I gave up on this I had so many friggin log statements in these
2 methods that there was more logging than actual useful code.  Results
never changed though.  No errors and no messages when I appended the "a"
onto numeric data destined for a numeric field.  This was naturally
unsettling so I turned to message boards to see if anyone had similar
experiences or knew a work around or better way.  I'm taking the time to
give you all this information because I want to learn from it despite the
potential for acute public embarassment.  Maybe others will get something
too.  It's how we all get ahead.  At any rate, thanks for your time and I
look forward to what you find.  

Steve K.   




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798049.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
David Johnston
Дата:
steve k wrote
> I started with this:  
> DBInsert_excerpts6_test_cpdlc.cpp
> <http://postgresql.1045698.n5.nabble.com/file/n5798049/DBInsert_excerpts6_test_cpdlc.cpp>  
> 

Can you point out to me where in that code you've followed this instruction
from the documentation:

"After successfully calling PQputCopyEnd, call PQgetResult to obtain the
final result status of the COPY command. One can wait for this result to be
available in the usual way. Then return to normal operation."

Since I cannot seem to find it.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798077.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
steve k
Дата:
David, 

The code I posted is what I started/finished with.  As I tried to figure
things out I kept adding PQgetResult, PQresultStatus and
PQresultErrorMessage  calls and writing the results to the log file.  By the
time I was done it was so messy that I stripped all that messaging/logging
back out and made sure it ran with good data and moved on to something else
because I couldn't waste any more time on it and figured I'd look it over
again if the insert with multiple values clauses was a bust too (which
thankfully it wasn't).  

When I got the multiple values clause insert replacements going and compared
processing times and error handling I dumped all the old debugging version
stuff because that's not the kind of thing you commit to version control. 
In the end I didn't think it mattered because I wasn't going to use any of
it since I had no way to know if bad data actually didn't get written by
virtue of its being bad data.  And no return code ever indicated anything
was awry. 

I'd love to see an actual working example where an executing C++ program was
able to in fact determine that copy data containing bad data that was sent
by the C++ program was rejected by the server and subsequently the C++
program was able to raise/log/notify specifically which block of data failed
and then log information about it.  However, all I ever got was
PGRES_COMMAND_OK whether or not the data that was sent was rejected as
containing bad data or not.  Effectively these were silent failures.  

Sorry I can't provide more information but I do appreciate your time.  If
you can't get any further with it I understand and don't expect another
reply.  

Regards, 
Steve K. 





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798104.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
David Johnston
Дата:
steve k wrote
> Sorry I can't provide more information but I do appreciate your time.  If
> you can't get any further with it I understand and don't expect another
> reply.  

For the benefit of others I'm reading this as basically you've found a
better way to do this so you are no longer concerned with correcting the
broken (incomplete) code you have provided.

It should be as simple as adding one more if statement between the copy-end
check and the overall failure check to see whether the copy command itself
failed in addition to the existing checks to see if sending the data or
ending the data send failed.

I will not pretend to write c code but the general logic and requirements
seems quite clear from the documentation I have read/shown.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798115.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
Michael Paquier
Дата:
On Tue, Apr 1, 2014 at 1:33 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> steve k wrote:
>
>> I am examining other ways to do mass inserts/writes that allow for
>> notification if some of the data contained within for some reason fails to
>> copy/insert so that the cause of the bad data can be examined and remedied
>> as soon as it occurs as well as writing the offending data to a log so that
>> not all of it is lost.
>
> Have you looked at PGLoader?
> https://github.com/dimitri/pgloader
Or pg_bulkload? It does exactly what you are looking for, aka
filtering rows that failed with COPY, on top of other things.
http://pgbulkload.projects.pgfoundry.org/pg_bulkload.html
-- 
Michael



Re: PQputCopyData dont signal error

От
Robert Haas
Дата:
On Mon, Mar 31, 2014 at 4:21 PM, steve k <steven.c.kohler@nasa.gov> wrote:
> I'd love to see an actual working example where an executing C++ program was
> able to in fact determine that copy data containing bad data that was sent
> by the C++ program was rejected by the server and subsequently the C++
> program was able to raise/log/notify specifically which block of data failed
> and then log information about it.  However, all I ever got was
> PGRES_COMMAND_OK whether or not the data that was sent was rejected as
> containing bad data or not.  Effectively these were silent failures.

With all respect, you're doing someting wrong.  There's plenty of
working C code that does just this, including some I have written.
You made the comment upthread that you found it "amazing that an rdbms
engine as robust as PostGres seems to have this gaping hole in its
capabilities" - and you're right, that would be remarkable.  It would
mean, for example, that users wouldn't be able to know whether their
backups restored OK.  But it turns out that psql and pg_restore can
detect this kind of problem just fine, which means your code should be
able to do the same, if it's written correctly.  So the problem is not
that PostgreSQL doesn't have this capability; it's that you have a bug
in your code.  I can't tell you what the bug is, because I haven't
seen or tried to analyze your code, but I *can* tell you that when
things work for other people and not for you, that's a sign that
you've goofed, not a sign that the feature is missing.

Admittedly, the libpq interface is somewhat confusing, and I often
find it necessary to refer to existing examples of code when trying to
figure out how to do things correctly.  We've been maintaining
backward compatibility for a really long time, and have accumulated a
number of warts along the way, and I'm not sure how much like the
current design things would look if we started over from scratch.  So
if you want to say, hey, this interface is confusing, and it's too
hard to figure out how to use it, I'd have a hard time disagreeing
with that.  But if you want to say, COPY error detection is impossible
under all circumstances because my code for COPY error detection
doesn't work, well, no.  Because psql and other utilities do the same
task just fine using the exact same interfaces.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PQputCopyData dont signal error

От
steve k
Дата:
Thanks Robert, 

I'm already there.  Obviously I'm the only one in the room that didn't get
the memo.  I've had some time to reflect on what might be done differently,
just not any time to try it.  If I get it to work I'll let everyone know. 
The code I was working with went away when the Network admins pushed
something that forced me to reboot and close all my temp file windows last
Friday.  Sorry for any troubles I've caused you all and I didn't mean to put
everyone on the defensive.  

It has occurred to me that I may have been examining the wrong results set. 
One of the things you mentioned is "I often find it necessary to refer to
existing examples of code when trying to figure out how to do things
correctly".  I couldn't agree more.  Haven't seen one yet, but found plenty
of discussion that tap danced around one or more of the components of the
copy, put, end paradigm.  Maybe I should have just asked for a sample code
snippet but didn't after a day or so of frustration and trying to piece
together other people's incomplete samples.  It seems that none of the
discussion threads I looked at (doesn't mean there aren't any - before
everyone gets worked up) where people  were having similar questions also
never offered a working solution.  So I don't know if those folks gave up or
figured it out on their own.   In the end it comes down to how much time do
you have to google, read through a thread, find out that discussion thread
really has nothing to do with your topic of interest, repeat, finally try
something different, repeat?  Again, my apologies for lighting a fire under
everyone.  

Regards, 
Steve K. 



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/PQputCopyData-dont-signal-error-tp4302340p5798202.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: PQputCopyData dont signal error

От
Robert Haas
Дата:
On Tue, Apr 1, 2014 at 12:14 PM, steve k <steven.c.kohler@nasa.gov> wrote:
> I'm already there.  Obviously I'm the only one in the room that didn't get
> the memo.  I've had some time to reflect on what might be done differently,
> just not any time to try it.  If I get it to work I'll let everyone know.
> The code I was working with went away when the Network admins pushed
> something that forced me to reboot and close all my temp file windows last
> Friday.  Sorry for any troubles I've caused you all and I didn't mean to put
> everyone on the defensive.

No problem.

> It has occurred to me that I may have been examining the wrong results set.

That definitely seems possible.  It is easier than it might be to mess
that up; there's really nothing in the API to warn you if you've made
that mistake.  And I've been there myself.

> One of the things you mentioned is "I often find it necessary to refer to
> existing examples of code when trying to figure out how to do things
> correctly".  I couldn't agree more.  Haven't seen one yet, but found plenty
> of discussion that tap danced around one or more of the components of the
> copy, put, end paradigm.  Maybe I should have just asked for a sample code
> snippet but didn't after a day or so of frustration and trying to piece
> together other people's incomplete samples.

FWIW, I've generally found that the best examples are what's in the
core distribution.  I'd go and look at a tool like psql or pg_restore
and find the code that handles this, and then copy it and cut it down
to what you need.  You could go around looking for other snippets on
the Internet that are more self-contained, but there's too much chance
that they're actually wrong.  The code that implements the existing
core tools is more likely to be good code - not that it can never have
any bugs, but it gets a lot of exercise in real-world deployments, so
if something really obvious like error detection is broken then we can
be pretty sure a user will complain.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PQputCopyData dont signal error

От
Martijn van Oosterhout
Дата:
On Tue, Apr 01, 2014 at 01:53:13PM -0400, Robert Haas wrote:
> > One of the things you mentioned is "I often find it necessary to refer to
> > existing examples of code when trying to figure out how to do things
> > correctly".  I couldn't agree more.  Haven't seen one yet, but found plenty
> > of discussion that tap danced around one or more of the components of the
> > copy, put, end paradigm.  Maybe I should have just asked for a sample code
> > snippet but didn't after a day or so of frustration and trying to piece
> > together other people's incomplete samples.
>
> FWIW, I've generally found that the best examples are what's in the
> core distribution.  I'd go and look at a tool like psql or pg_restore
> and find the code that handles this, and then copy it and cut it down
> to what you need.

To move the conversation along:

https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664

Seems possibly even more robust than most people will code, but it's
had a lot of real world testing.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.  -- Arthur Schopenhauer

Re: PQputCopyData dont signal error

От
Tom Lane
Дата:
Martijn van Oosterhout <kleptog@svana.org> writes:
> To move the conversation along:

> https://github.com/postgres/postgres/blob/master/src/bin/psql/copy.c#L664

> Seems possibly even more robust than most people will code, but it's
> had a lot of real world testing.

Note that the looping behavior there is actually rather new, and is
probably unnecessary for 99% of applications; it would depend on what
your ambitions are for dealing gracefully with connection loss (and
even then, I think we were forced into this by pre-existing decisions
elsewhere in psql about where and how we handle connection loss).

The important thing for this question is the error reporting that
occurs at lines 692-694.
        regards, tom lane