Обсуждение: initdb failure (was Re: [GENERAL] sequence's plpgsql)
Tim McAuley <mcauleyt@tcd.ie> writes:
> Another question now. I am unable to compile Postgresql 7.4 beta 3 under 
> cygwin (Windows 2K, using cgyipc 2).
> I am getting the error:
> "
> creating information schema... ERROR:  end-of-copy marker does not match 
> previous newline style
> CONTEXT:  COPY FROM, line 361
> "
That's interesting.  COPY is complaining because the \. terminator in
the file it's been fed has a different kind of newline after it than
the newlines earlier in the file (LF vs CR/LF, no doubt).  The part
of the initdb script that must be causing this is
 echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported,
comments)FROM STDIN;" cat "$datadir"/sql_features.txt echo "\."
 
) \| "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
echo "ok"
so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt.  A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer.  Peter, any thoughts?
        regards, tom lane
			
		On Fri, 2003-09-26 at 11:01, Tom Lane wrote: > so it appears that cygwin's "echo" generates a different newline style > than what got put into sql_features.txt. A possible way to fix this is > to put the "\." line into sql_features.txt, but maybe there's a cleaner > answer. Peter, any thoughts? Does cygwin still have the install time option of what type of line feed to use? I know at one point (a long time ago) when I installed cygwin, and chose windows line feeds (CRLF) that it caused problems with several applications. So the problem might be that with CYGWIN you could have either type of line feed depending on what the user selected during install.
On Fri, Sep 26, 2003 at 11:01:34AM -0400, Tom Lane wrote: > > I am getting the error: > > " > > creating information schema... ERROR: end-of-copy marker does not match > > previous newline style > > CONTEXT: COPY FROM, line 361 > > " > > That's interesting. COPY is complaining because the \. terminator in > the file it's been fed has a different kind of newline after it than > the newlines earlier in the file (LF vs CR/LF, no doubt). The part Well actually I'm not so sure. I tried installing 7.4 from Oliver's Debian packages and experienced exactly the same. We have yet to find out why, but I doubt newlines changed between his Debian system and mine. > of the initdb script that must be causing this is > > echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported,comments) FROM STDIN;" > cat "$datadir"/sql_features.txt > echo "\." > ) \ > | "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely > echo "ok" > > so it appears that cygwin's "echo" generates a different newline style > than what got put into sql_features.txt. A possible way to fix this is > to put the "\." line into sql_features.txt, but maybe there's a cleaner > answer. Peter, any thoughts? Putting the "\." line into sql_features.txt did help me, too. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Tom Lane writes: > so it appears that cygwin's "echo" generates a different newline style > than what got put into sql_features.txt. A possible way to fix this is > to put the "\." line into sql_features.txt, but maybe there's a cleaner > answer. Peter, any thoughts? There's no clean answer to this on Cygwin. This specific case is just a little problem that we could solve locally, but in general you'll just end up annoying people if you require them to use consistent line endings on Cygwin. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Tom Lane writes: > > > so it appears that cygwin's "echo" generates a different newline style > > than what got put into sql_features.txt. A possible way to fix this is > > to put the "\." line into sql_features.txt, but maybe there's a cleaner > > answer. Peter, any thoughts? > > There's no clean answer to this on Cygwin. This specific case is just a > little problem that we could solve locally, but in general you'll just end > up annoying people if you require them to use consistent line endings on > Cygwin. This error is coming from the new 7.4 COPY code that allows \r\n as a line terminator. Requiring the end-of-line to be consistent seemed to be the only way to be sure we were not eating a literal carriage return in the data stream. Let's put the "\." into sql_features.txt and see if that fixes it, or can we use echo -c "\.\n" in initdb? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> so it appears that cygwin's "echo" generates a different newline style
>> than what got put into sql_features.txt.  A possible way to fix this is
>> to put the "\." line into sql_features.txt, but maybe there's a cleaner
>> answer.  Peter, any thoughts?
> There's no clean answer to this on Cygwin.  This specific case is just a
> little problem that we could solve locally, but in general you'll just end
> up annoying people if you require them to use consistent line endings on
> Cygwin.
Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check.  I'm not very comfortable with that, but maybe we
should.  Bruce?
        regards, tom lane
			
		Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > Tom Lane writes: > >> so it appears that cygwin's "echo" generates a different newline style > >> than what got put into sql_features.txt. A possible way to fix this is > >> to put the "\." line into sql_features.txt, but maybe there's a cleaner > >> answer. Peter, any thoughts? > > > There's no clean answer to this on Cygwin. This specific case is just a > > little problem that we could solve locally, but in general you'll just end > > up annoying people if you require them to use consistent line endings on > > Cygwin. > > Yeah, I was wondering whether you wouldn't propose dropping the newline > consistency check. I'm not very comfortable with that, but maybe we > should. Bruce? I posted on that a few minutes ago. Yea, we can drop it, but we risk eating carraige returns as data values. I am not sure how consistently we output literal carriage returns in old dumps, nor how many apps produce on literal carriage returns in COPY. If we conditionally eat them, we run the risk of discarding some of their data without warning. Perhaps we can throw a warning rather than an error, and adjust initdb to be consistent. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: >I posted on that a few minutes ago. Yea, we can drop it, but we risk >eating carraige returns as data values. I am not sure how consistently >we output literal carriage returns in old dumps, nor how many apps >produce on literal carriage returns in COPY. If we conditionally eat >them, we run the risk of discarding some of their data without warning. >Perhaps we can throw a warning rather than an error, and adjust initdb >to be consistent. > > > Would the best longterm solution be to require escaping CR in a data value? (Yes I know this might cause backwards compatibility problems, at least for a while). cheers andrew
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Perhaps we can throw a warning rather than an error, and adjust initdb
> to be consistent.
I like the idea of reducing the newline consistency check to a warning.
There is one thing we'd have to watch for though (it's already an issue
but would become a bigger one): client-side COPY code had better be
prepared to absorb backend Notice messages while processing COPY IN.
Currently libpq doesn't read input data at all during a COPY IN loop,
which means that if the COPY generates more than a few K of warning
messages, the backend gets blocked on a full pipe and the whole
operation locks up.  I have been meaning to fix that in libpq anyway,
but what other client libraries might have the same issue?  Anyone know
whether JDBC would need a similar fix?
            regards, tom lane
			
		Andrew Dunstan wrote: > Bruce Momjian wrote: > > >I posted on that a few minutes ago. Yea, we can drop it, but we risk > >eating carraige returns as data values. I am not sure how consistently > >we output literal carriage returns in old dumps, nor how many apps > >produce on literal carriage returns in COPY. If we conditionally eat > >them, we run the risk of discarding some of their data without warning. > >Perhaps we can throw a warning rather than an error, and adjust initdb > >to be consistent. > > > > > > > > Would the best longterm solution be to require escaping CR in a data > value? (Yes I know this might cause backwards compatibility problems, at > least for a while). Yes, we do have that documented, but we supported it in earlier releases, so we can't be sure who is using it, and it probably exists in older dumps. The particular problem is not literal carriage returns, which we don't support any more, but carriage returns that happen to be at the end of the line, right up against \n. If we relax this, we will conditionally strip off the \r. In fact, we can do that now: create table xx(text); Right now, if all the column value end with \r, we will silently eat it. But if some have \r and some do not, we will throw an error. (Above, when I say \r, I mean literal carriage return, not the "\r" string, which we have always handled cleanly. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Perhaps we can throw a warning rather than an error, and adjust initdb > > to be consistent. > > I like the idea of reducing the newline consistency check to a warning. > There is one thing we'd have to watch for though (it's already an issue > but would become a bigger one): client-side COPY code had better be > prepared to absorb backend Notice messages while processing COPY IN. > Currently libpq doesn't read input data at all during a COPY IN loop, > which means that if the COPY generates more than a few K of warning > messages, the backend gets blocked on a full pipe and the whole > operation locks up. I have been meaning to fix that in libpq anyway, > but what other client libraries might have the same issue? Anyone know > whether JDBC would need a similar fix? Wow, that sounds big. The ERROR will only happen once, while the WARNING could happen a lot --- we could add code to throw the WARNING only once per COPY command --- that would probably make sense. I don't see how we could get all clients to handle this for 7.4, particularly clients from previous releases. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Peter Eisentraut wrote: > Tom Lane writes: > > > so it appears that cygwin's "echo" generates a different newline style > > than what got put into sql_features.txt. A possible way to fix this is > > to put the "\." line into sql_features.txt, but maybe there's a cleaner > > answer. Peter, any thoughts? > > There's no clean answer to this on Cygwin. This specific case is just a > little problem that we could solve locally, but in general you'll just end > up annoying people if you require them to use consistent line endings on > Cygwin. Here is a little diff to make initdb behave if we decide to keep the COPY check. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 Index: src/bin/initdb/initdb.sh =================================================================== RCS file: /cvsroot/pgsql-server/src/bin/initdb/initdb.sh,v retrieving revision 1.201 diff -c -c -r1.201 initdb.sh *** src/bin/initdb/initdb.sh 7 Sep 2003 03:36:03 -0000 1.201 --- src/bin/initdb/initdb.sh 26 Sep 2003 18:23:16 -0000 *************** *** 1087,1094 **** echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported,comments) FROM STDIN;" cat "$datadir"/sql_features.txt echo "\." ! ) \ ! | "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely echo "ok" $ECHO_N "vacuuming database template1... "$ECHO_C --- 1087,1095 ---- echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported,comments) FROM STDIN;" cat "$datadir"/sql_features.txt echo "\." ! ) | ! tr -d '\r' | # make newlines consistent for Win32 ! "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely echo "ok" $ECHO_N "vacuuming database template1... "$ECHO_C
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> ... we could add code to throw the WARNING
> only once per COPY command --- that would probably make sense.
Seems like a bit of a kluge, but perhaps the best compromise.  It would
be quite likely that you'd get the same warning on many lines of a COPY,
and that probably isn't really going to help people.
            regards, tom lane
			
		Tom Lane writes: > Yeah, I was wondering whether you wouldn't propose dropping the newline > consistency check. I'm not very comfortable with that, but maybe we > should. Bruce? I don't mind if we keep it on pure-POSIX platforms. But one of the nicer developments on Windows in recent(?) times is that you can actually use any kind of line separator and most programs still work correctly (with the notable exception of Notepad). The drawback of this is that you never really know what kind of line separator a program is actually going to write, especially when you start mixing Cygwin, MinGW, and native. So this newline consistency check is not only going to be pretty annoying on Windows, it's going to cancel a feature of the operating system environment. -- Peter Eisentraut peter_e@gmx.net
On Fri, 26 Sep 2003, Bruce Momjian wrote: > Tom Lane wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > > Tom Lane writes: > > >> so it appears that cygwin's "echo" generates a different newline style > > >> than what got put into sql_features.txt. A possible way to fix this is > > >> to put the "\." line into sql_features.txt, but maybe there's a cleaner > > >> answer. Peter, any thoughts? > > > > > There's no clean answer to this on Cygwin. This specific case is just a > > > little problem that we could solve locally, but in general you'll just end > > > up annoying people if you require them to use consistent line endings on > > > Cygwin. > > > > Yeah, I was wondering whether you wouldn't propose dropping the newline > > consistency check. I'm not very comfortable with that, but maybe we > > should. Bruce? > > I posted on that a few minutes ago. Yea, we can drop it, but we risk > eating carraige returns as data values. I am not sure how consistently > we output literal carriage returns in old dumps, nor how many apps > produce on literal carriage returns in COPY. If we conditionally eat > them, we run the risk of discarding some of their data without warning. > Perhaps we can throw a warning rather than an error, and adjust initdb > to be consistent. I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database to a 7.4beta3 database is producing some errors like this: ERROR: literal newline found in data HINT: Use "\n" to represent newline. CONTEXT: COPY FROM, line 59 ERROR: literal carriage return found in data HINT: Use "\r" to represent carriage return. CONTEXT: COPY FROM, line 41 These show up with little or no context, only the "line number" of the dump file. Since I'm wrapping these up in pg_dumpall, I don't have the dump file so I don't know where the error is really occuring. It would be nice to have such occurances echo the table / row they are getting the error on, or maybe just the first 20 or so characters, so they'd be easier to identify. Is this related to this issue?
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > ... we could add code to throw the WARNING > > only once per COPY command --- that would probably make sense. > > Seems like a bit of a kluge, but perhaps the best compromise. It would > be quite likely that you'd get the same warning on many lines of a COPY, > and that probably isn't really going to help people. I can't think of any cases where we throw _tons_ of warnings at people, like we could with COPY. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
> to a 7.4beta3 database is producing some errors like this:
> ERROR:  literal newline found in data
> HINT:  Use "\n" to represent newline.
> CONTEXT:  COPY FROM, line 59
> ERROR:  literal carriage return found in data
> HINT:  Use "\r" to represent carriage return.
> CONTEXT:  COPY FROM, line 41
Really?  7.2 should dump data \r or \n as the backslash versions ...
and does in my tests.  Can you make a reproducible test case?
> It would be nice to have such occurances echo the table / row they are
> getting the error on, or maybe just the first 20 or so characters, so
> they'd be easier to identify.
That's not a bad idea.  I think it would be fairly easy now for the
CONTEXT line of the error message to include the input data line:
CONTEXT:  COPY FROM, line 41: "data here ...."
at least up through the field where the error gets thrown, and with some
limit on the length of the data that will get echoed.  If people like
that idea I'll see about making it happen.
        regards, tom lane
			
		Tom Lane wrote: > > ERROR: literal carriage return found in data > > HINT: Use "\r" to represent carriage return. > > CONTEXT: COPY FROM, line 41 > > Really? 7.2 should dump data \r or \n as the backslash versions ... > and does in my tests. Can you make a reproducible test case? > > > > > It would be nice to have such occurances echo the table / row they are > > getting the error on, or maybe just the first 20 or so characters, so > > they'd be easier to identify. > > That's not a bad idea. I think it would be fairly easy now for the > CONTEXT line of the error message to include the input data line: > > CONTEXT: COPY FROM, line 41: "data here ...." > > at least up through the field where the error gets thrown, and with some > limit on the length of the data that will get echoed. If people like > that idea I'll see about making it happen. Also, he wanted table name too. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Yeah, I was wondering whether you wouldn't propose dropping the newline
>> consistency check.  I'm not very comfortable with that, but maybe we
>> should.  Bruce?
> I don't mind if we keep it on pure-POSIX platforms.  But one of the nicer
> developments on Windows in recent(?) times is that you can actually use
> any kind of line separator and most programs still work correctly (with
> the notable exception of Notepad).
Not sure if we should make the behavior Windows-specific though.  And
didn't Michael report seeing the same initdb failure on Debian?  That
confuses me a bit --- why would there be a newline discrepancy on Debian?
        regards, tom lane
			
		Tom Lane wrote: >>I don't mind if we keep it on pure-POSIX platforms. But one of the nicer >>developments on Windows in recent(?) times is that you can actually use >>any kind of line separator and most programs still work correctly (with >>the notable exception of Notepad). >> >> > >Not sure if we should make the behavior Windows-specific though. And >didn't Michael report seeing the same initdb failure on Debian? That >confuses me a bit --- why would there be a newline discrepancy on Debian? > > > Is this affected by the fact that the platform where a dump is made might have a different line-end discipline from the one where the dump is restored? cheers andrew
On Fri, 26 Sep 2003, Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > > I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database > > to a 7.4beta3 database is producing some errors like this: > > > ERROR: literal newline found in data > > HINT: Use "\n" to represent newline. > > CONTEXT: COPY FROM, line 59 > > > ERROR: literal carriage return found in data > > HINT: Use "\r" to represent carriage return. > > CONTEXT: COPY FROM, line 41 > > Really? 7.2 should dump data \r or \n as the backslash versions ... > and does in my tests. Can you make a reproducible test case? The attached file produces this problem. Note it's a blank trailing field that looks to be causing it. The error for this .sql file is: ERROR: literal carriage return found in data HINT: Use "\r" to represent carriage return. CONTEXT: COPY FROM, line 2 Note that loading this into pico and saving it back out fixes the problem. If I remove the preceding row that doesn't end in a blank field, I get a different error, this one: ERROR: end-of-copy marker does not match previous newline style CONTEXT: COPY FROM, line 2 > > It would be nice to have such occurances echo the table / row they are > > getting the error on, or maybe just the first 20 or so characters, so > > they'd be easier to identify. > > That's not a bad idea. I think it would be fairly easy now for the > CONTEXT line of the error message to include the input data line: > > CONTEXT: COPY FROM, line 41: "data here ...." > > at least up through the field where the error gets thrown, and with some > limit on the length of the data that will get echoed. If people like > that idea I'll see about making it happen. table name too, like Bruce said. The bothersome bit is that in pg_dump, it says the line, relative to just this part of the copy command, so you don't even know which table is giving the error.
scott.marlowe wrote: > The attached file produces this problem. Note it's a blank trailing field > that looks to be causing it. The error for this .sql file is: > > ERROR: literal carriage return found in data > HINT: Use "\r" to represent carriage return. > CONTEXT: COPY FROM, line 2 > > Note that loading this into pico and saving it back out fixes the problem. > > If I remove the preceding row that doesn't end in a blank field, I get a > different error, this one: > > ERROR: end-of-copy marker does not match previous newline style > CONTEXT: COPY FROM, line 2 OK, 'vi' shows it as:COPY people2 (id, persons) FROM stdin;59 Chance Terry--S60 ^M\. which is _exactly the case the error was supposed to catch. Now, the big question is where did this dump come from? Pg version? OS platform? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > scott.marlowe wrote: > > The attached file produces this problem. Note it's a blank trailing field > > that looks to be causing it. The error for this .sql file is: > > > > ERROR: literal carriage return found in data > > HINT: Use "\r" to represent carriage return. > > CONTEXT: COPY FROM, line 2 > > > > Note that loading this into pico and saving it back out fixes the problem. > > > > If I remove the preceding row that doesn't end in a blank field, I get a > > different error, this one: > > > > ERROR: end-of-copy marker does not match previous newline style > > CONTEXT: COPY FROM, line 2 > > OK, 'vi' shows it as: > > COPY people2 (id, persons) FROM stdin; > 59 Chance Terry--S > 60 ^M > \. The significant issue here is that we don't want to throw a warning in this case; it should be an error --- this is clearly a data value we don't want to discard as part of the end-of-line. The argument that you want a warning because you might have mixed newlines in the file seems less likely than this case where they are using a literal carriage return as a data value at the end of the line. Seems we should just fix initdb as my patch does and see if we get any more mixed-newline cases or literal carriage-return data cases. My guess is that we will get more of the latter. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Fri, 26 Sep 2003, Bruce Momjian wrote: > scott.marlowe wrote: > > The attached file produces this problem. Note it's a blank trailing field > > that looks to be causing it. The error for this .sql file is: > > > > ERROR: literal carriage return found in data > > HINT: Use "\r" to represent carriage return. > > CONTEXT: COPY FROM, line 2 > > > > Note that loading this into pico and saving it back out fixes the problem. > > > > If I remove the preceding row that doesn't end in a blank field, I get a > > different error, this one: > > > > ERROR: end-of-copy marker does not match previous newline style > > CONTEXT: COPY FROM, line 2 > > OK, 'vi' shows it as: > > COPY people2 (id, persons) FROM stdin; > 59 Chance Terry--S > 60 ^M > \. > > which is _exactly the case the error was supposed to catch. Now, the > big question is where did this dump come from? Pg version? OS platform? The originating system is a RedHat 7.2 box with postgresql 7.2.x running on it. The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 running on it. The data likely came out of a (gasp, horrors) windows box.
scott.marlowe wrote:
> > OK, 'vi' shows it as:
> >     
> >     COPY people2 (id, persons) FROM stdin;
> >     59      Chance Terry--S
> >     60      ^M
> >     \.
> > 
> > which is _exactly the case the error was supposed to catch.  Now, the
> > big question is where did this dump come from?  Pg version?  OS platform?
> 
> The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> on it.
> 
> The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> running on it.
> 
> The data likely came out of a (gasp, horrors) windows box.
OK, try this on your 7.2:
test=> create table test(x text);CREATE TABLEtest=> insert into test values ('\r');INSERT 17158 1test=> copy test to
'/tmp/out';COPY
Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.
--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 
			
		On Fri, Sep 26, 2003 at 03:05:58PM -0400, Tom Lane wrote: > Not sure if we should make the behavior Windows-specific though. And > didn't Michael report seeing the same initdb failure on Debian? That > confuses me a bit --- why would there be a newline discrepancy on Debian? I take it there are no special characters in that COPY, right? The only difference between the two machines we found was that my was running under UTF-8. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
On Fri, 26 Sep 2003, Bruce Momjian wrote:
> scott.marlowe wrote:
> > > OK, 'vi' shows it as:
> > >     
> > >     COPY people2 (id, persons) FROM stdin;
> > >     59      Chance Terry--S
> > >     60      ^M
> > >     \.
> > > 
> > > which is _exactly the case the error was supposed to catch.  Now, the
> > > big question is where did this dump come from?  Pg version?  OS platform?
> > 
> > The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> > on it.
> > 
> > The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> > running on it.
> > 
> > The data likely came out of a (gasp, horrors) windows box.
> 
> OK, try this on your 7.2:
> 
>     test=> create table test(x text);
>     CREATE TABLE
>     test=> insert into test values ('\r');
>     INSERT 17158 1
>     test=> copy test to '/tmp/out';
>     COPY
> 
> Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.
I'm not much of a vi guy, so the out file thing didn't tell me much, but 
if I try and dump it from the 7.4 beta3 box into a new table, I get:
ERROR:  end-of-copy marker does not match previous newline style
CONTEXT:  COPY FROM, line 2
			
		On Fri, 26 Sep 2003, Bruce Momjian wrote:
> scott.marlowe wrote:
> > > OK, 'vi' shows it as:
> > >     
> > >     COPY people2 (id, persons) FROM stdin;
> > >     59      Chance Terry--S
> > >     60      ^M
> > >     \.
> > > 
> > > which is _exactly the case the error was supposed to catch.  Now, the
> > > big question is where did this dump come from?  Pg version?  OS platform?
> > 
> > The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> > on it.
> > 
> > The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> > running on it.
> > 
> > The data likely came out of a (gasp, horrors) windows box.
> 
> OK, try this on your 7.2:
> 
>     test=> create table test(x text);
>     CREATE TABLE
>     test=> insert into test values ('\r');
>     INSERT 17158 1
>     test=> copy test to '/tmp/out';
>     COPY
> 
> Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.
Figured out iv (-b switch, man pages rock) and it's a ^M
			
		scott.marlowe writes: > table name too, like Bruce said. The bothersome bit is that in pg_dump, > it says the line, relative to just this part of the copy command, so you > don't even know which table is giving the error. I don't see the problem. Can't you identify the failing command by the line number that psql gives you? -- Peter Eisentraut peter_e@gmx.net
Bruce Momjian writes: > The argument that you want a warning because you might have mixed > newlines in the file seems less likely than this case where they are > using a literal carriage return as a data value at the end of the line. I don't agree with that assessment. Who actually has CRs in their data? -- Peter Eisentraut peter_e@gmx.net
On Fri, 26 Sep 2003, Peter Eisentraut wrote: > Bruce Momjian writes: > > > The argument that you want a warning because you might have mixed > > newlines in the file seems less likely than this case where they are > > using a literal carriage return as a data value at the end of the line. > > I don't agree with that assessment. Who actually has CRs in their data? I do. And so do lots of other people.
On Fri, 26 Sep 2003, Peter Eisentraut wrote: > scott.marlowe writes: > > > table name too, like Bruce said. The bothersome bit is that in pg_dump, > > it says the line, relative to just this part of the copy command, so you > > don't even know which table is giving the error. > > I don't see the problem. Can't you identify the failing command by the > line number that psql gives you? OK, here's the output from pg_dump while it's running ERROR: function "odbc_user" already exists with same argument types REVOKE REVOKE GRANT You are now connected as new user "ayousuff". ERROR: literal newline found in data HINT: Use "\n" to represent newline. CONTEXT: COPY FROM, line 59 You are now connected as new user "smarlowe". You are now connected as new user "ayousuff". CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX You are now connected as new user "smarlowe". CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX So how am I supposed to figure out which table and which row broke? Keep in mind, I'm doing this: pg_dump -h otherserver dbname|psql -h desthost dbname but I get basically the same thing if I dump it to a .sql file and do: psql dbname <dbname.sql
--On Friday, September 26, 2003 14:36:08 -0600 "scott.marlowe" <scott.marlowe@ihs.com> wrote: > On Fri, 26 Sep 2003, Peter Eisentraut wrote: > >> Bruce Momjian writes: >> >> > The argument that you want a warning because you might have mixed >> > newlines in the file seems less likely than this case where they are >> > using a literal carriage return as a data value at the end of the line. >> >> I don't agree with that assessment. Who actually has CRs in their data? > > I do. And so do lots of other people. I store E-Mail type comments in some of my DB's and they may contain cr's. Don't ASSUME that we don't do stuff like this with a database. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
scott.marlowe writes: > but I get basically the same thing if I dump it to a .sql file and do: > > psql dbname <dbname.sql Use psql -f dbname.sql instead. -- Peter Eisentraut peter_e@gmx.net
On Fri, 26 Sep 2003, Peter Eisentraut wrote: > scott.marlowe writes: > > > but I get basically the same thing if I dump it to a .sql file and do: > > > > psql dbname <dbname.sql > > Use psql -f dbname.sql instead. and the output is: psql:webport.sql:803: ERROR: function "odbc_user" already exists with same argument types REVOKE REVOKE GRANT You are now connected as new user "ayousuff". psql:webport.sql:869: ERROR: literal newline found in data HINT: Use "\n" to represent newline. CONTEXT: COPY FROM, line 59 You are now connected as new user "smarlowe". You are now connected as new user "ayousuff". CREATE INDEX CREATE INDEX CREATE INDEX the same. It doesn't tell me which table in my dump caused the problem, and it certainly isn't line 59 of the dump file, but of the table producing the error.
Peter Eisentraut <peter_e@gmx.net> writes:
> scott.marlowe writes:
>> but I get basically the same thing if I dump it to a .sql file and do:
>> psql dbname <dbname.sql
> Use psql -f dbname.sql instead.
This doesn't seem like a good argument not to add more information to
the CONTEXT line for COPY errors.  Sure, in theory the existing info
should be sufficient, but what if the information is not coming in
through psql?  (For instance, maybe the COPY data is being generated
on-the-fly by some other program.)  Or what if the dump file is so large
you can't easily edit it to determine which line number is in question?
There are plenty of scenarios where it's not all that convenient to
triangulate on a problem from outside information.  Minimalism isn't
really a virtue in error reports anyway.
I'm thinking maybe:
CONTEXT:  COPY tablename, line 41: "data ..."
would serve the purpose nicely.
        regards, tom lane
			
		On Fri, 26 Sep 2003, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > scott.marlowe writes: > >> but I get basically the same thing if I dump it to a .sql file and do: > >> psql dbname <dbname.sql > > > Use psql -f dbname.sql instead. > > This doesn't seem like a good argument not to add more information to > the CONTEXT line for COPY errors. Sure, in theory the existing info > should be sufficient, but what if the information is not coming in > through psql? (For instance, maybe the COPY data is being generated > on-the-fly by some other program.) Or what if the dump file is so large > you can't easily edit it to determine which line number is in question? > There are plenty of scenarios where it's not all that convenient to > triangulate on a problem from outside information. Minimalism isn't > really a virtue in error reports anyway. > > I'm thinking maybe: > > CONTEXT: COPY tablename, line 41: "data ..." > > would serve the purpose nicely. Yeah, just having the table name and line number would be plenty for me. It's the lack of a table name that makes it so frustrating. I had to basically dump / restore the tables one at a time to figure out which one was causing the error. On a database with hundreds of tables, that could be painful.
Peter Eisentraut <peter_e@gmx.net> writes:
> Bruce Momjian writes:
>> The argument that you want a warning because you might have mixed
>> newlines in the file seems less likely than this case where they are
>> using a literal carriage return as a data value at the end of the line.
> I don't agree with that assessment.  Who actually has CRs in their data?
The question isn't so much "who has CRs in their data" as "who is trying
to import data files in which CRs aren't correctly represented as \r" ?
Not anyone upgrading from a recent PG release ... though 7.1 or before
would have an issue.
If we eliminate the mixed-newline check then we will silently discard
CRs that appear at the tail end of the last column of a data line.  This
is somewhat worrisome because the failure will not be obvious.  But in
most cases (CR in the middle of a data field, or in a field that isn't
last), we will detect and report an error due to field count mismatch.
I think that that might be good enough.
If we were designing COPY from scratch today, we'd certainly not think
that a mixed-newline check was appropriate; it's only because of our
historical handling of literal CRs that the question comes up at all.
        regards, tom lane
			
		scott.marlowe writes: > psql:webport.sql:803: ERROR: function "odbc_user" already exists with > same argument types > REVOKE > REVOKE > GRANT > You are now connected as new user "ayousuff". > psql:webport.sql:869: ERROR: literal newline found in data > HINT: Use "\n" to represent newline. > CONTEXT: COPY FROM, line 59 > You are now connected as new user "smarlowe". > You are now connected as new user "ayousuff". > CREATE INDEX > CREATE INDEX > CREATE INDEX > > > the same. It doesn't tell me which table in my dump caused the problem, > and it certainly isn't line 59 of the dump file, but of the table > producing the error. The table is the one the COPY command on line 869 in file webport.sql mentions and the problem data is on line 928 of that file. -- Peter Eisentraut peter_e@gmx.net
I said:
> The question isn't so much "who has CRs in their data" as "who is trying
> to import data files in which CRs aren't correctly represented as \r" ?
> Not anyone upgrading from a recent PG release ... though 7.1 or before
> would have an issue.
Actually, checking the CVS logs shows that 7.2.1 was the first release
that would emit a data CR or LF as \r or \n rather than literally.  If
Scott is dumping from an original 7.2 installation rather than a dot-
release, that would explain his report.
I talked to Bruce about this on the phone, and we now both feel that
that change was too recent to assume that people won't be trying to load
dumps containing bare CRs into 7.4.  Accordingly, it seems that we do
need to throw an error or at least a warning for mixed newlines.
Eventually we can get rid of that behavior and just treat all newline
sequences alike, but I think we can't do it until 7.2.0 is in the
"ancient history" category.
        regards, tom lane
			
		Tom Lane writes: > This doesn't seem like a good argument not to add more information to > the CONTEXT line for COPY errors. Sure, in theory the existing info > should be sufficient, but what if the information is not coming in > through psql? (For instance, maybe the COPY data is being generated > on-the-fly by some other program.) Then you look into the code of that other program. Or you look into the server log, where you should log the statements you generate if you are testing your code. > Or what if the dump file is so large you can't easily edit it to > determine which line number is in question? The line number is already contained in the available information about the error. If the file is too large to load into an editor, you could use perl -npi -e '$. == 123456 and s/\r/\\r/;' or something along those lines. > There are plenty of scenarios where it's not all that convenient to > triangulate on a problem from outside information. Maybe, but the ones I've seen mentioned so far are not among them. I'm not opposed to making errors more easy to identify, but considering that someone else in this thread didn't even know about psql's option to print line numbers of errors, I think some people haven't done their homework yet. > Minimalism isn't really a virtue in error reports anyway. > > I'm thinking maybe: > > CONTEXT: COPY tablename, line 41: "data ..." > > would serve the purpose nicely. The only thing that would really help in the general case is the number of the character where the error occurs. If you print the actual data you lose if the data is repeated within the, er, data and/or if the section of the data that you print contains crazy characters that mess up the display. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > scott.marlowe writes: > > > table name too, like Bruce said. The bothersome bit is that in pg_dump, > > it says the line, relative to just this part of the copy command, so you > > don't even know which table is giving the error. > > I don't see the problem. Can't you identify the failing command by the > line number that psql gives you? If you are loading from pg_dump, you have lots of copy commands, so how do you know which COPY command caused the failure. You just have the line number of _a_ copy. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut wrote: > Bruce Momjian writes: > > > The argument that you want a warning because you might have mixed > > newlines in the file seems less likely than this case where they are > > using a literal carriage return as a data value at the end of the line. > > I don't agree with that assessment. Who actually has CRs in their data? Uh, people storing Win32 text documents. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut <peter_e@gmx.net> writes:
>> Minimalism isn't really a virtue in error reports anyway.
>> I'm thinking maybe:
>> CONTEXT:  COPY tablename, line 41: "data ..."
>> would serve the purpose nicely.
> The only thing that would really help in the general case is the number of
> the character where the error occurs.  If you print the actual data you
> lose if the data is repeated within the, er, data and/or if the section of
> the data that you print contains crazy characters that mess up the
> display.
A messed-up display would be useful information in itself; and it's not
like we have no precedent for repeating the erroneous data in error
messages.  Consider pg_atoi:
regression=# create table foo (f1 int);
CREATE TABLE
regression=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> z44
>> \.
ERROR:  invalid input syntax for integer: "z44"
CONTEXT:  COPY FROM, line 1
regression=#
I was chewing this over with Bruce on the phone just now, and we refined
the idea a little.  Some errors (primarily those detected inside the
datatype input procedures) can be clearly traced to a specific column,
whereas others (such as too many fields on an input line) really can't
be nailed down more tightly than a line.  So we were thinking about two
different flavors of context info:
CONTEXT:  COPY tablename, line n, field colname: "column data"
versus
CONTEXT:  COPY tablename, line n: "line data"
where in each case the data display would be truncated at 100 characters
or so (and would have to be omitted in the COPY BINARY case anyway).
If you're really concerned about funny characters messing up the report,
we could imagine replacing them by backslash sequences or some such, but
I suspect that would create more confusion than it would solve.
        regards, tom lane
			
		Peter Eisentraut wrote: > Tom Lane writes: > > > so it appears that cygwin's "echo" generates a different newline style > > than what got put into sql_features.txt. A possible way to fix this is > > to put the "\." line into sql_features.txt, but maybe there's a cleaner > > answer. Peter, any thoughts? > > There's no clean answer to this on Cygwin. This specific case is just a > little problem that we could solve locally, but in general you'll just end > up annoying people if you require them to use consistent line endings on > Cygwin. At this point, we have one mixed-EOL case (initdb, fixable), and one \r in data case (from 7.2). Seems the safest solution is to fix initdb and see what other failure reports we get. If we downgrade it to a warning, we will not know about the failures; someone could miss a warning in a dump, but probably will not mess an error and an empty table. If we get more reports like initdb, we can reevaluate, but it seems the safest course is to keep our existing code, which only removes carriage returns when it quite sure. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Fri, 26 Sep 2003, Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Perhaps we can throw a warning rather than an error, and adjust initdb > > to be consistent. > > I like the idea of reducing the newline consistency check to a warning. > There is one thing we'd have to watch for though (it's already an issue > but would become a bigger one): client-side COPY code had better be > prepared to absorb backend Notice messages while processing COPY IN. > Currently libpq doesn't read input data at all during a COPY IN loop, > which means that if the COPY generates more than a few K of warning > messages, the backend gets blocked on a full pipe and the whole > operation locks up. I have been meaning to fix that in libpq anyway, > but what other client libraries might have the same issue? Anyone know > whether JDBC would need a similar fix? > JDBC does not support the COPY protocol at the moment. Kris Jurka
On Fri, 26 Sep 2003, Tom Lane wrote: > I was chewing this over with Bruce on the phone just now, and we refined > the idea a little. Some errors (primarily those detected inside the > datatype input procedures) can be clearly traced to a specific column, > whereas others (such as too many fields on an input line) really can't > be nailed down more tightly than a line. So we were thinking about two > different flavors of context info: > > CONTEXT: COPY tablename, line n, field colname: "column data" > > versus > > CONTEXT: COPY tablename, line n: "line data" Those would be very helpful bits of information. > where in each case the data display would be truncated at 100 characters > or so (and would have to be omitted in the COPY BINARY case anyway). > > If you're really concerned about funny characters messing up the report, > we could imagine replacing them by backslash sequences or some such, but > I suspect that would create more confusion than it would solve. I hate to mention it, but would it be useful/non-overkill to make either of those things (context message maximum length and funny character escaping) configurable somehow? Jon
Jon Jensen wrote: > On Fri, 26 Sep 2003, Tom Lane wrote: > > > I was chewing this over with Bruce on the phone just now, and we refined > > the idea a little. Some errors (primarily those detected inside the > > datatype input procedures) can be clearly traced to a specific column, > > whereas others (such as too many fields on an input line) really can't > > be nailed down more tightly than a line. So we were thinking about two > > different flavors of context info: > > > > CONTEXT: COPY tablename, line n, field colname: "column data" > > > > versus > > > > CONTEXT: COPY tablename, line n: "line data" > > Those would be very helpful bits of information. > > > where in each case the data display would be truncated at 100 characters > > or so (and would have to be omitted in the COPY BINARY case anyway). > > > > If you're really concerned about funny characters messing up the report, > > we could imagine replacing them by backslash sequences or some such, but > > I suspect that would create more confusion than it would solve. > > I hate to mention it, but would it be useful/non-overkill to make either > of those things (context message maximum length and funny character > escaping) configurable somehow? Overkill. Sorry. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Fri, 2003-09-26 at 23:18, Bruce Momjian wrote: > If you are loading from pg_dump, you have lots of copy commands, so how > do you know which COPY command caused the failure. You just have the > line number of _a_ copy. I would recommend using psql -e so that the sql commands are output too. Here is a documentation patch: Index: backup.sgml =================================================================== RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v retrieving revision 2.30 diff -c -r2.30 backup.sgml *** backup.sgml 31 Aug 2003 17:32:18 -0000 2.30 --- backup.sgml 27 Sep 2003 05:45:04 -0000 *************** *** 100,106 **** be read in by the <application>psql</application> program. The general command form to restore adump is <synopsis> ! psql <replaceable class="parameter">dbname</replaceable> < <replaceable class="parameter">infile</replaceable> </synopsis> where <replaceable class="parameter">infile</replaceable> is what you used as <replaceable class="parameter">outfile</replaceable> --- 100,106 ---- be read in by the <application>psql</application> program. The general command form to restore adump is <synopsis> ! psql -f <replaceable class="parameter">infile</replaceable> -d <replaceable class="parameter">dbname</replaceable> </synopsis> where <replaceable class="parameter">infile</replaceable> is what you used as <replaceable class="parameter">outfile</replaceable> *************** *** 112,117 **** --- 112,124 ---- <application>psql</> supports similar options to <application>pg_dump</> for controlling the databaseserver location and the user name. See its reference page for more information. + </para> + + <para> + With a large dump, it may be difficult to identify where any errors are + occurring. You may use the -e option to psql to print the SQL commands + as they are run, so that it is easy to see precisely which commands are + causing errors. </para> <para> -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C ======================================== "My brethren, count it all joy when ye fall into various trials, Knowing thatthe testing of your faith produces endurance." James 1:2,3
Jon Jensen writes: > I hate to mention it, but would it be useful/non-overkill to make either > of those things (context message maximum length and funny character > escaping) configurable somehow? I'm thinking the same. We have conflicting needs of different groups: Users on "real" Unix platforms converting old data need the checks, users on Windows platforms with new data don't want them. (I wonder what people on Mac OS that mix Unix-style and native applications would want.) -- Peter Eisentraut peter_e@gmx.net
Oliver Elphick writes: > + <para> > + With a large dump, it may be difficult to identify where any errors are > + occurring. You may use the -e option to psql to print the SQL commands > + as they are run, so that it is easy to see precisely which commands are > + causing errors. > </para> That is just not true. If you use -f, it will tell you the line number of the command causing the error. Add the line number of the COPY error message, there you have it. -- Peter Eisentraut peter_e@gmx.net
Bruce Momjian wrote: > Peter Eisentraut wrote: > > Tom Lane writes: > > > > > so it appears that cygwin's "echo" generates a different newline style > > > than what got put into sql_features.txt. A possible way to fix this is > > > to put the "\." line into sql_features.txt, but maybe there's a cleaner > > > answer. Peter, any thoughts? > > > > There's no clean answer to this on Cygwin. This specific case is just a > > little problem that we could solve locally, but in general you'll just end > > up annoying people if you require them to use consistent line endings on > > Cygwin. > > At this point, we have one mixed-EOL case (initdb, fixable), and one \r > in data case (from 7.2). Seems the safest solution is to fix initdb and > see what other failure reports we get. > > If we downgrade it to a warning, we will not know about the failures; > someone could miss a warning in a dump, but probably will not mess an > error and an empty table. If we get more reports like initdb, we can > reevaluate, but it seems the safest course is to keep our existing code, > which only removes carriage returns when it quite sure. OK, I have applied a patch to make initdb create a COPY file with consistent end-of-line termination, so it passes the COPY EOL checking. If we find more cases where load files have intended inconsistent EOL termination, we can revisit the COPY error for inconsistent EOL termination. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut wrote: > Oliver Elphick writes: > > > + <para> > > + With a large dump, it may be difficult to identify where any errors are > > + occurring. You may use the -e option to psql to print the SQL commands > > + as they are run, so that it is easy to see precisely which commands are > > + causing errors. > > </para> > > That is just not true. If you use -f, it will tell you the line number of > the command causing the error. Add the line number of the COPY error > message, there you have it. You are assuming it is easy to find what is on a specific line of the dump file. I am not sure that is always easy for people with limited Unix skills, or MSWin folks. I am not sure I would have thought to add the file offset to find the problem COPY line. I guess I would have eventually, but it wouldn't have been my first idea, and I might _not_ have used -f on the load, and if the load took an hour, I would have to run it again to get that line number. I can see the point that the table name is only really valuable when you are loading a dump, and not valuabvle when you are just doing a copy. However, copy is used enought in dumps that the exta word (the table name) doesn't see to hurt. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian writes: > You are assuming it is easy to find what is on a specific line of the > dump file. I am not sure that is always easy for people with limited > Unix skills, or MSWin folks. I am not sure I would have thought to add > the file offset to find the problem COPY line. I guess I would have > eventually, but it wouldn't have been my first idea, and I might _not_ > have used -f on the load, and if the load took an hour, I would have to > run it again to get that line number. That is all besides the point. If adding -f to the command line is for some reason prohibitive, then the same applies to -e. That is all. -- Peter Eisentraut peter_e@gmx.net
Bruce Momjian writes: > OK, I have applied a patch to make initdb create a COPY file with > consistent end-of-line termination, so it passes the COPY EOL checking. Maybe instead we can change the COPY command to read the file directly and not via stdin? Then we don't need the \. marker. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Bruce Momjian writes: > > > You are assuming it is easy to find what is on a specific line of the > > dump file. I am not sure that is always easy for people with limited > > Unix skills, or MSWin folks. I am not sure I would have thought to add > > the file offset to find the problem COPY line. I guess I would have > > eventually, but it wouldn't have been my first idea, and I might _not_ > > have used -f on the load, and if the load took an hour, I would have to > > run it again to get that line number. > > That is all besides the point. If adding -f to the command line is for > some reason prohibitive, then the same applies to -e. That is all. I see, both -e give query before error, -f gives line number before error. I suppose the -e is clearer because you don't have to find the line in the file, but the -e output makes it more likely they would miss an error line in the output. Seems we should recommend -f rather than "<" for restores anyway, right? Reporting the table with the error is clearer, but this brings up another case --- what happens with pg_dumpall? Do we print the database name or will they know the database name from the table name? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Peter Eisentraut wrote: > Bruce Momjian writes: > > > OK, I have applied a patch to make initdb create a COPY file with > > consistent end-of-line termination, so it passes the COPY EOL checking. > > Maybe instead we can change the COPY command to read the file directly and > not via stdin? Then we don't need the \. marker. Yes, we need a temp table then. The single-line fix seemed easier. Also, the file is a mix of things from a file and echo's, it isn't just the \.: echo "UPDATE information_schema.sql_implementation_info SET character_value = '$combined_version' WHERE implementation_info_name= 'DBMS VERSION';" echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id,sub_feature_name, is_supported, comments) FROM STDIN;" cat "$datadir"/sql_features.txt echo "\.") |tr -d'\r' | # make newlines consistent for Win32 -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Sat, Sep 27, 2003 at 01:08:37PM -0400, Bruce Momjian wrote: > Peter Eisentraut wrote: > > That is all besides the point. If adding -f to the command line is for > > some reason prohibitive, then the same applies to -e. That is all. > > Seems we should recommend -f rather than "<" for restores anyway, right? Please keep in mind that it is not always possible to use -f. In my case I had a file larger than 2GB and psql was compiled without large file support. Detecting an error in that situation would have required recompiling psql. Also I don't see the point in not extending the context message of the error. It's not like it's going to take too much processing power, nor screen estate, so what is it? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Right now the sectors on the hard disk run clockwise, but I heard a rumor that you can squeeze 0.2% more throughput by running them counterclockwise. It's worth the effort. Recommended." (Gerry Pourwelle)
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Peter Eisentraut wrote:
>> Maybe instead we can change the COPY command to read the file directly and
>> not via stdin?  Then we don't need the \. marker.
I like that idea too.
> Yes, we need a temp table then.  The single-line fix seemed easier.
Why a temp table?  The COPY command is only sourcing the contents of
"$datadir"/sql_features.txt.
        regards, tom lane
			
		Peter Eisentraut <peter_e@gmx.net> writes:
> Jon Jensen writes:
>> I hate to mention it, but would it be useful/non-overkill to make either
>> of those things (context message maximum length and funny character
>> escaping) configurable somehow?
> I'm thinking the same.  We have conflicting needs of different groups:
> Users on "real" Unix platforms converting old data need the checks, users
> on Windows platforms with new data don't want them.  (I wonder what people
> on Mac OS that mix Unix-style and native applications would want.)
I wouldn't object to adding a GUC variable that turns off the
mixed-newlines error (as long as it defaults to enabled ;-)).
Our intention to remove the check in the future might then reduce
to simply flipping the factory default for this variable.
        regards, tom lane
			
		Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Peter Eisentraut wrote: > >> Maybe instead we can change the COPY command to read the file directly and > >> not via stdin? Then we don't need the \. marker. > > I like that idea too. > > > Yes, we need a temp table then. The single-line fix seemed easier. > > Why a temp table? The COPY command is only sourcing the contents of > "$datadir"/sql_features.txt. Sorry, I meant temp file. I have SQL on the brain. :-) -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
pgman@candle.pha.pa.us (Bruce Momjian) writes: > Peter Eisentraut wrote: >> Bruce Momjian writes: >> >> > The argument that you want a warning because you might have mixed >> > newlines in the file seems less likely than this case where they are >> > using a literal carriage return as a data value at the end of the line. >> >> I don't agree with that assessment. Who actually has CRs in their data? > > Uh, people storing Win32 text documents. Or people storing XML data that may contain all sorts of whitespace that may include various inter-line "metadata." -- let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];; <http://dev6.int.libertyrms.com/> Christopher Browne (416) 646 3304 x124 (land)