Обсуждение: Incorrect "invalid AM/PM string" error from to_timestamp
I'm running this on 8.4devel built from CVS HEAD as of 9:44 AM Moutain Daylight time today, on a 32-bit Ubuntu 7.04 box. This is a completely new database. jtolley@polonium:~/devel/raybo$ psql raybo -f test.sql CREATE TABLE INSERT 0 1 INSERT 0 1 psql:test.sql:21: ERROR: invalid AM/PM string psql:test.sql:22: ERROR: invalid AM/PM string The script in question is below: CREATE TABLE ft ( id1 numeric(13,0), am1 numeric(13,2), am1 numeric(13,2), ct timestamp without time zone ); -- -- The commands below come from a dataset I was loading. While trying to reduce the problem to a small test case, I found that various combinations of -- -- the available columns showed the error, and others did not. The commands below are one set that don't display the error. -- INSERT INTO ft (am1, am2, ct) VALUES (45749.25, 45749.25, to_timestamp('21-SEP-08 03.26.04.378954 AM', 'DD-MON-YY HH.MI.SS.US AM')); -- INSERT INTO ft (am1, am2, ct) VALUES (60, 60, to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM')); -- INSERT INTO ft (am1, am2, ct) VALUES (46.17, 46.17, to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM')); -- INSERT INTO ft (am1, am2, ct) VALUES (61.55, 61.55, to_timestamp('21-SEP-08 03.26.04.350738 AM', 'DD-MON-YY HH.MI.SS.US AM')); -- -- This set of commands shows the error. Note that running these statements in a different order, or inserting other commands into the statements, or other -- -- such modifications may very well prevent the error from occurring. INSERT INTO ft (id1, am1, ct) VALUES (6924257, 45749.25, to_timestamp('21-SEP-08 03.26.04.378954 AM', 'DD-MON-YY HH.MI.SS.US AM')); INSERT INTO ft (id1, am1, ct) VALUES (6924611, 60, to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM')); INSERT INTO ft (id1, am1, ct) VALUES (6924612, 46.17, to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM')); INSERT INTO ft (id1, am1, ct) VALUES (6924254, 61.55, to_timestamp('21-SEP-08 03.26.04.350738 AM', 'DD-MON-YY HH.MI.SS.US AM')); -- Josh Tolley / eggyknap
"Joshua Tolley" <eggyknap@gmail.com> writes: > I'm running this on 8.4devel built from CVS HEAD as of 9:44 AM Moutain > Daylight time today, on a 32-bit Ubuntu 7.04 box. This is a completely > new database. Ugh, apparently there's some state-dependent bug in the recent to_timestamp fixes: regression=# \c - psql (8.4devel) You are now connected to database "regression". regression=# select to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM'); to_timestamp ------------------------------ 2008-09-21 08:15:42.95062-04 (1 row) regression=# select to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM'); ERROR: invalid AM/PM string regression=# Yes, those are the same command. The error is repeatable on subsequent executions in the same session. A likely bet is that this is caused by use of uninitialized memory, which happens to have garbage rather than zeroes in it the second time through. Brendan, do you have time to look into this? BTW, the error message is pretty unhelpful: ISTM it really ought to show the value it's complaining of. regards, tom lane
On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > A likely bet is that this is caused by use of uninitialized memory, > which happens to have garbage rather than zeroes in it the second > time through. Yep both DCH_MC and DCH_US were going past the end of the string because they still added the length of the string where from_char_parse_int_len takes care of that for us now... The attach patch fixes it and tries to improve the "invalid AM/PM string" a bit by showing the string.
On Thu, Sep 25, 2008 at 4:05 PM, Alex Hunsaker <badalex@gmail.com> wrote: > On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A likely bet is that this is caused by use of uninitialized memory, >> which happens to have garbage rather than zeroes in it the second >> time through. > > Yep both DCH_MC and DCH_US were going past the end of the string > because they still added the length of the string where > from_char_parse_int_len takes care of that for us now... > > The attach patch fixes it and tries to improve the "invalid AM/PM > string" a bit by showing the string. [Actually attaches the patch...]
Вложения
On Fri, Sep 26, 2008 at 8:05 AM, Alex Hunsaker <badalex@gmail.com> wrote: > On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A likely bet is that this is caused by use of uninitialized memory, >> which happens to have garbage rather than zeroes in it the second >> time through. > > Yep both DCH_MC and DCH_US were going past the end of the string > because they still added the length of the string where > from_char_parse_int_len takes care of that for us now... > Nice catch. I think your patch does the right thing here by pulling "len" out of those lines which advance the pointer after calling from_char_parse_int_len. > The attach patch fixes it and tries to improve the "invalid AM/PM > string" a bit by showing the string. > Not so sure about this part ... because it just spits out the variable "s", it will show whatever is left in the string at the time the macro is called. That works okay when the AM/PM string is at the end of the pattern, but ends up looking pretty weird otherwise: postgres=# select to_timestamp('11:47 Pm 26 Sep 2008', 'HH:MI AM DD Mon YYYY'); ERROR: invalid AM/PM string for 'Pm 26 Sep 2008' I have some thoughts on this and other issues surrounding AM/PM, but perhaps they are better reserved for a separate thread. Might I suggest we apply Alex's bugfix and hold off on the message changes pending further discussion? Cheers, BJ
"Brendan Jurd" <direvus@gmail.com> writes: > I have some thoughts on this and other issues surrounding AM/PM, but > perhaps they are better reserved for a separate thread. Might I > suggest we apply Alex's bugfix and hold off on the message changes > pending further discussion? Agreed on separating the message issue. What I wanted to know was whether there are similar bugs elsewhere in to_timestamp, or whether you're pretty sure this is the only occurrence of the coding pattern? I've always found it helpful to think "where else did we make this same mistake" after identifying a bug. regards, tom lane
On Fri, Sep 26, 2008 at 8:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Brendan Jurd" <direvus@gmail.com> writes: >> I have some thoughts on this and other issues surrounding AM/PM, but >> perhaps they are better reserved for a separate thread. Might I >> suggest we apply Alex's bugfix and hold off on the message changes >> pending further discussion? > > Agreed on separating the message issue. What I wanted to know was > whether there are similar bugs elsewhere in to_timestamp, or whether > you're pretty sure this is the only occurrence of the coding pattern? > I've always found it helpful to think "where else did we make this > same mistake" after identifying a bug. > > regards, tom lane > I guess we know this already, but for the sake of trying to appear involved, I've confirmed this fixes the problem as I was seeing it. +1 for the message being a bit more specific, though it's a definite improvement as is. - Josh PS> My apologies for sending this directly to Tom instead of to the list the first time I sent it.
On Sat, Sep 27, 2008 at 12:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Agreed on separating the message issue. What I wanted to know was > whether there are similar bugs elsewhere in to_timestamp, or whether > you're pretty sure this is the only occurrence of the coding pattern? I'm pretty sure that "MS" and "US" were the only occurrences of this particular coding pattern, and Alex's fix hits both of those. I've just reviewed all of the sites in DCH_from_char() which advance the pointer (with Alex's patch applied). All have the standard "s += SKIP_THth(n->suffix)", with the single exception of "Y,YYY", which needs its own special logic. Cheers, BJ
"Brendan Jurd" <direvus@gmail.com> writes: > On Sat, Sep 27, 2008 at 12:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Agreed on separating the message issue. What I wanted to know was >> whether there are similar bugs elsewhere in to_timestamp, or whether >> you're pretty sure this is the only occurrence of the coding pattern? > I'm pretty sure that "MS" and "US" were the only occurrences of this > particular coding pattern, and Alex's fix hits both of those. OK, applied (without the message change, pending further discussion). regards, tom lane