Обсуждение: A creepy story about dates. How to prevent it?
Hi.
I will explain my experience to prevent other to fall in the same problem.
We  upgraded from 7.2 to 7.3 few days ago. We reload the data via pg_dump an
psql -i <file_dumped>.
The pg_dump was made with PGDATESTYLE=SQL, European (dd/mm/yyyy).
When we load the database we didn't configure te pgdestyle yet, then it was
the default (mm/dd/yy). We populate the data successfully.
Some days later we catch on that some dates are correct and some not. Some
dates appears with day and month exchanged. In this case Postgres
flexibility  turn  against us.
The load proces did this:
    INSERT INTO taula VALUES('12/01/2003,..
    INSERT INTO taula VALUES('13/01/2003,..
    SELECT fecha FROM taula:
    fecha
    --------
    12/01/03
    01/13/03 !!!!!
I suggest that maybe the pg_dump has to store the datestyle in order to
prevent this problems, no?
Regards,
Conxita Marín
			
		=?iso-8859-1?Q?Conxita_Mar=EDn?= <comarin@telefonica.net> writes:
> I suggest that maybe the pg_dump has to store the datestyle in order to
> prevent this problems, no?
In 7.3.3 and later, pg_dump forces ISO datestyle while dumping, to
forestall exactly this problem.  Sorry that you got bit.
When you're doing an update, it's often a good idea to use the newer
version's pg_dump to extract the data from the older database.  We
keep pg_dump compatible with back versions (presently it handles
anything back to 7.0) so that you can take advantage of bugfixes in
newer pg_dumps.  Like this one ...
            regards, tom lane
			
		On Wed, 18 Jun 2003, Tom Lane wrote: > =?iso-8859-1?Q?Conxita_Mar=EDn?= <comarin@telefonica.net> writes: > > I suggest that maybe the pg_dump has to store the datestyle in order to > > prevent this problems, no? > > In 7.3.3 and later, pg_dump forces ISO datestyle while dumping, to > forestall exactly this problem. Sorry that you got bit. > > When you're doing an update, it's often a good idea to use the newer > version's pg_dump to extract the data from the older database. We > keep pg_dump compatible with back versions (presently it handles > anything back to 7.0) so that you can take advantage of bugfixes in > newer pg_dumps. Like this one ... That reminds me, did we get the date parsing fixed so that you can't insert 22/03/2003 into a european database (or conversely, 03/22/2003 into a US database) ? I.e the problem where the date parser assumed you meant it the other way...
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> That reminds me, did we get the date parsing fixed so that you can't
> insert 22/03/2003 into a european database (or conversely, 03/22/2003 into
> a US database) ?  I.e the problem where the date parser assumed you meant
> it the other way...
IIRC, there was no consensus that that's a bug.
            regards, tom lane
			
		On Wed, 18 Jun 2003, Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > > That reminds me, did we get the date parsing fixed so that you can't > > insert 22/03/2003 into a european database (or conversely, 03/22/2003 into > > a US database) ? I.e the problem where the date parser assumed you meant > > it the other way... > > IIRC, there was no consensus that that's a bug. I thought there was, and someone had said they were gonna fix it. IMHO it is a bug. We don't let postgresql "guess" about a lot of more obvious things (i.e. int4 to int8 casting, etc...) and letting it guess about dates makes it non-ACID compliant. If it isn't a bug, how do I implement a check constraint to stop it from happening? I'd like to know my database accepts properly formatted input and rejects the rest. That's what the C in ACID means, right?
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> IMHO it is a bug.  We don't let postgresql "guess" about a lot of more
> obvious things (i.e. int4 to int8 casting, etc...) and letting it guess
> about dates makes it non-ACID compliant.
How do you arrive at that conclusion?
> If it isn't a bug, how do I implement a check constraint to stop it from
> happening?  I'd like to know my database accepts properly formatted input
> and rejects the rest.  That's what the C in ACID means, right?
Do the checking in your application.  Something you think is improperly
formatted probably shouldn't get to the database in the first place.
If you aren't doing any format checking at all, you're possibly
vulnerable to SQL injection attacks.
I do now seem to recall an agreement that a GUC switch to disable
date-interpretation guessing would be okay, though.
            regards, tom lane
			
		On Wed, 18 Jun 2003, Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > > IMHO it is a bug. We don't let postgresql "guess" about a lot of more > > obvious things (i.e. int4 to int8 casting, etc...) and letting it guess > > about dates makes it non-ACID compliant. > > How do you arrive at that conclusion? The same way I come to all my conclusions, logic. :-) but seriously... Why not accept a date of 04/44/2003 and just wrap it into May? It's the same kind of thing. I told my database where I live, and expect it to only accept dates that are valid in my locale. If a user feeds it a date that isn't right, I expect the database to error out. > > If it isn't a bug, how do I implement a check constraint to stop it from > > happening? I'd like to know my database accepts properly formatted input > > and rejects the rest. That's what the C in ACID means, right? > > Do the checking in your application. I do. I make sure it's ##/##/#### (i.e. a simple regex works) The database already does the rest of the checking for me, it just happens to think it might be helpful to coerce some bad dates for me, but others that are obviously wrong are tossed out. Here's a scenario for how we can wind up teaching a user to enter dates the wrong way. The day is 22 feb. They enter this date, in the US, where mm/dd/yyyy is standard: 22/02/2003 The database converts it to 02/22/2003 silently. Next day, they enter 23/02/2003 Again, it takes it silently. So on and so forth. On the first day of march they put in: 01/03/2003 which the database takes as January 03, and happily puts it in. with a couple of weeks of "training" the user now believes they are putting the date right, but it is wrong. No error. Next day, we get 02/03/2003. The database puts in Feb 03. Again, the user doesn't know. We continue the rest of the year this way. Somewhere along the line, the user notices all their reports have the wrong date. Which ones were for feb 03 and which ones were for march 02? We don't know. Why bother checking for wrong dates sometimes (i.e. 03/33/2003) but not all the time (i.e. 13/03/2003)? > Something you think is improperly > formatted probably shouldn't get to the database in the first place. Agreed. But that's not the point. It is properly formatted, i.e. mm/dd/yyyy, it's just out of range. That's not the same at all. > If you aren't doing any format checking at all, you're possibly > vulnerable to SQL injection attacks. I do plenty of format checking, this isn't the same. This is range checking. I expect my database to do that for me. > I do now seem to recall an agreement that a GUC switch to disable > date-interpretation guessing would be okay, though. I'm pretty sure it was the other way around, make strict locale / date checking the standard and a GUC to turn it off for folks who really want to use a broken database. :-)
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> On Wed, 18 Jun 2003, Tom Lane wrote:
>> I do now seem to recall an agreement that a GUC switch to disable
>> date-interpretation guessing would be okay, though.
> I'm pretty sure it was the other way around, make strict locale / date
> checking the standard and a GUC to turn it off for folks who really want
> to use a broken database.  :-)
This is definitely a case where what is "broken" is in the eye of the
beholder.  If the current behavior is broken, why have we had so few
complaints about it?  It's been like that for quite a few years now.
I think that on grounds of backwards compatibility alone, we should
leave the out-of-the-box default behavior as it is.
            regards, tom lane
			
		This is an old postgres "gotcha" that I've lived with since a long, long time ago. It forced me to always use the datestyle YYYY-MM-DD so there is never any confusion in my programming. I then convert the date to whatever the current user preference is, at the application level or with an SQL query call to to_char(). I suggest that this is a good practice to follow. - Andrew
On 18/06/2003 22:09 scott.marlowe wrote: > [snip] > Why bother checking for wrong dates sometimes (i.e. 03/33/2003) but not > all the time (i.e. 13/03/2003)? ISTR that DB2 used to do the same (maybe still does - haven't used it for a few years). Still, IMHO, it is frustrating. I would rather have the db be ultra-strict, forcing me to handle dates correctly and rigorously in my app. > [snip] > I do plenty of format checking, this isn't the same. This is range > checking. I expect my database to do that for me. Agreed, although if the user gets his data thrown back at him by the db then I feel that this is a bug in my program. YMMV. regards -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
On Wed, 2003-06-18 at 18:09, Paul Thomas wrote: > On 18/06/2003 22:09 scott.marlowe wrote: > > [snip] > > Why bother checking for wrong dates sometimes (i.e. 03/33/2003) but not > > all the time (i.e. 13/03/2003)? > > ISTR that DB2 used to do the same (maybe still does - haven't used it for > a few years). Still, IMHO, it is frustrating. I would rather have the db > be ultra-strict, forcing me to handle dates correctly and rigorously in my > app. > > > [snip] > > I do plenty of format checking, this isn't the same. This is range > > checking. I expect my database to do that for me. > > Agreed, although if the user gets his data thrown back at him by the db > then I feel that this is a bug in my program. YMMV. I hate to say, "Me too!", but I do agree with Paul and Scott. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
(Comments from the peanut gallery here) > > > IMHO it is a bug. We don't let postgresql "guess" about a lot of more > > > obvious things (i.e. int4 to int8 casting, etc...) and letting it guess > > > about dates makes it non-ACID compliant. > > > > How do you arrive at that conclusion? > > The same way I come to all my conclusions, logic. :-) but seriously... > > Why not accept a date of 04/44/2003 and just wrap it into May? It's > the same kind of thing. Is it? Similar, perhaps, but at least you can be pretty sure that 44 and 2003 are not valid months. (Not that I want the database fixing that for me, either.) > I told my database where I live, and expect it to > only accept dates that are valid in my locale. I wouldn't suggest that. Locale is not dependable because there is simply no dependable way of mapping, for example, IP address to a physical location, much less to a cultural location. Also, even if you had a truly accurate way to determine that your user was Japanese, you wouldn't know whether the user intended western era or Japanese era, whether the user was doing year-month-day according tradition here, or whether the user was trying to anticipate a western order because the web site was in English. (I'd give about a 90% odds that the date 01.02.03 entered into a web site by someone Japanese is going to mean 3 Feb 2001, 8% that it's going to mean 3rd of February of the year Heisei 1, and 2% that the user is going to try to second-guess and enter it in one of the western orders.) In the US, you will also have users who may be accustomed to using military or genealogy order, as well. > If a user feeds it a date > that isn't right, I expect the database to error out. I personally wouldn't want the order checked in the database. I think I'd prefer that the application set the order and that the database limit itself to range-checking the elements. > > > If it isn't a bug, how do I implement a check constraint to stop it from > > > happening? Good question. Another good question is how the database would implement the check. > > > I'd like to know my database accepts properly formatted input > > > and rejects the rest. That's what the C in ACID means, right? > > > > Do the checking in your application. > > I do. I make sure it's ##/##/#### (i.e. a simple regex works) That's not much of a check, of course. > The database already does the rest of the checking for me, it just happens > to think it might be helpful to coerce some bad dates for me, but others > that are obviously wrong are tossed out. If this means what it appears to mean, that, when the database is expecting day-month-year, it "fixes" 02/22/2003 to 2003.FEB.22, that would worry me. I'd prefer that sort of behavior to be off by default. I'd much rather have it just cough on a month 22. If attempted, I'd want it in the application layer. > Here's a scenario for how we can wind up teaching a user to enter dates > the wrong way. The day is 22 feb. They enter this date, in the US, where > mm/dd/yyyy is standard: > > 22/02/2003 > > The database converts it to > > 02/22/2003 silently. The application layer should report, explicitly, how it interpreted what was entered. Explicity feedback is the only way to make reasonably sure the user and the database are on the same wavelength. > ... > We continue the rest of the year this way. Somewhere along the line, the > user notices all their reports have the wrong date. Which ones were for > feb 03 and which ones were for march 02? We don't know. > > Why bother checking for wrong dates sometimes (i.e. 03/33/2003) but not > all the time (i.e. 13/03/2003)? > > > Something you think is improperly > > formatted probably shouldn't get to the database in the first place. > > Agreed. But that's not the point. It is properly formatted, i.e. > mm/dd/yyyy, it's just out of range. That's not the same at all. I think order would still be format, which is why I wouldn't want the database checking it, and especially not trying to fix it. >... > > I do now seem to recall an agreement that a GUC switch to disable > > date-interpretation guessing would be okay, though. > > I'm pretty sure it was the other way around, make strict locale / date > checking the standard and a GUC to turn it off for folks who really want > to use a broken database. :-) I would not want the database guessing the order from the locale, either. My JPY 2. -- Joel Rees, programmer, Kansai Systems Group Altech Corporation (Alpsgiken), Osaka, Japan http://www.alpsgiken.co.jp
On Wed, 18 Jun 2003, Tom Lane wrote: > > I'm pretty sure it was the other way around, make strict locale / date > > checking the standard and a GUC to turn it off > > I think that on grounds of backwards compatibility alone, we should > leave the out-of-the-box default behavior as it is. Who can rely on pg to accept dates that are "wrong"? If some program (or person) generates a date where the month is 22 according to the DateStyle, then pg can guess that it's a day and correct the date. But if someone relies on this then they will most likely also generate other dates that will be accepted and intepreted by pg in a different way. And you got broken data in the database. The reason it have worked before is because everyone who had it wrong before had to fix it anyway or get broken data into the database. -- /Dennis
On Thu, 2003-06-19 at 01:06, Joel Rees wrote: > (Comments from the peanut gallery here) > > > > > IMHO it is a bug. We don't let postgresql "guess" about a lot of more > > > > obvious things (i.e. int4 to int8 casting, etc...) and letting it guess > > > > about dates makes it non-ACID compliant. > > > > > > How do you arrive at that conclusion? > > > > The same way I come to all my conclusions, logic. :-) but seriously... > > > > Why not accept a date of 04/44/2003 and just wrap it into May? It's > > the same kind of thing. > > Is it? Similar, perhaps, but at least you can be pretty sure that 44 and > 2003 are not valid months. (Not that I want the database fixing that for > me, either.) > > > I told my database where I live, and expect it to > > only accept dates that are valid in my locale. > > I wouldn't suggest that. Locale is not dependable because there is > simply no dependable way of mapping, for example, IP address to a > physical location, much less to a cultural location. The locale specified by the SysAdmin should be canononical. [snip] > Good question. Another good question is how the database would implement > the check. Other databases do it. It can't be *that* hard to do. OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) is guaranteed to work. Have your app convert to that format before inserting, and then PostgreSQL is guaranteed to puke if there's a problem. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
On Thu, Jun 19, 2003 at 02:43:12 -0500,
  Ron Johnson <ron.l.johnson@cox.net> wrote:
>
> OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD)
> is guaranteed to work.  Have your app convert to that format before
> inserting, and then PostgreSQL is guaranteed to puke if there's
> a problem.
No it isn't. In 7.4:
area=> select '2003-20-02'::date;
    date
------------
 2003-02-20
(1 row)
			
		On Thu, 2003-06-19 at 05:35, Bruno Wolff III wrote: > On Thu, Jun 19, 2003 at 02:43:12 -0500, > Ron Johnson <ron.l.johnson@cox.net> wrote: > > > > OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) > > is guaranteed to work. Have your app convert to that format before > > inserting, and then PostgreSQL is guaranteed to puke if there's > > a problem. > > No it isn't. In 7.4: > area=> select '2003-20-02'::date; > date > ------------ > 2003-02-20 > (1 row) Whoa... It shouldn't be difficult, though, to create a function to validate dates, and have it be an implicit CHECK on date fields. Should it? -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
The problem is that you can't do the check on the _field_ since it has already been converted to a date. Jon On 19 Jun 2003, Ron Johnson wrote: > On Thu, 2003-06-19 at 05:35, Bruno Wolff III wrote: > > On Thu, Jun 19, 2003 at 02:43:12 -0500, > > Ron Johnson <ron.l.johnson@cox.net> wrote: > > > > > > OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) > > > is guaranteed to work. Have your app convert to that format before > > > inserting, and then PostgreSQL is guaranteed to puke if there's > > > a problem. > > > > No it isn't. In 7.4: > > area=> select '2003-20-02'::date; > > date > > ------------ > > 2003-02-20 > > (1 row) > > Whoa... > > It shouldn't be difficult, though, to create a function to validate > dates, and have it be an implicit CHECK on date fields. Should it? > > -- > +-----------------------------------------------------------+ > | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | > | Jefferson, LA USA http://members.cox.net/ron.l.johnson | > | | > | "Oh, great altar of passive entertainment, bestow upon me | > | thy discordant images at such speed as to render linear | > | thought impossible" (Calvin, regarding TV) | > +----------------------------------------------------------- > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
On Wed, 18 Jun 2003 15:09:44 -0600 (MDT), scott.marlowe wrote:
> Why not accept a date of 04/44/2003 and just wrap it into May?
Unbelievably, to_date() already does that in 7.3.3:
pmh=> select to_date('2003-04-44','yyyy-mm-dd');
  to_date
------------
 2003-05-14
(1 row)
I raised this issue along with the date component order switching in April.
Someone (possibly Karel Zak) did say that they were going to look into it,
but it doesn't look like anything got done about it in 7.3. Maybe it's
better in 7.4?
--
    Peter Haworth    pmh@edison.ioppublishing.com
"you have been evaluated.  you have a negative reference count.
 prepare to be garbage collected.  persistence is futile."
        -- Erik Naggum
			
		scott.marlowe writes: > That reminds me, did we get the date parsing fixed so that you can't > insert 22/03/2003 into a european database (or conversely, 03/22/2003 into > a US database) ? I.e the problem where the date parser assumed you meant > it the other way... I vaguely recall that we wanted to provide some kind of configuration option to make the date format checking more strict. If I'm mistaken, I would certainly be in favor of it now. But exactly how strict it ought to be is up for discussion. -- Peter Eisentraut peter_e@gmx.net
On Wed, 18 Jun 2003, Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > > On Wed, 18 Jun 2003, Tom Lane wrote: > >> I do now seem to recall an agreement that a GUC switch to disable > >> date-interpretation guessing would be okay, though. > > > I'm pretty sure it was the other way around, make strict locale / date > > checking the standard and a GUC to turn it off for folks who really want > > to use a broken database. :-) > > This is definitely a case where what is "broken" is in the eye of the > beholder. If the current behavior is broken, why have we had so few > complaints about it? It's been like that for quite a few years now. > > I think that on grounds of backwards compatibility alone, we should > leave the out-of-the-box default behavior as it is. I thought of another "silent failure" scenario. Imports. If you have say 10,000 rows to import, and all the dates are in euro style format going into a us style database, then all the ones where the "month" is <13 will be put in wrong, and all the ones with a "month" >12 will be silently converted to be right. Now half the dates are right, and half are wrong, and there's no error. That's the worst of possibilities. Better to fail grandly than silently corrupt data.
"Peter Haworth" <pmh@edison.ioppublishing.com> writes:
> On Wed, 18 Jun 2003 15:09:44 -0600 (MDT), scott.marlowe wrote:
>> Why not accept a date of 04/44/2003 and just wrap it into May?
> Unbelievably, to_date() already does that in 7.3.3:
It's only "unbelievable" if Oracle doesn't do the same thing.  to_date
and friends are Oracle compatibility functions and are intended to
slavishly imitate what Oracle does, warts and all (and there are
certainly plenty of warts there...)
Anyone care to run some tests to see how lax Oracle's to_date is?
            regards, tom lane
			
		On Thu, 19 Jun 2003, Andrew Snow wrote:
>
> This is an old postgres "gotcha" that I've lived with since a long, long
> time ago.  It forced me to always use the datestyle YYYY-MM-DD so there
> is never any confusion in my programming.
create table p (d date);
CREATE TABLE
insert into p values ('2003-13-03');
INSERT 4530055 1
insert into p values ('2003-03-13');
INSERT 4530056 1
select * from p;
     d
------------
 2003-03-13
 2003-03-13
That doesn't fix it, it just makes it more obvious to the user how they
should insert.  If you switch the month/day pgsql STILL swaps them back
for you.
			
		On Thu, 2003-06-19 at 08:39, Jonathan Bartlett wrote: > The problem is that you can't do the check on the _field_ since it has > already been converted to a date. You're right. How about a "before insert" trigger? > On 19 Jun 2003, Ron Johnson wrote: > > > On Thu, 2003-06-19 at 05:35, Bruno Wolff III wrote: > > > On Thu, Jun 19, 2003 at 02:43:12 -0500, > > > Ron Johnson <ron.l.johnson@cox.net> wrote: > > > > > > > > OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) > > > > is guaranteed to work. Have your app convert to that format before > > > > inserting, and then PostgreSQL is guaranteed to puke if there's > > > > a problem. > > > > > > No it isn't. In 7.4: > > > area=> select '2003-20-02'::date; > > > date > > > ------------ > > > 2003-02-20 > > > (1 row) > > > > Whoa... > > > > It shouldn't be difficult, though, to create a function to validate > > dates, and have it be an implicit CHECK on date fields. Should it? -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
On Thu, 19 Jun 2003, Bruno Wolff III wrote:
> On Thu, Jun 19, 2003 at 02:43:12 -0500,
>   Ron Johnson <ron.l.johnson@cox.net> wrote:
> >
> > OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD)
> > is guaranteed to work.  Have your app convert to that format before
> > inserting, and then PostgreSQL is guaranteed to puke if there's
> > a problem.
>
> No it isn't. In 7.4:
> area=> select '2003-20-02'::date;
>     date
> ------------
>  2003-02-20
> (1 row)
If the application always passes the date to Postgres with the three-letter
month name where appropriate, and use the 4-digit year, it should be
comparatively bulletproof.  At least, bulletproof in its interpretation --
the application can always garble things.
Not sure how this translates in different languages, though.
    -frank
			
		Frank Miles <fpm@u.washington.edu> writes:
> If the application always passes the date to Postgres with the three-letter
> month name where appropriate, and use the 4-digit year, it should be
> comparatively bulletproof.
That pretty much assumes that you've already validated the input and
converted it to an unambiguous form.
I think much of this discussion is missing the point.  ISTM when you're
dealing with programmatic output, it's fairly easy to ensure that you
are on the same page as the other program, and in that case there's a
good argument for being strict about the expected field order.  But
when you are dealing with hand-entered input, you *do not know* what
the user meant by input such as '01/03/2003'.  You may think you know,
but you're just fooling yourself.  The only really bulletproof way of
handling the matter is to close the loop by repeating the data back to
the user in an obviously unambiguous format, say 03-Jan-2003 or
01-Mar-2003.  If that wasn't what he meant, he can change it.  When you
handle things that way, there's a very good case for being as permissive
as possible in the parsing of the initial input.
PG's existing date parsing code is intended to support the second
scenario.  I don't mind offering an option to make it support the first
scenario better --- but I will resist ripping out support for the second.
            regards, tom lane
			
		On Thu, 19 Jun 2003, Frank Miles wrote: > On Thu, 19 Jun 2003, Bruno Wolff III wrote: > > > On Thu, Jun 19, 2003 at 02:43:12 -0500, > > Ron Johnson <ron.l.johnson@cox.net> wrote: > > > > > > OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) > > > is guaranteed to work. Have your app convert to that format before > > > inserting, and then PostgreSQL is guaranteed to puke if there's > > > a problem. > > > > No it isn't. In 7.4: > > area=> select '2003-20-02'::date; > > date > > ------------ > > 2003-02-20 > > (1 row) > > If the application always passes the date to Postgres with the three-letter > month name where appropriate, and use the 4-digit year, it should be > comparatively bulletproof. At least, bulletproof in its interpretation -- > the application can always garble things. I would say that whether the old 02/14/2003 -> 14/02/2003 conversion stuff stays in, the 2003-14-02 -> 2003-02-14 stuff should NOT. And the fact that "other databases" do it that way is not an argument. Postgresql has always had a higher standard re: data integrity than most databases. I can't imagine anyone actually preferring the silent conversion over the error, since it's a hit or miss thing and can result in bad data silently.
> > The problem is that you can't do the check on the _field_ since it has > > already been converted to a date. > > You're right. How about a "before insert" trigger? This suffers from the same problem, I believe. If I weren't a lazy bastard, I check it :) Jon
I think he was thiniing about making the **IN** and out of the date functions to be ANSI only. Bruno Wolff III wrote: > On Thu, Jun 19, 2003 at 02:43:12 -0500, > Ron Johnson <ron.l.johnson@cox.net> wrote: > >>OTOH, Andrew Snow's method (alway use ANSI standard YYYY-MM-DD) >>is guaranteed to work. Have your app convert to that format before >>inserting, and then PostgreSQL is guaranteed to puke if there's >>a problem. > > > No it isn't. In 7.4: > area=> select '2003-20-02'::date; > date > ------------ > 2003-02-20 > (1 row) > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
WHOA!
Peter Haworth wrote:
> On Wed, 18 Jun 2003 15:09:44 -0600 (MDT), scott.marlowe wrote:
>
>>Why not accept a date of 04/44/2003 and just wrap it into May?
>
>
> Unbelievably, to_date() already does that in 7.3.3:
>
> pmh=> select to_date('2003-04-44','yyyy-mm-dd');
>   to_date
> ------------
>  2003-05-14
> (1 row)
>
> I raised this issue along with the date component order switching in April.
> Someone (possibly Karel Zak) did say that they were going to look into it,
> but it doesn't look like anything got done about it in 7.3. Maybe it's
> better in 7.4?
>
			
		On Thu, 2003-06-19 at 09:54, Jonathan Bartlett wrote: > > > The problem is that you can't do the check on the _field_ since it has > > > already been converted to a date. > > > > You're right. How about a "before insert" trigger? > > This suffers from the same problem, I believe. If I weren't a lazy > bastard, I check it :) But you could code programatic to validate the date. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > But when you are dealing with hand-entered input, you *do not know* what > the user meant by input such as '01/03/2003'. You may think you know, > but you're just fooling yourself. I agree with this, but the actual test case someone posted was: 2003-13-03 2003-03-13 Parsing these both as the same date just seems fundamentally wrong in my opinion. Besides, although MM-DD-YYYY and DD-MM-YYYY are both rather common, YYYY-MM-DD is much more common than YYYY-DD-MM. I really think it should throw an error if it does not fit into YYYY-MM-DD. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200306191110 -----BEGIN PGP SIGNATURE----- Comment: http://www.turnstep.com/pgp.html iD8DBQE+8dMSvJuQZxSWSsgRAiaYAKDrZMWBy7pv81BjCeSzMdbyTsu3eQCg9exM vXj4GEQRhckbz2ygBmkX8z4= =WtaY -----END PGP SIGNATURE-----
Shouldn't dates be validated using the *LOCALE setting and not try to guess? Tom Lane wrote: > > Frank Miles <fpm@u.washington.edu> writes: > > If the application always passes the date to Postgres with the three-letter > > month name where appropriate, and use the 4-digit year, it should be > > comparatively bulletproof. > > That pretty much assumes that you've already validated the input and > converted it to an unambiguous form. > > I think much of this discussion is missing the point. ISTM when you're > dealing with programmatic output, it's fairly easy to ensure that you > are on the same page as the other program, and in that case there's a > good argument for being strict about the expected field order. But > when you are dealing with hand-entered input, you *do not know* what > the user meant by input such as '01/03/2003'. You may think you know, > but you're just fooling yourself. The only really bulletproof way of > handling the matter is to close the loop by repeating the data back to > the user in an obviously unambiguous format, say 03-Jan-2003 or > 01-Mar-2003. If that wasn't what he meant, he can change it. When you > handle things that way, there's a very good case for being as permissive > as possible in the parsing of the initial input. > > PG's existing date parsing code is intended to support the second > scenario. I don't mind offering an option to make it support the first > scenario better --- but I will resist ripping out support for the second. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> Anyone care to run some tests to see how lax Oracle's to_date is? I just did some checking on 9i to make sure things haven't changed since Oracle 7, it appears they have not. Oracle's to_date function is not lax at all. In fact, one of the things that has always bugged me about Oracle's to_date function is that it treats an invalid date as a SQL error, so you can't write a SQL script that fixes legacy data. (Returning NULL for an invalid date would help make that possible, for example.) I think to_number does the same thing for data that doesn't conform to the numeric model you give it. Oracle's to_date function will only accept a month from 01-12, it will only accept a day from 01-31 and will only accept a day which is valid for that month, so it won't take 02/29/2001 but will take 02/29/2000. Oracle's to_date function recognizes that 1900 was not a leap year and 2000 was. (Years that end in 00 must be divisible by 400 to be leap years.) I won't live to see it, but I wonder how many programs out there will have problems with the fact that 2100 is not a leap year? I've never had need for using the full range of dates, but according to the documentation for Oracle 7 (the last version I have the printed manuals for), Oracle's internal date format can store dates from 1/1/4712 BC to 12/31/4712 AD. Oracle 7 documentation claims that the year 0 does not exist, but if I subtract 5 days from 01/01/0001 I get 12/27/0000. (This may be a date reporting issue rather than a date conversion/storage, though.) I had never noticed this before, so I'm not sure whether Oracle or UNIX handles the Julian to Gregorian calendar change 'better'. It probably has to do with when one recognizes the calendar change. (This sounds like something from the History Channel, doesn't it?) 'cal 1752' indicates that September 3rd through the 13th were dropped to sync the calendar with the seasons. However, Oracle skips 10 days in October of 1582 and treats the dates 10/05/1582 - 10/14/1582 as missing dates. Dates entered in that range via to_date will be stored as 10/15/1582. -- Mike Nolan
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> Shouldn't dates be validated using the *LOCALE setting and not try to
> guess?
It would make sense to offer a "strict" mode in which the date order
has to be what DateStyle suggests.  I'm astonished that no one seems
to get the point that there are also good uses for "lax" parsing.
            regards, tom lane
			
		nolan@celery.tssi.com writes:
>> Anyone care to run some tests to see how lax Oracle's to_date is?
> Oracle's to_date function is not lax at all.
Then we need to fix ours.  Karel?
            regards, tom lane
			
		On Thu, 19 Jun 2003, Tom Lane wrote: > Jean-Luc Lachance <jllachan@nsd.ca> writes: > > Shouldn't dates be validated using the *LOCALE setting and not try to > > guess? > > It would make sense to offer a "strict" mode in which the date order > has to be what DateStyle suggests. I'm astonished that no one seems > to get the point that there are also good uses for "lax" parsing. Which are? Honestly, I see no good use for the lax parsing.
On Thu, 2003-06-19 at 11:22, Tom Lane wrote: > Jean-Luc Lachance <jllachan@nsd.ca> writes: > > Shouldn't dates be validated using the *LOCALE setting and not try to > > guess? > > It would make sense to offer a "strict" mode in which the date order > has to be what DateStyle suggests. I'm astonished that no one seems > to get the point that there are also good uses for "lax" parsing. Hey, I also only like the SERIALIZABLE transaction mode. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
> nolan@celery.tssi.com writes: > >> Anyone care to run some tests to see how lax Oracle's to_date is? > > > Oracle's to_date function is not lax at all. > > Then we need to fix ours. Karel? To make it as strict as Oracle's? I'm not entirely convinced that is a Good Thing. However, if that is to be done, would it also be possible to define a 'check_date' function which does the same as 'to_date' but returns NULL (or some kind of error code) rather than a SQL error on an invalid date? I've never tried it (and am somewhat embarassed to say that I had not even considered the idea until today), but it should possible in Oracle's PL/SQL to use exception handling to trap to_date errors. (I always wound up using oraperl's error handling capabilities to detect bad dates when porting data to Oracle.) Would that be possible in pgplsql, does it support the same levels of exception handling as PL/SQL? -- Mike Nolan
nolan@celery.tssi.com writes:
>>> Oracle's to_date function is not lax at all.
>>
>> Then we need to fix ours.  Karel?
> To make it as strict as Oracle's?  I'm not entirely convinced that is a
> Good Thing.
But to_date is a different animal, because it's working from a format
string that tells it exactly which field is which.  So as far as I can
see, there's no ambiguity; rather, the present behavior seems like it's
accepting demonstrably bogus data.
            regards, tom lane
			
		At 6/19/03 9:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >It would make sense to offer a "strict" mode in which the date order >has to be what DateStyle suggests. I'm astonished that no one seems >to get the point that there are also good uses for "lax" parsing. Hmmmm.... I would strongly disagree that there is any use for this. Research has shown that good human interface design is (in part) a product of systems that act consistently, allowing the user to develop a mental map of how the system works. Allowing "lax" parsing teaches the user an invalid mental map. For example, if a user starts working at a company on January 15, and finds he can successfully enter dates as 2003-15-01 (because the entry screen confirms the date as "Jan-15-2003"), the user could reasonably come to the conclusion that YYYY-DD-MM is the "correct" way to enter a date. Along comes February 1. The user enters the date as "2003-01-02", but he has been on the job for two weeks and no longer bothers reading the confirmation screen: he doesn't notice that it inexplicably shows "Jan-02-2003". The resulting bogus data stems directly from the system previously trying to "help" the user by doing what it thought he meant, instead of what he typed. Since it is impossible to support odd formats such as YYYY-DD-MM and interpret input in a consistent manner, PostgreSQL should not allow it. "Do what I mean instead of what I typed" data interpretation is already dodgy; "Do what I mean instead of what I typed, but only from January 13 through January 31, and not from February 1 to February 12" seems obviously flawed. -- Robert L Mathews, Tiger Technologies
Tom Lane writes: > It would make sense to offer a "strict" mode in which the date order > has to be what DateStyle suggests. I'm astonished that no one seems > to get the point that there are also good uses for "lax" parsing. There are different kinds of lax parsing. Lax parsing is great if you can enter any of January 8, 1999 1999-01-08 1/8/1999 990108 January 8 04:05:06 1999 PST and it will know what you mean, because of all these have their uses and are unambiguous (given a known day/month order). But automatically switching the declared day/month order based on heuristics is not that great. A human is not going to mentally switch his preferred day/month order within the same SQL session. -- Peter Eisentraut peter_e@gmx.net
Tom Lane writes:
> It would make sense to offer a "strict" mode in which the date order
> has to be what DateStyle suggests.
Something like this?
   set datestyle='iso,strict';
I could live with that.
--
    Peter Haworth    pmh@edison.ioppublishing.com
AD&D Lesson #4: In a Crisis, everyones' alignment is CE.
			
		On Fri, 2003-06-20 at 04:05, Peter Eisentraut wrote: > Tom Lane writes: > > > It would make sense to offer a "strict" mode in which the date order > > has to be what DateStyle suggests. I'm astonished that no one seems > > to get the point that there are also good uses for "lax" parsing. > > There are different kinds of lax parsing. > > Lax parsing is great if you can enter any of > > January 8, 1999 > 1999-01-08 > 1/8/1999 > 990108 > January 8 04:05:06 1999 PST > > and it will know what you mean, because of all these have their uses and > are unambiguous (given a known day/month order). Speaking just for myself (but maybe more people, based on their support of "strict date conversion"), lax dates should be handled by the app and have the app convert them into "strict format" (whatever that may be, based on LOCALE or "set" statement. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
Reading the subject, "creepy ... dates", that is exactly how I feel
about the described current date behavior --- "creepy".
Because I have only seen one person defend our current behavior, and
many object, I am going to add to TODO:
    * Allow current datestyle to restrict dates;  prevent month/day swapping
          from making invalid dates valid
    * Prevent month/day swapping of ISO dates to make invalid dates valid
I know someone was mentioning how bad it was that MySQL allows NULL in a
NOT NULL date field, and inserts 00-00-0000.  I think we are pretty
close to that with our current behavior.
---------------------------------------------------------------------------
Peter Eisentraut wrote:
> Tom Lane writes:
>
> > It would make sense to offer a "strict" mode in which the date order
> > has to be what DateStyle suggests.  I'm astonished that no one seems
> > to get the point that there are also good uses for "lax" parsing.
>
> There are different kinds of lax parsing.
>
> Lax parsing is great if you can enter any of
>
> January 8, 1999
> 1999-01-08
> 1/8/1999
> 990108
> January 8 04:05:06 1999 PST
>
> and it will know what you mean, because of all these have their uses and
> are unambiguous (given a known day/month order).
>
> But automatically switching the declared day/month order based on
> heuristics is not that great.  A human is not going to mentally switch his
> preferred day/month order within the same SQL session.
>
> --
> Peter Eisentraut   peter_e@gmx.net
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
>
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
			
		Bruce Momjian wrote: > > Reading the subject, "creepy ... dates", that is exactly how I feel > about the described current date behavior --- "creepy". > > Because I have only seen one person defend our current behavior, and > many object, I am going to add to TODO: > > * Allow current datestyle to restrict dates; prevent month/day swapping > from making invalid dates valid? > * Prevent month/day swapping of ISO dates to make invalid dates valid I added a question mark to the first item so we can consider it later. Most agreed on the second item, but a few thought the first one might be OK as is. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Sun, 2003-06-22 at 00:05, Bruce Momjian wrote: > Bruce Momjian wrote: > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > about the described current date behavior --- "creepy". > > > > Because I have only seen one person defend our current behavior, and > > many object, I am going to add to TODO: > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > from making invalid dates valid? > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > I added a question mark to the first item so we can consider it later. > Most agreed on the second item, but a few thought the first one might be > OK as is. How about situations where reversing the month and date would still have "valid but wrong" dates, based upon the LOCALE mask? I.e., "05/04/2003" is "05-April-2003" or "04-May-2003", depending on whether the LOCALE implies "DD/MM/YYYY" or "MM/DD/YYYY". -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
Ron Johnson wrote: > On Sun, 2003-06-22 at 00:05, Bruce Momjian wrote: > > Bruce Momjian wrote: > > > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > > about the described current date behavior --- "creepy". > > > > > > Because I have only seen one person defend our current behavior, and > > > many object, I am going to add to TODO: > > > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > > from making invalid dates valid? > > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > > > I added a question mark to the first item so we can consider it later. > > Most agreed on the second item, but a few thought the first one might be > > OK as is. > > How about situations where reversing the month and date would > still have "valid but wrong" dates, based upon the LOCALE mask? > > I.e., "05/04/2003" is "05-April-2003" or "04-May-2003", depending > on whether the LOCALE implies "DD/MM/YYYY" or "MM/DD/YYYY". > My assumption is that we already handlle these OK because we base it on datestyle. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Sun, 2003-06-22 at 12:46, Bruce Momjian wrote: > Ron Johnson wrote: > > On Sun, 2003-06-22 at 00:05, Bruce Momjian wrote: > > > Bruce Momjian wrote: > > > > > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > > > about the described current date behavior --- "creepy". > > > > > > > > Because I have only seen one person defend our current behavior, and > > > > many object, I am going to add to TODO: > > > > > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > > > from making invalid dates valid? > > > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > > > > > I added a question mark to the first item so we can consider it later. > > > Most agreed on the second item, but a few thought the first one might be > > > OK as is. > > > > How about situations where reversing the month and date would > > still have "valid but wrong" dates, based upon the LOCALE mask? > > > > I.e., "05/04/2003" is "05-April-2003" or "04-May-2003", depending > > on whether the LOCALE implies "DD/MM/YYYY" or "MM/DD/YYYY". > > > > My assumption is that we already handlle these OK because we base it on > datestyle. According to the OP, no. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
Ron Johnson wrote:
> On Sun, 2003-06-22 at 12:46, Bruce Momjian wrote:
> > Ron Johnson wrote:
> > > On Sun, 2003-06-22 at 00:05, Bruce Momjian wrote:
> > > > Bruce Momjian wrote:
> > > > >
> > > > > Reading the subject, "creepy ... dates", that is exactly how I feel
> > > > > about the described current date behavior --- "creepy".
> > > > >
> > > > > Because I have only seen one person defend our current behavior, and
> > > > > many object, I am going to add to TODO:
> > > > >
> > > > >     * Allow current datestyle to restrict dates;  prevent month/day swapping
> > > > >           from making invalid dates valid?
> > > > >     * Prevent month/day swapping of ISO dates to make invalid dates valid
> > > >
> > > > I added a question mark to the first item so we can consider it later.
> > > > Most agreed on the second item, but a few thought the first one might be
> > > > OK as is.
> > >
> > > How about situations where reversing the month and date would
> > > still have "valid but wrong" dates, based upon the LOCALE mask?
> > >
> > > I.e., "05/04/2003" is "05-April-2003" or "04-May-2003", depending
> > > on whether the LOCALE implies "DD/MM/YYYY" or "MM/DD/YYYY".
> > >
> >
> > My assumption is that we already handlle these OK because we base it on
> > datestyle.
>
> According to the OP, no.
Oh, you are right.  We base it on datestyle, rather than locale.  Is it
desiarable to default postgresql.conf datestyle to match the locale?
    #
    #       Locale settings
    #
    # (initialized by initdb -- may be changed)
    LC_MESSAGES = 'C'
    LC_MONETARY = 'C'
    LC_NUMERIC = 'C'
    LC_TIME = 'C'
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
			
		On Sun, 2003-06-22 at 15:47, Bruce Momjian wrote: > Ron Johnson wrote: > > On Sun, 2003-06-22 at 12:46, Bruce Momjian wrote: > > > Ron Johnson wrote: > > > > On Sun, 2003-06-22 at 00:05, Bruce Momjian wrote: > > > > > Bruce Momjian wrote: > > > > > > > > > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > > > > > about the described current date behavior --- "creepy". > > > > > > > > > > > > Because I have only seen one person defend our current behavior, and > > > > > > many object, I am going to add to TODO: > > > > > > > > > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > > > > > from making invalid dates valid? > > > > > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > > > > > > > > > I added a question mark to the first item so we can consider it later. > > > > > Most agreed on the second item, but a few thought the first one might be > > > > > OK as is. > > > > > > > > How about situations where reversing the month and date would > > > > still have "valid but wrong" dates, based upon the LOCALE mask? > > > > > > > > I.e., "05/04/2003" is "05-April-2003" or "04-May-2003", depending > > > > on whether the LOCALE implies "DD/MM/YYYY" or "MM/DD/YYYY". > > > > > > > > > > My assumption is that we already handlle these OK because we base it on > > > datestyle. > > > > According to the OP, no. > > Oh, you are right. We base it on datestyle, rather than locale. Is it > desiarable to default postgresql.conf datestyle to match the locale? > > # > # Locale settings > # > # (initialized by initdb -- may be changed) > LC_MESSAGES = 'C' > LC_MONETARY = 'C' > LC_NUMERIC = 'C' > LC_TIME = 'C' As long as it's overridable by a "masking set statement", does it matter? Well, it probably does, for consistency's sake. P.S. - candle.pha.pa.us rejects email from smtp.east.cox.net because "Delivery blocked --- Previous SPAM received from your mail server". That's blocking a *lot* of valid email, since cox.net is pretty large. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
Ron Johnson wrote: > > # > > # (initialized by initdb -- may be changed) > > LC_MESSAGES = 'C' > > LC_MONETARY = 'C' > > LC_NUMERIC = 'C' > > LC_TIME = 'C' > > As long as it's overridable by a "masking set statement", does > it matter? Well, it probably does, for consistency's sake. Certainly SET can mask postgresql.conf. > P.S. - candle.pha.pa.us rejects email from smtp.east.cox.net > because "Delivery blocked --- Previous SPAM received from your > mail server". That's blocking a *lot* of valid email, since > cox.net is pretty large. And blocks lots of spam ... block removed --- sorry. I am still adjusting the masks for the big ISP's. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
> Is it > desiarable to default postgresql.conf datestyle to match the locale? > > # > # Locale settings > # > # (initialized by initdb -- may be changed) > LC_MESSAGES = 'C' > LC_MONETARY = 'C' > LC_NUMERIC = 'C' > LC_TIME = 'C' Another JPY 2 from the nattou gallery: Defaulting the datestyle to the locale setting makes sense to me. I wouldn't want it hardwired to the locale, of course. I would strongly request having heuristics off in the default settings. Two conditions I really want on the use of heuristics -- I want the date itself to include some "fuzzy" flag, and I want some way to reconstruct the original string. That way, if I want to design an application with tools for getting operator attention, etc., there are hooks in the data. But that feels to me like something for the application layer. -- Joel Rees, programmer, Kansai Systems Group Altech Corporation (Alpsgiken), Osaka, Japan http://www.alpsgiken.co.jp
On Sun, 22 Jun 2003, Bruce Momjian wrote: > Bruce Momjian wrote: > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > about the described current date behavior --- "creepy". > > > > Because I have only seen one person defend our current behavior, and > > many object, I am going to add to TODO: > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > from making invalid dates valid? > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > I added a question mark to the first item so we can consider it later. > Most agreed on the second item, but a few thought the first one might be > OK as is. What are ISO dates? Are those the ones like 22 Feb 2003? Just wondering. The one thing that should absolutely be turned off is day/month swapping on dates of the form: 2003-02-22. I've seen little actual defense of the current behaviour, I'd say more like questioning whether or not we should change directions in mid stream than defense.
scott.marlowe wrote: > On Sun, 22 Jun 2003, Bruce Momjian wrote: > > > Bruce Momjian wrote: > > > > > > Reading the subject, "creepy ... dates", that is exactly how I feel > > > about the described current date behavior --- "creepy". > > > > > > Because I have only seen one person defend our current behavior, and > > > many object, I am going to add to TODO: > > > > > > * Allow current datestyle to restrict dates; prevent month/day swapping > > > from making invalid dates valid? > > > * Prevent month/day swapping of ISO dates to make invalid dates valid > > > > I added a question mark to the first item so we can consider it later. > > Most agreed on the second item, but a few thought the first one might be > > OK as is. > > What are ISO dates? Are those the ones like 22 Feb 2003? Just wondering. YYYY-MM-DD > The one thing that should absolutely be turned off is day/month swapping > on dates of the form: 2003-02-22. Right. > I've seen little actual defense of the current behaviour, I'd say more > like questioning whether or not we should change directions in mid stream > than defense. True. I do think we will have to make the change some day, and report it in the release notes. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> The one thing that should absolutely be turned off is day/month swapping
> on dates of the form: 2003-02-22.
Agreed on that.  YYYY-DD-MM isn't used in the real world AFAIK, and it's
reasonable to treat it as an error.
> I've seen little actual defense of the current behaviour,
Other than me, I think you mean.  dd/mm/yyyy and mm/dd/yyyy are
inherently ambiguous in the real world, and when you can clearly
determine what the intended meaning is, I think it's more reasonable
to assume the datestyle isn't set correctly than to reject the data.
            regards, tom lane
			
		I think rejecting the data is the best approach. I raises a big flag to the sysadmin or user. Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > >>The one thing that should absolutely be turned off is day/month swapping >>on dates of the form: 2003-02-22. > > > Agreed on that. YYYY-DD-MM isn't used in the real world AFAIK, and it's > reasonable to treat it as an error. > > >>I've seen little actual defense of the current behaviour, > > > Other than me, I think you mean. dd/mm/yyyy and mm/dd/yyyy are > inherently ambiguous in the real world, and when you can clearly > determine what the intended meaning is, I think it's more reasonable > to assume the datestyle isn't set correctly than to reject the data. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Tom Lane writes: > Other than me, I think you mean. dd/mm/yyyy and mm/dd/yyyy are > inherently ambiguous in the real world, and when you can clearly > determine what the intended meaning is, I think it's more reasonable > to assume the datestyle isn't set correctly than to reject the data. That might even make the slightest sense if the supposedly wrong datestyle would then stay switched. But the automatic switching only happens for a certain subsets of inputs and only in that instance. So if a user did really mean the opposite setting he will not be happy, and if the user did not mean the opposite setting he will not be happy either. So no one is happy. -- Peter Eisentraut peter_e@gmx.net
On Mon, 23 Jun 2003, Tom Lane wrote: > "scott.marlowe" <scott.marlowe@ihs.com> writes: > > The one thing that should absolutely be turned off is day/month swapping > > on dates of the form: 2003-02-22. > > Agreed on that. YYYY-DD-MM isn't used in the real world AFAIK, and it's > reasonable to treat it as an error. > > > I've seen little actual defense of the current behaviour, > > Other than me, I think you mean. dd/mm/yyyy and mm/dd/yyyy are > inherently ambiguous in the real world, and when you can clearly > determine what the intended meaning is, I think it's more reasonable > to assume the datestyle isn't set correctly than to reject the data. I thought the locale set that kind of behaviour didn't it? If so, then it's better to fail loudly then quietly accept bad data. But if the locale doesn't define such a thing, or it can't be set in postgresql.conf, the it's best to just avoid that date style altogether.
Peter Eisentraut wrote: > Tom Lane writes: > > > Other than me, I think you mean. dd/mm/yyyy and mm/dd/yyyy are > > inherently ambiguous in the real world, and when you can clearly > > determine what the intended meaning is, I think it's more reasonable > > to assume the datestyle isn't set correctly than to reject the data. > > That might even make the slightest sense if the supposedly wrong datestyle > would then stay switched. But the automatic switching only happens for a > certain subsets of inputs and only in that instance. So if a user did > really mean the opposite setting he will not be happy, and if the user did > not mean the opposite setting he will not be happy either. So no one is > happy. I think we have had enough discussion to remove the question mark from this TODO item: * Allow current datestyle to restrict dates; prevent month/day swapping from making invalid dates valid Of course, if later discussion changes, I will re-add it. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
scott.marlowe wrote:
> On Mon, 23 Jun 2003, Tom Lane wrote:
>
> > "scott.marlowe" <scott.marlowe@ihs.com> writes:
> > > The one thing that should absolutely be turned off is day/month swapping
> > > on dates of the form: 2003-02-22.
> >
> > Agreed on that.  YYYY-DD-MM isn't used in the real world AFAIK, and it's
> > reasonable to treat it as an error.
> >
> > > I've seen little actual defense of the current behaviour,
> >
> > Other than me, I think you mean.  dd/mm/yyyy and mm/dd/yyyy are
> > inherently ambiguous in the real world, and when you can clearly
> > determine what the intended meaning is, I think it's more reasonable
> > to assume the datestyle isn't set correctly than to reject the data.
>
> I thought the locale set that kind of behaviour didn't it?  If so, then
> it's better to fail loudly then quietly accept bad data.  But if the
> locale doesn't define such a thing, or it can't be set in postgresql.conf,
> the it's best to just avoid that date style altogether.
Added to TODO, with question mark:
    * Have initdb set DateStyle based on locale?
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
			
		Scott wrote: > What are ISO dates? Are those the ones like 22 Feb 2003? Just wondering. Nope, they're the things like 1999-01-08. The harder to read '930214T131030' is also a valid ISO_8601 format. http://www.mcs.vuw.ac.nz/technical/software/SGML/doc/iso8601/ISO8601.html ISO 8601 specifies periods of time too. For example, ISO-8601's 'P18Y9M' = 18 years, 9 months. Would people be interested if I submitted a patch that converted ISO 8601 "periods of time" to postgresql "intervals"? Ron
At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: >Added to TODO, with question mark: > > * Have initdb set DateStyle based on locale? Given various issues with locale (indexes, ordering etc) I'd think that having a DB follow the O/S locale should be special case and require explicit configuration. More so if certain locales are significantly slower than others which seemed to be the case at least in recent memory. What if a European DB backed website is hosted on a US server with English, French and German data? If apps/programs are talking to DBs more than people are then it may make more sense to store things in an application friendly format e.g. (date = YYYY-MM-DD, or seconds since epoch) format and having the app convert it based on the user's preferences. After all even in English, apps may choose to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. Unless postgresql has special features allowing switching from one locale to another on the fly (including indexes, ordering etc) within a DB session, I'd rather stick to say the C locale, or whatever it is that's fastest. Another point of consideration: if someone accidentally loads multibyte/other locale data into a C locale DB (or whatever is chosen as default DB locale), would dumping the loaded data and reloading it into a multibyte locale result in information/precision loss? Link.
On Tue, 2003-06-24 at 18:01, Lincoln Yeoh wrote: > > What if a European DB backed website is hosted on a US server with English, > French and German data? Dates are no big deal. The front end can translate PostgreSQL format dates to/from whatever format you want. My big challenge (I am in the above case - UK server [rackspace] and French data) is the decimal separator. I can't figure how to get a "," as my decimal seperator. If I try modifying postgresql.conf the postmaster refuses to start. I _don't_ want to write a fance procedure into each and every JSP to work around this. And my client refuses to use the "."... Cheers Tony Grant -- www.tgds.net Library management software toolkit, redhat linux on Sony Vaio C1XD, Dreamweaver MX with Tomcat and PostgreSQL
We are actually considering not honoring locale for initdb encodings, so it might make no sense to do this --- that another reason for the question mark, but until we decide, it is an open issue. --------------------------------------------------------------------------- Lincoln Yeoh wrote: > At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: > > >Added to TODO, with question mark: > > > > * Have initdb set DateStyle based on locale? > > Given various issues with locale (indexes, ordering etc) I'd think that > having a DB follow the O/S locale should be special case and require > explicit configuration. > > More so if certain locales are significantly slower than others which > seemed to be the case at least in recent memory. > > What if a European DB backed website is hosted on a US server with English, > French and German data? > > If apps/programs are talking to DBs more than people are then it may make > more sense to store things in an application friendly format e.g. (date = > YYYY-MM-DD, or seconds since epoch) format and having the app convert it > based on the user's preferences. After all even in English, apps may choose > to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. > > Unless postgresql has special features allowing switching from one locale > to another on the fly (including indexes, ordering etc) within a DB > session, I'd rather stick to say the C locale, or whatever it is that's > fastest. > > Another point of consideration: if someone accidentally loads > multibyte/other locale data into a C locale DB (or whatever is chosen as > default DB locale), would dumping the loaded data and reloading it into a > multibyte locale result in information/precision loss? > > Link. > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
I thought it was more correctly we were considering not using the the system locale automatically, but that if someone wished to use --locale=en_US we'd let that work, right? I would assume that if someone actually went to the bother of setting a locale, then it should be the deciding factor in how we handle dates, et. al. On Tue, 24 Jun 2003, Bruce Momjian wrote: > > We are actually considering not honoring locale for initdb encodings, so > it might make no sense to do this --- that another reason for the > question mark, but until we decide, it is an open issue. > > --------------------------------------------------------------------------- > > Lincoln Yeoh wrote: > > At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: > > > > >Added to TODO, with question mark: > > > > > > * Have initdb set DateStyle based on locale? > > > > Given various issues with locale (indexes, ordering etc) I'd think that > > having a DB follow the O/S locale should be special case and require > > explicit configuration. > > > > More so if certain locales are significantly slower than others which > > seemed to be the case at least in recent memory. > > > > What if a European DB backed website is hosted on a US server with English, > > French and German data? > > > > If apps/programs are talking to DBs more than people are then it may make > > more sense to store things in an application friendly format e.g. (date = > > YYYY-MM-DD, or seconds since epoch) format and having the app convert it > > based on the user's preferences. After all even in English, apps may choose > > to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. > > > > Unless postgresql has special features allowing switching from one locale > > to another on the fly (including indexes, ordering etc) within a DB > > session, I'd rather stick to say the C locale, or whatever it is that's > > fastest. > > > > Another point of consideration: if someone accidentally loads > > multibyte/other locale data into a C locale DB (or whatever is chosen as > > default DB locale), would dumping the loaded data and reloading it into a > > multibyte locale result in information/precision loss? > > > > Link. > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > >
Good point. --------------------------------------------------------------------------- scott.marlowe wrote: > I thought it was more correctly we were considering not using the the > system locale automatically, but that if someone wished to use > --locale=en_US we'd let that work, right? > > I would assume that if someone actually went to the bother of setting a > locale, then it should be the deciding factor in how we handle dates, et. > al. > > On Tue, 24 Jun 2003, Bruce Momjian wrote: > > > > > We are actually considering not honoring locale for initdb encodings, so > > it might make no sense to do this --- that another reason for the > > question mark, but until we decide, it is an open issue. > > > > --------------------------------------------------------------------------- > > > > Lincoln Yeoh wrote: > > > At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: > > > > > > >Added to TODO, with question mark: > > > > > > > > * Have initdb set DateStyle based on locale? > > > > > > Given various issues with locale (indexes, ordering etc) I'd think that > > > having a DB follow the O/S locale should be special case and require > > > explicit configuration. > > > > > > More so if certain locales are significantly slower than others which > > > seemed to be the case at least in recent memory. > > > > > > What if a European DB backed website is hosted on a US server with English, > > > French and German data? > > > > > > If apps/programs are talking to DBs more than people are then it may make > > > more sense to store things in an application friendly format e.g. (date = > > > YYYY-MM-DD, or seconds since epoch) format and having the app convert it > > > based on the user's preferences. After all even in English, apps may choose > > > to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. > > > > > > Unless postgresql has special features allowing switching from one locale > > > to another on the fly (including indexes, ordering etc) within a DB > > > session, I'd rather stick to say the C locale, or whatever it is that's > > > fastest. > > > > > > Another point of consideration: if someone accidentally loads > > > multibyte/other locale data into a C locale DB (or whatever is chosen as > > > default DB locale), would dumping the loaded data and reloading it into a > > > multibyte locale result in information/precision loss? > > > > > > Link. > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 8: explain analyze is your friend > > > > > > > > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Wasn't a 'set' command also discussed to override locale? On Tue, 2003-06-24 at 16:02, Bruce Momjian wrote: > Good point. > > --------------------------------------------------------------------------- > > scott.marlowe wrote: > > I thought it was more correctly we were considering not using the the > > system locale automatically, but that if someone wished to use > > --locale=en_US we'd let that work, right? > > > > I would assume that if someone actually went to the bother of setting a > > locale, then it should be the deciding factor in how we handle dates, et. > > al. > > > > On Tue, 24 Jun 2003, Bruce Momjian wrote: > > > > > > > > We are actually considering not honoring locale for initdb encodings, so > > > it might make no sense to do this --- that another reason for the > > > question mark, but until we decide, it is an open issue. > > > > > > --------------------------------------------------------------------------- > > > > > > Lincoln Yeoh wrote: > > > > At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: > > > > > > > > >Added to TODO, with question mark: > > > > > > > > > > * Have initdb set DateStyle based on locale? > > > > > > > > Given various issues with locale (indexes, ordering etc) I'd think that > > > > having a DB follow the O/S locale should be special case and require > > > > explicit configuration. > > > > > > > > More so if certain locales are significantly slower than others which > > > > seemed to be the case at least in recent memory. > > > > > > > > What if a European DB backed website is hosted on a US server with English, > > > > French and German data? > > > > > > > > If apps/programs are talking to DBs more than people are then it may make > > > > more sense to store things in an application friendly format e.g. (date = > > > > YYYY-MM-DD, or seconds since epoch) format and having the app convert it > > > > based on the user's preferences. After all even in English, apps may choose > > > > to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. > > > > > > > > Unless postgresql has special features allowing switching from one locale > > > > to another on the fly (including indexes, ordering etc) within a DB > > > > session, I'd rather stick to say the C locale, or whatever it is that's > > > > fastest. > > > > > > > > Another point of consideration: if someone accidentally loads > > > > multibyte/other locale data into a C locale DB (or whatever is chosen as > > > > default DB locale), would dumping the loaded data and reloading it into a > > > > multibyte locale result in information/precision loss? > > > > > > > > Link. -- +-----------------------------------------------------------+ | Ron Johnson, Jr. Home: ron.l.johnson@cox.net | | Jefferson, LA USA http://members.cox.net/ron.l.johnson | | | | "Oh, great altar of passive entertainment, bestow upon me | | thy discordant images at such speed as to render linear | | thought impossible" (Calvin, regarding TV) | +-----------------------------------------------------------
There's already a DateStyle guc that can be set. Would that be the one, or would we need something else? On 24 Jun 2003, Ron Johnson wrote: > Wasn't a 'set' command also discussed to override locale? > > On Tue, 2003-06-24 at 16:02, Bruce Momjian wrote: > > Good point. > > > > --------------------------------------------------------------------------- > > > > scott.marlowe wrote: > > > I thought it was more correctly we were considering not using the the > > > system locale automatically, but that if someone wished to use > > > --locale=en_US we'd let that work, right? > > > > > > I would assume that if someone actually went to the bother of setting a > > > locale, then it should be the deciding factor in how we handle dates, et. > > > al. > > > > > > On Tue, 24 Jun 2003, Bruce Momjian wrote: > > > > > > > > > > > We are actually considering not honoring locale for initdb encodings, so > > > > it might make no sense to do this --- that another reason for the > > > > question mark, but until we decide, it is an open issue. > > > > > > > > --------------------------------------------------------------------------- > > > > > > > > Lincoln Yeoh wrote: > > > > > At 03:24 PM 6/23/2003 -0400, Bruce Momjian wrote: > > > > > > > > > > >Added to TODO, with question mark: > > > > > > > > > > > > * Have initdb set DateStyle based on locale? > > > > > > > > > > Given various issues with locale (indexes, ordering etc) I'd think that > > > > > having a DB follow the O/S locale should be special case and require > > > > > explicit configuration. > > > > > > > > > > More so if certain locales are significantly slower than others which > > > > > seemed to be the case at least in recent memory. > > > > > > > > > > What if a European DB backed website is hosted on a US server with English, > > > > > French and German data? > > > > > > > > > > If apps/programs are talking to DBs more than people are then it may make > > > > > more sense to store things in an application friendly format e.g. (date = > > > > > YYYY-MM-DD, or seconds since epoch) format and having the app convert it > > > > > based on the user's preferences. After all even in English, apps may choose > > > > > to display Tuesday as T, Tue, Tuesday, or whatever the Boss wants. > > > > > > > > > > Unless postgresql has special features allowing switching from one locale > > > > > to another on the fly (including indexes, ordering etc) within a DB > > > > > session, I'd rather stick to say the C locale, or whatever it is that's > > > > > fastest. > > > > > > > > > > Another point of consideration: if someone accidentally loads > > > > > multibyte/other locale data into a C locale DB (or whatever is chosen as > > > > > default DB locale), would dumping the loaded data and reloading it into a > > > > > multibyte locale result in information/precision loss? > > > > > > > > > > Link. > >