Обсуждение: How to add month.year column validation
Database contains CHAR(7) type column which must be NOT NULL and must be in the format mm.yyyy where: mm - month number, always two digits in range 01 .. 12 . - separator must be point always. yyyy - must be four digits in range approx. 1980 .. 2110 . How to add column validation to table column which forces this ? Andrus.
Hi Andrus, > > How to add column validation to table column which forces this ? You're looking for a constraint - I presume you know what conditions you want to check for - I've done a similar thing recentlyensuring the first character in a column is a decimal point, or that if one column's value is something, anothercolumn's value is within a certain range etc. http://www.postgresql.org/docs/8.4/static/ddl-constraints.html (assuming you're running 8.4) Regards, Andy
On Dec 22, 2009, at 15:03 , Andrus wrote:
> Database contains CHAR(7) type column which must be NOT NULL and
> must be in the format
>
> mm.yyyy
>
> where:
>
> mm - month number, always two digits in range 01 .. 12
>
> . - separator  must be point always.
>
> yyyy - must be four digits in range approx. 1980 .. 2110 .
>
> How to add column validation to table column which forces this ?
A check constraint with regex might work, something like (untested)
CHECK (val ~ $re$^(19|20|21)[0-9]{2}.[01][0-9]$$re$)
However, I strongly recommend using a date column with, perhaps, a
restriction that the day field is always 1 or some other agreed-upon
(and documented) value (e.g., CHECK (val = date_truc('month', val))).
If the data is date data, you're likely going to want to do other
operations on the field which will be much easier if it's already a
date value.
Michael Glaesemann
grzm seespotcode net
			
		> -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general- > owner@postgresql.org] On Behalf Of Andrus > Sent: Tuesday, December 22, 2009 12:03 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] How to add month.year column validation > > Database contains CHAR(7) type column which must be NOT NULL and must > be in > the format > > mm.yyyy > > where: > > mm - month number, always two digits in range 01 .. 12 > > . - separator must be point always. > > yyyy - must be four digits in range approx. 1980 .. 2110 . > > How to add column validation to table column which forces this ? If you use a date (IOW, just always use a day of 1) then error detection is automatic. It also only consumes 4 bytes of storage. You can also use a check constraint, or create a domain. http://www.postgresql.org/docs/8.4/static/ddl-constraints.html http://www.postgresql.org/docs/current/static/sql-createdomain.html
Michael,
Thank you very much.
I have very few knowledge on rexexps.
> CHECK (val ~ $re$^(19|20|21)[0-9]{2}.[01][0-9]$$re$)
1. I tried
create temp table test5 ( tmkuu char (7) CHECK (tmkuu ~
$re$[01][0-9].^(19|20)[0-9]{2}$re$) ) on commit drop;
insert into test5 values('01.2009');
but got error
ERROR:  new row for relation "test5" violates check constraint
"test5_tmkuu_check"
2. How to restrict month numbers to range 01 .. 12 ?
This regex seems to accept month numbers like 13
3.  How to add this check to existing column for 8.0 and later servers ?
I tried
alter table test5 alter tmkuu add CHECK (tmkuu ~
$re$[01][0-9].^(19|20)[0-9]{2}$re$  )
causes error:
ERROR:  syntax error at or near "add"
> However, I strongly recommend using a date column with, perhaps, a
> restriction that the day field is always 1 or some other agreed-upon  (and
> documented) value (e.g., CHECK (val = date_truc('month', val))).  If the
> data is date data, you're likely going to want to do other  operations on
> the field which will be much easier if it's already a  date value.
This is existing database and many application are using it.
I cannot change column type to date since other applications are expecting
char(7) column.
As far as I know it is not possible to makse such change so that externally
it is visible and writeable as char(7) column for 8.0+ servers.
Andrus.
			
		CREATE DOMAIN Nasty_Month_year AS CHAR(7)
CHECK ( SUBSTR(VALUE, 1, 2) IN ('01', '02', '03', '04', '05', '06',
'07', '08', '09', '10', '11', '12') AND SUBSTR(VALUE, 3, 1) = '.' AND
SUBSTR(VALUE, 4, 4)::int BETWEEN 1980 and 2110 );
CREATE TABLE foo (bar Nasty_Month_year );
This Works:
INSERT INTO foo VALUES ('02.1997');
This:
INSERT INTO foo VALUES ('13.1997');
Gives this:
ERROR:  value for domain nasty_month_year violates check constraint
"nasty_month_year_check"
********** Error **********
ERROR: value for domain nasty_month_year violates check constraint
"nasty_month_year_check"
SQL state: 23514
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Andrus
> Sent: Tuesday, December 22, 2009 12:47 PM
> To: Michael Glaesemann
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] How to add month.year column validation
>
> Michael,
>
> Thank you very much.
> I have very few knowledge on rexexps.
>
> > CHECK (val ~ $re$^(19|20|21)[0-9]{2}.[01][0-9]$$re$)
>
> 1. I tried
>
> create temp table test5 ( tmkuu char (7) CHECK (tmkuu ~
> $re$[01][0-9].^(19|20)[0-9]{2}$re$) ) on commit drop;
> insert into test5 values('01.2009');
>
> but got error
>
> ERROR:  new row for relation "test5" violates check constraint
> "test5_tmkuu_check"
>
> 2. How to restrict month numbers to range 01 .. 12 ?
> This regex seems to accept month numbers like 13
>
> 3.  How to add this check to existing column for 8.0 and later servers
> ?
> I tried
>
> alter table test5 alter tmkuu add CHECK (tmkuu ~
> $re$[01][0-9].^(19|20)[0-9]{2}$re$  )
>
> causes error:
>
> ERROR:  syntax error at or near "add"
>
>
> > However, I strongly recommend using a date column with, perhaps, a
> > restriction that the day field is always 1 or some other agreed-upon
> (and
> > documented) value (e.g., CHECK (val = date_truc('month', val))).  If
> the
> > data is date data, you're likely going to want to do other
> operations on
> > the field which will be much easier if it's already a  date value.
>
> This is existing database and many application are using it.
> I cannot change column type to date since other applications are
> expecting
> char(7) column.
>
> As far as I know it is not possible to makse such change so that
> externally
> it is visible and writeable as char(7) column for 8.0+ servers.
>
> Andrus.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
			
		On Tue, Dec 22, 2009 at 1:47 PM, Andrus <kobruleht2@hot.ee> wrote:
> Michael,
>
> Thank you very much.
> I have very few knowledge on rexexps.
>
>> CHECK (val ~ $re$^(19|20|21)[0-9]{2}.[01][0-9]$$re$)
>
> 1. I tried
>
> create temp table test5 ( tmkuu char (7) CHECK (tmkuu ~
> $re$[01][0-9].^(19|20)[0-9]{2}$re$) ) on commit drop;
> insert into test5 values('01.2009');
>
> but got error
>
> ERROR:  new row for relation "test5" violates check constraint
> "test5_tmkuu_check"
Yeah, you've got some characters in odd places there (^ in particular,
$ missing at end.)  Here's one that pretty much works:
create table test (a text check (a ~
$r$^(0[1-9]|1[0-2]).(19[89][0-9]|20[0-9]{2}|210[0-9]|2110)$$r$));
>
> 2. How to restrict month numbers to range 01 .. 12 ?
> This regex seems to accept month numbers like 13
Make one that doesn't accept 13.
> 3.  How to add this check to existing column for 8.0 and later servers ?
> I tried
>
> alter table test5 alter tmkuu add CHECK (tmkuu ~
> $re$[01][0-9].^(19|20)[0-9]{2}$re$  )
>
> causes error:
>
> ERROR:  syntax error at or near "add"
Left out a keyword, "constraint"
alter table test add constraint test_a_check check (a ~
$r$^(0[1-9]|1[0-2]).(19[89][0-9]|20[0-9]{2}|210[0-9]|2110)$$r$);
>> However, I strongly recommend using a date column with, perhaps, a
>> restriction that the day field is always 1 or some other agreed-upon  (and
>> documented) value (e.g., CHECK (val = date_truc('month', val))).  If the
>> data is date data, you're likely going to want to do other  operations on
>> the field which will be much easier if it's already a  date value.
>
> This is existing database and many application are using it.
> I cannot change column type to date since other applications are expecting
> char(7) column.
Just so you know, down this road lies madness.  But you can make it
work.  I've been down this path before too.  Eventually your
developers doing dba work will paint you into a corner.
			
		
On Dec 22, 2009, at 16:11 , Scott Marlowe wrote:
> alter table test add constraint test_a_check check (a ~
> $r$^(0[1-9]|1[0-2]).(19[89][0-9]|20[0-9]{2}|210[0-9]|2110)$$r$);
>
>>> However, I strongly recommend using a date column with, perhaps, a
>>> restriction that the day field is always 1 or some other agreed-
>>> upon  (and
>>> documented) value (e.g., CHECK (val = date_truc('month', val))).
>>> If the
>>> data is date data, you're likely going to want to do other
>>> operations on
>>> the field which will be much easier if it's already a  date value.
>>
>> This is existing database and many application are using it.
>> I cannot change column type to date since other applications are
>> expecting
>> char(7) column.
>
> Just so you know, down this road lies madness.
I completely agree. Schedule some downtime and make the column a date
column.
Michael Glaesemann
grzm seespotcode net
			
		
----- "Dann Corbit" <DCorbit@connx.com> wrote:
> CREATE DOMAIN Nasty_Month_year AS CHAR(7)
> CHECK ( SUBSTR(VALUE, 1, 2) IN ('01', '02', '03', '04', '05', '06',
> '07', '08', '09', '10', '11', '12') AND SUBSTR(VALUE, 3, 1) = '.' AND
> SUBSTR(VALUE, 4, 4)::int BETWEEN 1980 and 2110 );
>
> CREATE TABLE foo (bar Nasty_Month_year );
>
> This Works:
> INSERT INTO foo VALUES ('02.1997');
>
> This:
> INSERT INTO foo VALUES ('13.1997');
> Gives this:
> ERROR:  value for domain nasty_month_year violates check constraint
> "nasty_month_year_check"
>
> ********** Error **********
>
> ERROR: value for domain nasty_month_year violates check constraint
> "nasty_month_year_check"
> SQL state: 23514
>
My attempt at this problem:
test=# \d date_check
          Table "public.date_check"
  Column  |         Type         | Modifiers
----------+----------------------+-----------
 id       | integer              |
 date_fld | character varying(7) |
Check constraints:
    "date_check_date_fld_check" CHECK (isfinite(replace(date_fld::text, '.'::text, '/01/'::text)::date) AND
length(date_fld::text)= 7) 
test=# INSERT INTO date_check VALUES (1,'12.2009');
INSERT 0 1
test=# INSERT INTO date_check VALUES (1,'13.2009');
ERROR:  date/time field value out of range: "13/01/2009"
HINT:  Perhaps you need a different "datestyle" setting.
test=# INSERT INTO date_check VALUES (1,'12.09');
ERROR:  new row for relation "date_check" violates check constraint "date_check_date_fld_check"
test=# SELECT * from date_check ;
 id | date_fld
----+----------
  1 | 12.2009
(1 row)
Adrian Klaver
aklaver@comcast.net
			
		Dann,
>CREATE DOMAIN Nasty_Month_year AS CHAR(7)
>CHECK ( SUBSTR(VALUE, 1, 2) IN ('01', '02', '03', '04', '05', '06',
>'07', '08', '09', '10', '11', '12') AND SUBSTR(VALUE, 3, 1) = '.' AND
>SUBSTR(VALUE, 4, 4)::int BETWEEN 1980 and 2110 );
>CREATE TABLE foo (bar Nasty_Month_year );
Thank you. This looks better than regexp since it conforms to SQL standard.
regexps are PostgreSql specific.
I created test script to test changing existing database column with
possibly wrong data:
CREATE DOMAIN MonthYear AS CHAR(7) NOT NULL
CHECK ( SUBSTR(VALUE, 1, 2) IN ('01', '02', '03', '04', '05', '06',
'07', '08', '09', '10', '11', '12') AND SUBSTR(VALUE, 3, 1) = '.' AND
SUBSTR(VALUE, 4, 4)::int BETWEEN 1980 and 2100 );
CREATE TABLE foo (tmkuu char(7));
-- add bad data
INSERT INTO foo VALUES ('');
INSERT INTO foo VALUES ('02.x');
INSERT INTO foo VALUES ('02.1970');
INSERT INTO foo VALUES ('02.2101');
INSERT INTO foo VALUES (NULL);
-- add good data
INSERT INTO foo VALUES ('12.2009');
delete from foo where tmkuu is null OR NOT ( SUBSTR(tmkuu, 1, 2) IN ('01',
'02', '03', '04', '05', '06',
'07', '08', '09', '10', '11', '12') AND SUBSTR(tmkuu, 3, 1) = '.' AND
SUBSTR(tmkuu, 4, 4)::int BETWEEN 1980 and 2100 );
alter table foo alter tmkuu type MonthYear;
select * from foo;
but got error on DELETE:
ERROR:  invalid input syntax for integer: "x"
How to apply this constraint to existing data ?
Andrus.
			
		Andrus,
>
> -- add bad data
> INSERT INTO foo VALUES ('');
> INSERT INTO foo VALUES ('02.x');
> INSERT INTO foo VALUES ('02.1970');
> INSERT INTO foo VALUES ('02.2101');
> INSERT INTO foo VALUES (NULL);
> -- add good data
> INSERT INTO foo VALUES ('12.2009');
>
> delete from foo where tmkuu is null OR NOT ( SUBSTR(tmkuu, 1, 2) IN ('01', '02', '03', '04', '05', '06',
> '07', '08', '09', '10', '11', '12') AND SUBSTR(tmkuu, 3, 1) = '.' AND
> SUBSTR(tmkuu, 4, 4)::int BETWEEN 1980 and 2100 );
>
> alter table foo alter tmkuu type MonthYear;
> select * from foo;
>
> but got error on DELETE:
>
> ERROR:  invalid input syntax for integer: "x"
>
> How to apply this constraint to existing data ?
Remove the data that doesn't conform to the constraint.
You got the error because you're trying to cast SUBSTR(tmkuu, 4, 4) to an integer in your DELETE statement - but in the
caseof the second record, that expression cannot be an integer (because of the x) hence the error. 
Regards,
Andy
			
		On Wed, Dec 23, 2009 at 12:02 PM, Andy Shellam
<andy-lists@networkmail.eu> wrote:
> Andrus,
>
>>
>> -- add bad data
>> INSERT INTO foo VALUES ('');
>> INSERT INTO foo VALUES ('02.x');
>> INSERT INTO foo VALUES ('02.1970');
>> INSERT INTO foo VALUES ('02.2101');
>> INSERT INTO foo VALUES (NULL);
>> -- add good data
>> INSERT INTO foo VALUES ('12.2009');
>>
>> delete from foo where tmkuu is null OR NOT ( SUBSTR(tmkuu, 1, 2) IN ('01', '02', '03', '04', '05', '06',
>> '07', '08', '09', '10', '11', '12') AND SUBSTR(tmkuu, 3, 1) = '.' AND
>> SUBSTR(tmkuu, 4, 4)::int BETWEEN 1980 and 2100 );
>>
>> alter table foo alter tmkuu type MonthYear;
>> select * from foo;
>>
>> but got error on DELETE:
>>
>> ERROR:  invalid input syntax for integer: "x"
>>
>> How to apply this constraint to existing data ?
>
>
> Remove the data that doesn't conform to the constraint.
>
> You got the error because you're trying to cast SUBSTR(tmkuu, 4, 4) to an integer in your DELETE statement - but in
thecase of the second record, that expression cannot be an integer (because of the x) hence the error. 
You can use the regex I posted to get rid of the data easily, then go
back to the substr one for  a check constraint after that.
			
		Scott, >You can use the regex I posted to get rid of the data easily, then go >back to the substr one for a check constraint after that. regex is non-standard. How to implement this in standard SQL ? Andrus.
On Thursday 24 December 2009 3:47:23 pm Andrus wrote: > Scott, > > >You can use the regex I posted to get rid of the data easily, then go > >back to the substr one for a check constraint after that. > > regex is non-standard. > How to implement this in standard SQL ? > > Andrus. Why should it matter? The initial data clean up is a one time event. Once the fields are cleaned up the check constraint will keep them that way. -- Adrian Klaver aklaver@comcast.net
On Thu, Dec 24, 2009 at 4:47 PM, Andrus <kobruleht2@hot.ee> wrote: > Scott, > >> You can use the regex I posted to get rid of the data easily, then go >> back to the substr one for a check constraint after that. > > regex is non-standard. > How to implement this in standard SQL ? I take it you need a way to scrub your data in various databases, not just pgsql? I'm not sure there is a simple SQL standard way. It's likely that this one time job might require various non-standard ways of scrubbing your data this one time. You're gonna have to figure out how to make databases other than pgsql happy without me, the only one I'm even a little familiar with is Oracle, and my oracle-fu is rather rusty after a three year or so lay off from it.