Обсуждение: Normalization and regexp
Hi,
Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with
anythingoutside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other*
browsers,myself. Part of this task requires that I set up tables in postgresql that match postal codes to
transportationcost.
Canada Post provides a booklet for calculating transportation cost based on package weight and the first three
charactersof the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as
thepostal code, I take the first 3 characters G8F and look them up in table 1.
Table 1
PostalCode Tarrif number
---------------------------
G4V 14
G8E-G8G 14
G4R-G4S 13
Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2
Table 2
For tarrif Code 14
Weight(kg) Price
----------------------
1.0 5.37
1.5 5.61
2.0 5.82
If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3
charactersof the postal code.
To render this in the database, I have done the following:
_____________________________
p_code |
=============================
pcode_id | tarrif |
-----------------------------
G4V | 14 |
-----------------------------
G8E | 14 |
-----------------------------
G8F | 14 |
-----------------------------
G8G | 14 |
-----------------------------
G4R | 13 |
-----------------------------
G4S | 13 |
-----------------------------
__________________________________
price_weight |
==================================
tarrif | weight(kg)| price |
----------------------------------
14 | 1.0 | 5.37 |
----------------------------------
14 | 1.5 | 5.61 |
----------------------------------
14 | 2.0 | 5.82 |
----------------------------------
13 | 1.0 | 5.20 |
----------------------------------
13 | 1.5 | 5.32 |
----------------------------------
13 | 2.0 | 5.42 |
Therefore my sql statement would look something like this:
SELECT price FROM price_weight
WHERE p_code.tarrif = price_weight.tarrif
AND pcode = 'G8F'
AND weight = '1.0';
I think this will work, but before I proceed, I'd like to ask 2 questions:
1.
Is it possible to further normalize the data in the p_code and price_weight tables above?
2.
Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to
avoidrepetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows
assuch:
G4V 14
G8E 14
G8F 14
G8G 14
I could do something like this:
(G4V | G8[E-G]) 14
Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have
toenter, since there's a lot of them.
Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better
placeto ask them, please point me in the right direction.
Thanks,
Mark
In order to reduce the number of rows in the p_code table, you could
modify it so that it had the following columns:
create table p_code (
pCode_Start varchar(3),
pCode_End varchar(3),
tariff integer,
primary key(pCode_Start, pCode_End)
);
If you need to insert a single code, use the following:
insert into pCode('G4V','G4V',14);
and a range as follows:
insert into pCode('G8E', 'G8G',14);
Your select statement is now:
SELECT price FROM price_weight pw, p_code pc
WHERE pc.tarrif = pw.tarrif
AND pCode_Start <= 'G8F'
AND pCode_End >= 'G8F'
AND weight = '1.0';
I think that should work OK...
John Sidney-Woollett
MT said:
> Hi,
>
> Since Canada Post hasn't programmed their automated web tools for
> calculating transportation costs to work with anything outside of Windows
> Internet Explorer, I'm obliged to write a web based program optimized for
> all those *other* browsers, myself. Part of this task requires that I set
> up tables in postgresql that match postal codes to transportation cost.
>
> Canada Post provides a booklet for calculating transportation cost based
> on package weight and the first three characters of the postal code
> sequence. For instance, if I want to send a package to an address that
> includes G8F 1X1 as the postal code, I take the first 3 characters G8F and
> look them up in table 1.
>
> Table 1
>
> PostalCode Tarrif number
> ---------------------------
> G4V 14
> G8E-G8G 14
> G4R-G4S 13
>
> Since G8F falls in the G8E-G8G range, I now know that the tarrif number is
> 14. Taking the number 14, I go to table 2
>
> Table 2
>
> For tarrif Code 14
> Weight(kg) Price
> ----------------------
> 1.0 5.37
> 1.5 5.61
> 2.0 5.82
>
> If the weight of my package is 1kg, the price is 5.37 to send the package
> to the address bearing G8F as the first 3 characters of the postal code.
>
> To render this in the database, I have done the following:
>
> _____________________________
> p_code |
> =============================
> pcode_id | tarrif |
> -----------------------------
> G4V | 14 |
> -----------------------------
> G8E | 14 |
> -----------------------------
> G8F | 14 |
> -----------------------------
> G8G | 14 |
> -----------------------------
> G4R | 13 |
> -----------------------------
> G4S | 13 |
> -----------------------------
>
> __________________________________
> price_weight |
> ==================================
> tarrif | weight(kg)| price |
> ----------------------------------
> 14 | 1.0 | 5.37 |
> ----------------------------------
> 14 | 1.5 | 5.61 |
> ----------------------------------
> 14 | 2.0 | 5.82 |
> ----------------------------------
> 13 | 1.0 | 5.20 |
> ----------------------------------
> 13 | 1.5 | 5.32 |
> ----------------------------------
> 13 | 2.0 | 5.42 |
>
>
> Therefore my sql statement would look something like this:
>
> SELECT price FROM price_weight
> WHERE p_code.tarrif = price_weight.tarrif
> AND pcode = 'G8F'
> AND weight = '1.0';
>
> I think this will work, but before I proceed, I'd like to ask 2 questions:
>
> 1.
> Is it possible to further normalize the data in the p_code and
> price_weight tables above?
>
> 2.
> Is it possible to abbreviate the number of records in the p_code table
> using regular expressions. For instance, to avoid repetition, I thought
> I'd use regular expressions, so that instead of entering the postal code
> into separate rows as such:
>
> G4V 14
> G8E 14
> G8F 14
> G8G 14
>
> I could do something like this:
>
> (G4V | G8[E-G]) 14
>
> Somehow I don't think this is possible, but I'm looking for any way to
> minimize the number of postal codes that I have to enter, since there's a
> lot of them.
>
> Anyway, I realize these questions may have more to do with database design
> than postgresql per se. If there's a better place to ask them, please
> point me in the right direction.
>
> Thanks,
>
> Mark
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>
There's no reason you couldn't store a regex string into a column. The trick has only to do with proper escaping for the various steps, namely the loading method (regular SQL has some escaping requirements and so you need double back-slashes and escaped quotes). A work-around for this is to use other input methods like COPY. Then there's the escaping required for the regex tool of choice: either client-side or server(db)-side. If you're doing the regex processing client-side this might be easier. Depends on usage. In short, the principal of what you propose seems completely sound. However, there might be tricks to handling character escaping correctly. == Ezra Epstein "MT" <m_tessier@sympatico.ca> wrote in message news:20031222181239.476cfa47.m_tessier@sympatico.ca... > Hi, > > Since Canada Post hasn't programmed their automated web tools for calculating transportation costs to work with anything outside of Windows Internet Explorer, I'm obliged to write a web based program optimized for all those *other* browsers, myself. Part of this task requires that I set up tables in postgresql that match postal codes to transportation cost. > > Canada Post provides a booklet for calculating transportation cost based on package weight and the first three characters of the postal code sequence. For instance, if I want to send a package to an address that includes G8F 1X1 as the postal code, I take the first 3 characters G8F and look them up in table 1. > > Table 1 > > PostalCode Tarrif number > --------------------------- > G4V 14 > G8E-G8G 14 > G4R-G4S 13 > > Since G8F falls in the G8E-G8G range, I now know that the tarrif number is 14. Taking the number 14, I go to table 2 > > Table 2 > > For tarrif Code 14 > Weight(kg) Price > ---------------------- > 1.0 5.37 > 1.5 5.61 > 2.0 5.82 > > If the weight of my package is 1kg, the price is 5.37 to send the package to the address bearing G8F as the first 3 characters of the postal code. > > To render this in the database, I have done the following: > > _____________________________ > p_code | > ============================= > pcode_id | tarrif | > ----------------------------- > G4V | 14 | > ----------------------------- > G8E | 14 | > ----------------------------- > G8F | 14 | > ----------------------------- > G8G | 14 | > ----------------------------- > G4R | 13 | > ----------------------------- > G4S | 13 | > ----------------------------- > > __________________________________ > price_weight | > ================================== > tarrif | weight(kg)| price | > ---------------------------------- > 14 | 1.0 | 5.37 | > ---------------------------------- > 14 | 1.5 | 5.61 | > ---------------------------------- > 14 | 2.0 | 5.82 | > ---------------------------------- > 13 | 1.0 | 5.20 | > ---------------------------------- > 13 | 1.5 | 5.32 | > ---------------------------------- > 13 | 2.0 | 5.42 | > > > Therefore my sql statement would look something like this: > > SELECT price FROM price_weight > WHERE p_code.tarrif = price_weight.tarrif > AND pcode = 'G8F' > AND weight = '1.0'; > > I think this will work, but before I proceed, I'd like to ask 2 questions: > > 1. > Is it possible to further normalize the data in the p_code and price_weight tables above? > > 2. > Is it possible to abbreviate the number of records in the p_code table using regular expressions. For instance, to avoid repetition, I thought I'd use regular expressions, so that instead of entering the postal code into separate rows as such: > > G4V 14 > G8E 14 > G8F 14 > G8G 14 > > I could do something like this: > > (G4V | G8[E-G]) 14 > > Somehow I don't think this is possible, but I'm looking for any way to minimize the number of postal codes that I have to enter, since there's a lot of them. > > Anyway, I realize these questions may have more to do with database design than postgresql per se. If there's a better place to ask them, please point me in the right direction. > > Thanks, > > Mark > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >