Обсуждение: How to clean up phone-numbers with regex?

Поиск
Список
Период
Сортировка

How to clean up phone-numbers with regex?

От
Andreas
Дата:
<font face="Arial">Hi<br /><br /> I need to clean up phone-numbers. Somehow I got a Excel list that has weird graphical
characterstrailing some of the entries.</font><br /><font face="Arial"><font face="Arial">My DB is UTF8 so it would
storethis mess but I don<small>'t like to import it in the first place.<br /><br /> OK, I know how to read the stuff
intoa temporary table to clean it up before the actual import.<br /></small></font>How can I do an update on the column
thatdeletes every char that is not in a given set of chars like '+- 0123456/()'?<br /><br /><small></small><br />
Secondbut similar question:<br /> How can I select records that have fields that contain characters not included in a
givenalphabet?<br /> E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?<br /><br /><br /> regards<br />
Andreas<br/></font> 

Re: How to clean up phone-numbers with regex?

От
Rob Sargent
Дата:
Have you looked into regular expressions?

Sent from my iPhone

On May 19, 2014, at 2:54 AM, Andreas <maps.on@gmx.net> wrote:

Hi

I need to clean up phone-numbers. Somehow I got a Excel list that has weird graphical characters trailing some of the entries.

My DB is UTF8 so it would store this mess but I don't like to import it in the first place.

OK, I know how to read the stuff into a temporary table to clean it up before the actual import.
How can I do an update on the column that deletes every char that is not in a given set of chars like '+- 0123456/()'?


Second but similar question:
How can I select records that have fields that contain characters not included in a given alphabet?
E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?


regards
Andreas

Re: How to clean up phone-numbers with regex?

От
Steve Crawford
Дата:
<div class="moz-cite-prefix">On 05/19/2014 01:54 AM, Andreas wrote:<br /></div><blockquote
cite="mid:5379C6C1.5020904@gmx.net"type="cite"><font face="Arial">Hi<br /><br /> I need to clean up phone-numbers.
SomehowI got a Excel list that has weird graphical characters trailing some of the entries.</font><br /><font
face="Arial"><fontface="Arial">My DB is UTF8 so it would store this mess but I don<small>'t like to import it in the
firstplace.<br /><br /> OK, I know how to read the stuff into a temporary table to clean it up before the actual
import.<br/></small></font>How can I do an update on the column that deletes every char that is not in a given set of
charslike '+- 0123456/()'?<br /><br /></font></blockquote><br /> See: <a class="moz-txt-link-freetext"
href="http://www.postgresql.org/docs/current/static/functions-matching.html">http://www.postgresql.org/docs/current/static/functions-matching.html</a><br
/><br/> For the first case, the regexp_replace function is probably your best bet. But note that, depending on the
qualityof your input, just removing characters outside that range may still not yield the desired result.<br /><br />
selectregexp_replace('(12s3)-456-635/6(a+sdk', '[^0-9()+-/]', '', 'g');<br />   regexp_replace   <br />
-------------------<br/>  (123)-456-635/6(+<br /><br /> You can remove all formatting by requiring only digits then
checkand/or reformat later as desired.<br /> steve=> select regexp_replace('(12s3)-456-6356(a+sdk', '[^0-9]', '',
'g');<br/>  regexp_replace <br /> ----------------<br />  1234566356<br /><br /><blockquote
cite="mid:5379C6C1.5020904@gmx.net"type="cite"><font face="Arial"><br /> Second but similar question:<br /> How can I
selectrecords that have fields that contain characters not included in a given alphabet?<br /> E.G. find fields that
containsome char not in 0-9,a-z,A-Z, +-()/?<br /></font><br /></blockquote> See regexp_match on the above-referenced
page.<br/><br /> Cheers,<br /> Steve<br /><br /> 

Re: How to clean up phone-numbers with regex?

От
Steve Crawford
Дата:
On 05/19/2014 06:39 AM, Rob Sargent wrote:
Have you looked into regular expressions?

I think his subject line answered that question...

Cheers,
Steve

Re: How to clean up phone-numbers with regex?

От
Rob Sargent
Дата:
On 05/19/2014 09:44 AM, Steve Crawford wrote:
On 05/19/2014 06:39 AM, Rob Sargent wrote:
Have you looked into regular expressions?

I think his subject line answered that question...

Cheers,
Steve

You're right.  Apologies.

Re: How to clean up phone-numbers with regex?

От
David G Johnston
Дата:
Steve Crawford wrote
> On 05/19/2014 01:54 AM, Andreas wrote:
> 
>>
>> Second but similar question:
>> How can I select records that have fields that contain characters not 
>> included in a given alphabet?
>> E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?
>>
> See regexp_match on the above-referenced page.
> 
> Cheers,
> Steve

Actually, section "9.7.3. POSIX Regular Expressions" - specifically table
9-11 at the beginning of that section - is the most common way to perform
the tests in a where clause.  regexp_matches(...) is for when you want to
extract data.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/How-to-clean-up-phone-numbers-with-regex-tp5804450p5804493.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.