Re: How to clean up phone-numbers with regex?

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: How to clean up phone-numbers with regex?
Дата
Msg-id 537A26B5.3030309@pinpointresearch.com
обсуждение исходный текст
Ответ на How to clean up phone-numbers with regex?  (Andreas <maps.on@gmx.net>)
Ответы Re: How to clean up phone-numbers with regex?
Список pgsql-sql
On 05/19/2014 01:54 AM, Andreas wrote:
Hi

I need to clean up phone-numbers. Somehow I got a Excel list that has weird graphical characters trailing some of the entries.

My DB is UTF8 so it would store this mess but I don't like to import it in the first place.

OK, I know how to read the stuff into a temporary table to clean it up before the actual import.
How can I do an update on the column that deletes every char that is not in a given set of chars like '+- 0123456/()'?


See: http://www.postgresql.org/docs/current/static/functions-matching.html

For the first case, the regexp_replace function is probably your best bet. But note that, depending on the quality of your input, just removing characters outside that range may still not yield the desired result.

select regexp_replace('(12s3)-456-635/6(a+sdk', '[^0-9()+-/]', '', 'g');
  regexp_replace  
-------------------
 (123)-456-635/6(+

You can remove all formatting by requiring only digits then check and/or reformat later as desired.
steve=> select regexp_replace('(12s3)-456-6356(a+sdk', '[^0-9]', '', 'g');
 regexp_replace
----------------
 1234566356


Second but similar question:
How can I select records that have fields that contain characters not included in a given alphabet?
E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?

See regexp_match on the above-referenced page.

Cheers,
Steve

В списке pgsql-sql по дате отправления: