Re: How to clean up phone-numbers with regex?
| От | Steve Crawford |
|---|---|
| Тема | Re: How to clean up phone-numbers with regex? |
| Дата | |
| Msg-id | 537A26B5.3030309@pinpointresearch.com обсуждение исходный текст |
| Ответ на | How to clean up phone-numbers with regex? (Andreas <maps.on@gmx.net>) |
| Ответы |
Re: How to clean up phone-numbers with regex?
|
| Список | pgsql-sql |
On 05/19/2014 01:54 AM, Andreas wrote:
Hi
I need to clean up phone-numbers. Somehow I got a Excel list that has weird graphical characters trailing some of the entries.
My DB is UTF8 so it would store this mess but I don't like to import it in the first place.
OK, I know how to read the stuff into a temporary table to clean it up before the actual import.
How can I do an update on the column that deletes every char that is not in a given set of chars like '+- 0123456/()'?
See: http://www.postgresql.org/docs/current/static/functions-matching.html
For the first case, the regexp_replace function is probably your best bet. But note that, depending on the quality of your input, just removing characters outside that range may still not yield the desired result.
select regexp_replace('(12s3)-456-635/6(a+sdk', '[^0-9()+-/]', '', 'g');
regexp_replace
-------------------
(123)-456-635/6(+
You can remove all formatting by requiring only digits then check and/or reformat later as desired.
steve=> select regexp_replace('(12s3)-456-6356(a+sdk', '[^0-9]', '', 'g');
regexp_replace
----------------
1234566356
See regexp_match on the above-referenced page.
Second but similar question:
How can I select records that have fields that contain characters not included in a given alphabet?
E.G. find fields that contain some char not in 0-9,a-z,A-Z, +-()/?
Cheers,
Steve
В списке pgsql-sql по дате отправления: