Обсуждение: Unexpected Results from regexp_replace

Поиск
Список
Период
Сортировка

Unexpected Results from regexp_replace

От
Susan Hurst
Дата:
I would like to extract only letters and numbers [0-9a-z] from an input
string.  Using regexp_replace in postgres, I get unexpected results.
The pattern matching seems to stop before the end of the string value is
reached.  Some of the values that should have been excluded are still
there.  What is the trick for removing all instances of unwanted
characters and spaces?  I tried regexp_matches but the results were no
better.

In Oracle, the same code produces the expected results.  Below is the
code and the output from postgres and from oracle, for your reference.
Thanks for your help!

Sue

-----------------------------------------------------------------------
POSTGRESQL:  Incorrect results...left over unwanted characters
create table test_str (greeting varchar(256),v_greeting varchar(256));

select * from test_str;

insert into test_str (greeting) values ('Hello  Worldy');
insert into test_str (greeting) values ('Hello, World!');
insert into test_str (greeting) values ('hello_world');
insert into test_str (greeting) values ('Hello! World.');
insert into test_str (greeting) values ('hello - world');
insert into test_str (greeting) values ('hello_world n');
insert into test_str (greeting) values ('hello world n n');

select greeting,regexp_replace (lower(greeting),'[^0-9a-z]{1,}','') str
   from test_str
;

greeting    str
Hello  Worldy    helloworldy
Hello, World!    helloworld!
hello_world    helloworld
Hello! World.    helloworld.
hello - world    helloworld
hello_world n    helloworld n
hello world n n    helloworld n n

------------------------------------------------------------------------
ORACLE: Correct results...all unwanted characters are gone
create table test_str (greeting varchar2(256),v_greeting varchar2(256));

select * from test_str;

insert into test_str (greeting) values ('Hello  Worldy');
insert into test_str (greeting) values ('Hello, World!');
insert into test_str (greeting) values ('hello_world');
insert into test_str (greeting) values ('Hello! World.');
insert into test_str (greeting) values ('hello - world');
insert into test_str (greeting) values ('hello_world n');
insert into test_str (greeting) values ('hello world n n');

select greeting,regexp_replace (lower(greeting),'[^0-9a-z]{1,}','') str
   from test_str
;
GREETING    STR
Hello  Worldy    helloworldy
Hello, World!    helloworld
hello_world    helloworld
Hello! World.    helloworld
hello - world    helloworld
hello_world n    helloworldn
hello world n n    helloworldnn

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Susan E Hurst
Principal Consultant
Email: susan.hurst@brookhurstdata.com
Mobile: 314-486-3261


Re: Unexpected Results from regexp_replace

От
"David G. Johnston"
Дата:
On Tue, Nov 10, 2015 at 1:12 PM, Susan Hurst <susan.hurst@brookhurstdata.com> wrote:
What is the trick for removing all instances of unwanted characters and spaces?

​(g)lobal flag.

SELECT regexp_replace('hello - world n', '[^0-9a-z]+', '', 'g')
​;

David J.​


Re: Unexpected Results from regexp_replace

От
Susan Hurst
Дата:
Excellent!  I tried the global flag before but I had it in the wrong
place.

Thanks for the quick service, David!

---
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Susan E Hurst
Principal Consultant
Email: susan.hurst@brookhurstdata.com
Mobile: 314-486-3261

On 2015-11-10 14:19, David G. Johnston wrote:
> On Tue, Nov 10, 2015 at 1:12 PM, Susan Hurst
> <susan.hurst@brookhurstdata.com> wrote:
>
>> What is the trick for removing all instances of unwanted characters
>> and spaces?
>
> ​(g)lobal flag.
>
> SELECT regexp_replace('hello - world n', '[^0-9a-z]+', '', 'g')
> ​;
>
> David J.​