Re: Regex performance issue

Поиск
Список
Период
Сортировка
От Oleg Bartunov
Тема Re: Regex performance issue
Дата
Msg-id Pine.LNX.4.64.0612022353190.16338@sn.sai.msu.ru
обсуждение исходный текст
Ответ на Re: Regex performance issue  ("Alexandru Coseru" <alexandru.coseru@totaltelecom.ro>)
Список pgsql-performance
I may miss something but I'd use tsearch2. Check
intdict dictionary for basic idea -  http://www.sai.msu.su/~megera/wiki/Gendict

Oleg
On Sat, 2 Dec 2006, Alexandru Coseru wrote:

> Hello...
>
> I cannot use LIKE , because the order of the match is reversed.
> The prefix column is containing telephone destinations.
> IE:    ^001  - US  , ^0039 Italy , etc..
>
> Here is a small sample:
>
> asterisk=> select * from destlist LIMIT 10;
> id | id_ent | dir |   prefix   |   country   |      network       | tip
> ----+--------+-----+------------+-------------+--------------------+-----
> 1 |     -1 |   0 | (^0093)    | AFGHANISTAN | AFGHANISTAN        |   6
> 2 |     -1 |   0 | (^00937)   | AFGHANISTAN | AFGHANISTAN Mobile |   5
> 3 |     -1 |   0 | (^00355)   | ALBANIA     | ALBANIA            |   6
> 4 |     -1 |   0 | (^0035538) | ALBANIA     | ALBANIA Mobile     |   5
> 5 |     -1 |   0 | (^0035568) | ALBANIA     | ALBANIA Mobile     |   5
> 6 |     -1 |   0 | (^0035569) | ALBANIA     | ALBANIA Mobile     |   5
> 7 |     -1 |   0 | (^00213)   | ALGERIA     | ALGERIA            |   6
> 8 |     -1 |   0 | (^0021361) | ALGERIA     | ALGERIA Mobile     |   5
> 9 |     -1 |   0 | (^0021362) | ALGERIA     | ALGERIA Mobile     |   5
> 10 |     -1 |   0 | (^0021363) | ALGERIA     | ALGERIA Mobile     |   5
>
>
> Now , I have to match a dialednumber   (let's say   00213618833) and find
> it's destination...(It's algeria mobile).
> I tried to make with a query of using LIKE , but i was not able to get
> something..
>
>
> Regards
>   Alex
>
>
>
>
>
> ----- Original Message ----- From: "Dave Dutcher" <dave@tridecap.com>
> To: "'Alexandru Coseru'" <alexandru.coseru@totaltelecom.ro>;
> <pgsql-performance@postgresql.org>
> Sent: Saturday, December 02, 2006 10:36 PM
> Subject: RE: [PERFORM] Regex performance issue
>
>
>> -----Original Message-----
>> From: pgsql-performance-owner@postgresql.org On Behalf Of Alexandru Coseru
>> asterisk=> explain analyze SELECT * FROM destlist WHERE
>> '0039051248787' ~
>> prefix AND id_ent='-2' AND dir=0 ORDER by length(prefix) DESC;
>>
>>
>> QUERY PLAN
>> --------------------------------------------------------------
>> ----------------------------------------------------------------------
>>  Sort  (cost=7925.07..7925.15 rows=31 width=67) (actual
>> time=857.715..857.716 rows=2 loops=1)
>>    Sort Key: length((prefix)::text)
>>    ->  Bitmap Heap Scan on destlist  (cost=60.16..7924.30
>> rows=31 width=67)
>> (actual time=2.156..857.686 rows=2 loops=1)
>>          Recheck Cond: ((id_ent = -2) AND (dir = 0))
>>          Filter: ('0039051248787'::text ~ (prefix)::text)
>>          ->  Bitmap Index Scan on destlist_indx2  (cost=0.00..60.16
>> rows=6193 width=0) (actual time=1.961..1.961 rows=5205 loops=1)
>>                Index Cond: ((id_ent = -2) AND (dir = 0))
>>  Total runtime: 857.804 ms
>> (8 rows)
>>
>>
>>     "mmumu" btree (prefix varchar_pattern_ops)
>>
>
> I'm surpised Postgres isn't using the index on prefix seeing as the index
> uses the varchar_pattern_ops operator class.  It could be that the index
> isn't selective enough, or is Postgres not able to use an index with Posix
> regular expressions?  The docs seem to say that it can, but I'd be curious
> to see what happens if you use LIKE instead of ~.
>
> Dave
>
>
>
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Alexandru Coseru"
Дата:
Сообщение: Re: Regex performance issue
Следующее
От: "Heikki Linnakangas"
Дата:
Сообщение: Re: Regex performance issue