Re: POSIX regex performance bug in 7.3 Vs. 7.2

From: Tatsuo Ishii
Subject: Re: POSIX regex performance bug in 7.3 Vs. 7.2
Date:
Msg-id: 20030205.113236.74754942.t-ishii@sra.co.jp
In reply to: Re: POSIX regex performance bug in 7.3 Vs. 7.2  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: POSIX regex performance bug in 7.3 Vs. 7.2  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

OK. The original complaint can be easily solved, at least for databases
with a single-byte encoding. With the small patch (against 7.3.1)
included below, I got the following results.

test1:
select count(*) from tenk1 where 'quotidian' ~ string4;
 count
-------
     0
(1 row)

Time: 113.81 ms

test2:
select count(*) from tenk1 where 'quotidian' ~ stringu1;
 count
-------
     0
(1 row)

Time: 419.36 ms

test3:
select count(*) from tenk1 where 'quotidian' ~* stringu1;
 count
-------
     0
(1 row)

Time: 1633.21 ms

The test3/test1 ratio is now 14.35. Although that is not as good as
Spencer's new code (test3/test1 = 9.75, according to Tom), it is much
better than the original 7.3 code (test3/test1 = 689.71).
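
For reference, the ratio arithmetic can be reproduced with a few lines
of C. This is only a sketch: the timings are the ones reported above,
while the function names are made up for illustration and are not part
of the patch.

```c
#include <stdio.h>

/* Timings in milliseconds, copied from the three test runs above. */
#define TEST1_MS 113.81   /* 'quotidian' ~  string4  */
#define TEST2_MS 419.36   /* 'quotidian' ~  stringu1 */
#define TEST3_MS 1633.21  /* 'quotidian' ~* stringu1 */

/* Hypothetical helper: run time of a test relative to the test1 baseline. */
double ratio_to_test1(double ms)
{
    return ms / TEST1_MS;
}

/* Hypothetical helper: print both ratios with two decimals. */
void print_ratios(void)
{
    printf("test2/test1 = %.2f\n", ratio_to_test1(TEST2_MS));
    printf("test3/test1 = %.2f\n", ratio_to_test1(TEST3_MS));
}
```

Calling print_ratios() from a main() reproduces the 14.35 figure for
test3/test1.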

Note that if the database encoding is not a single-byte one, it will
still be as slow as the original 7.3. There is no easy way to fix that.
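
The idea behind the patch can be sketched in isolation as follows. This
is a minimal illustration, not the backend code: choose_csetsize() and
catspace_bytes() are hypothetical helpers, the max_encoding_length
argument stands in for pg_database_encoding_max_length(), and both the
cat_t typedef and the 16-bit value used for NC are assumptions about
the 7.3 regex sources.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: one category byte per character, as in Spencer's regex code. */
typedef unsigned char cat_t;

/* Assumption: stand-in for NC in a multibyte-enabled build (16-bit range). */
#define NC_WIDE ((size_t) (SHRT_MAX - SHRT_MIN + 1))

/*
 * Hypothetical helper mirroring the conditional expression in the patch:
 * use the 8-bit character range when every character fits in one byte,
 * otherwise fall back to the wide character set size.
 */
size_t choose_csetsize(int max_encoding_length)
{
    if (max_encoding_length == 1)
        return (size_t) (SCHAR_MAX - SCHAR_MIN + 1);
    return NC_WIDE;
}

/*
 * Hypothetical helper: number of bytes the category table needs, i.e. the
 * amount cleared by the memset() on g->catspace for each compiled pattern.
 */
size_t catspace_bytes(size_t csetsize)
{
    return csetsize * sizeof(cat_t);
}
```

Under these assumptions a single-byte database works with a
256-entry table per compiled pattern instead of a 65536-entry one,
which is presumably where the speedup comes from; a multibyte database
takes the second branch and sees no change.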

P.S. With the patch applied, all the regression tests pass on my Linux
box.
--
Tatsuo Ishii

----------------------------------- cut here -------------------------------------
*** postgresql-7.3.1/src/backend/regex/regcomp.c.orig    2002-09-05 05:31:24.000000000 +0900
--- postgresql-7.3.1/src/backend/regex/regcomp.c    2003-02-05 10:05:03.000000000 +0900
***************
*** 178,183 ****
--- 178,186 ----
      int            i;
      size_t        len;
      pg_wchar   *wcp;
+     size_t    csetsize;
+ 
+     csetsize = (pg_database_encoding_max_length() == 1)?(SCHAR_MAX - SCHAR_MIN + 1):NC;
  
      if (cclasses == NULL)
          cclasses = cclass_init();
***************
*** 211,217 ****
  
      /* do the mallocs early so failure handling is easy */
      g = (struct re_guts *) malloc(sizeof(struct re_guts) +
!                                   (NC - 1) * sizeof(cat_t));
      if (g == NULL)
          return REG_ESPACE;
      p->ssize = len / (size_t) 2 *(size_t) 3 + (size_t) 1;        /* ugh */
--- 214,220 ----
  
      /* do the mallocs early so failure handling is easy */
      g = (struct re_guts *) malloc(sizeof(struct re_guts) +
!                                   (csetsize - 1) * sizeof(cat_t));
      if (g == NULL)
          return REG_ESPACE;
      p->ssize = len / (size_t) 2 *(size_t) 3 + (size_t) 1;        /* ugh */
***************
*** 235,241 ****
          p->pbegin[i] = 0;
          p->pend[i] = 0;
      }
!     g->csetsize = NC;
      g->sets = NULL;
      g->setbits = NULL;
      g->ncsets = 0;
--- 238,244 ----
          p->pbegin[i] = 0;
          p->pend[i] = 0;
      }
!     g->csetsize = csetsize;
      g->sets = NULL;
      g->setbits = NULL;
      g->ncsets = 0;
***************
*** 248,254 ****
      g->nsub = 0;
      g->ncategories = 1;            /* category 0 is "everything else" */
      g->categories = &g->catspace[-(CHAR_MIN)];
!     memset((char *) g->catspace, 0, NC * sizeof(cat_t));
      g->backrefs = 0;
  
      /* do it */
--- 251,257 ----
      g->nsub = 0;
      g->ncategories = 1;            /* category 0 is "everything else" */
      g->categories = &g->catspace[-(CHAR_MIN)];
!     memset((char *) g->catspace, 0, csetsize * sizeof(cat_t));
      g->backrefs = 0;
  
      /* do it */

