Estimation problem with a LIKE clause containing a /

Поиск

Список

Период

Сортировка

От	Guillaume Smet
Тема	Estimation problem with a LIKE clause containing a /
Дата	7 ноября 2007 г. 08:53:23
Msg-id	1d4e0c10711070453q214f089cpd144cbb5193d3c4f@mail.gmail.com обсуждение исходный текст
Ответы	Re: Estimation problem with a LIKE clause containing a / Re: Estimation problem with a LIKE clause containing a /
Список	pgsql-performance

Дерево обсуждения

Hi all,

While studying a query taking forever after an ANALYZE on a never
analyzed database (a bad estimate causes a nested loop on a lot of
tuples), I found the following problem:
- without any stats (I removed the line from pg_statistic):
ccm_prod_20071106=# explain analyze select * from cms_items where
ancestors LIKE '1062/%';
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on cms_items  (cost=0.00..689.26 rows=114 width=587) (actual
time=0.008..21.692 rows=11326 loops=1)
   Filter: ((ancestors)::text ~~ '1062/%'::text)
 Total runtime: 31.097 ms
-> the estimate is bad (it's expected) but it's sufficient to prevent
the nested loop so it's my current workaround

- after analyzing the cms_items table (statistics is set to 10 but
it's exactly the same for 100):
ccm_prod_20071106=# explain analyze select * from cms_items where
ancestors LIKE '1062/%';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on cms_items  (cost=0.00..689.26 rows=*1* width=103) (actual
time=0.010..22.024 rows=11326 loops=1)
   Filter: ((ancestors)::text ~~ '1062/%'::text)
 Total runtime: 31.341 ms
-> this estimate leads PostgreSQL to choose a nested loop which is
executed more than 11k times and causes the query to take forever.

- if I remove the / from the LIKE clause (which I can't as ancestors
is more or less a path):
ccm_prod_20071106=# explain analyze select * from cms_items where
ancestors LIKE '1062%';
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on cms_items  (cost=0.00..689.26 rows=*9097* width=103)
(actual time=0.043..25.251 rows=11326 loops=1)
   Filter: ((ancestors)::text ~~ '1062%'::text)
 Total runtime: 34.778 ms

Which is a really good estimate.

Is it something expected?

The histogram does contain values beginning with '1062/' (5 out of 10)
and the cms_items table has ~ 22k rows.

Version is PostgreSQL 8.1.8 on i686-redhat-linux-gnu, compiled by GCC
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-3). I checked the release
notes between 8.1.8 and 8.1.10 and I didn't find anything relevant to
fix this problem.

Thanks for any help.

Regards,

--
Guillaume

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Estimation problem with a LIKE clause containing a /