Estimation question...

Поиск
Список
Период
Сортировка
От Matt Daw
Тема Estimation question...
Дата
Msg-id CAA2LLOEy2n6swjgBRGp2DZvVtMg4AGYSyA2FKFd6FKxscv5rSQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Estimation question...  (Matt Daw <matt@shotgunsoftware.com>)
Список pgsql-performance
Howdy, the query generator in my app sometimes creates redundant
filters of the form:

project_id IN ( <list of projects user has permission to see> ) AND
project_id = <single project user is looking at >

... and this is leading to a bad estimate (and thus a bad plan) on a
few complex queries. I've included simplified examples below. This
server is running 9.0.10 and the statistics target has been updated to
1000 on the project_id column. I've also loaded the one table into a
9.2.2 instance and replicated the behaviour.

I can change how the query is being generated, but I'm curious why I'm
getting a bad estimate. Is this an expected result?

Thanks!

Matt

=============

1) Filter on project_id only, row estimate for Bitmap Index Scan quite good.

explain (analyze,buffers) select count(id) from versions WHERE project_id=115;

 QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1218111.01..1218111.02 rows=1 width=4) (actual
time=1531.341..1531.342 rows=1 loops=1)
   Buffers: shared hit=452619
   ->  Bitmap Heap Scan on versions  (cost=34245.06..1215254.86
rows=1142461 width=4) (actual time=148.394..1453.383 rows=1114197
loops=1)
         Recheck Cond: (project_id = 115)
         Buffers: shared hit=452619
         ->  Bitmap Index Scan on versions_project_id
(cost=0.00..33959.45 rows=1142461 width=0) (actual
time=139.709..139.709 rows=1116037 loops=1)
               Index Cond: (project_id = 115)
               Buffers: shared hit=22077
 Total runtime: 1531.399 ms

2) Filter on project_id IN () AND project_id. Row estimate is ~10x lower.

explain (analyze,buffers) select count(id) from versions WHERE
project_id IN (80,115) AND project_id=115;;

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=327066.18..327066.19 rows=1 width=4) (actual
time=1637.889..1637.889 rows=1 loops=1)
   Buffers: shared hit=458389
   ->  Bitmap Heap Scan on versions  (cost=3546.56..326793.17
rows=109201 width=4) (actual time=155.107..1557.453 rows=1114180
loops=1)
         Recheck Cond: ((project_id = ANY ('{80,115}'::integer[])) AND
(project_id = 115))
         Buffers: shared hit=458389
         ->  Bitmap Index Scan on versions_project_id
(cost=0.00..3519.26 rows=109201 width=0) (actual time=145.502..145.502
rows=1125436 loops=1)
               Index Cond: ((project_id = ANY ('{80,115}'::integer[]))
AND (project_id = 115))
               Buffers: shared hit=22076
 Total runtime: 1637.941 ms


В списке pgsql-performance по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: Are bitmap index scans slow to start?
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: High CPU usage / load average after upgrading to Ubuntu 12.04