Planner doesn't take indexes into account

Поиск
Список
Период
Сортировка
От Grzegorz Olszewski
Тема Planner doesn't take indexes into account
Дата
Msg-id DUB126-W4724F21FDFBF1316E4FF90913A0@phx.gbl
обсуждение исходный текст
Ответы Re: Planner doesn't take indexes into account
Список pgsql-performance
Hi,

I wonder why planner uses Seq Scan instead of Index Scan.

Here is my table (partial):
content.contents
-------------------------+-----------------------------+-----------------------------------------------------------------
 id                      | bigint                      | niepusty domyślnie nextval('content.contents_id_seq'::regclass)
 version                 | integer                     | niepusty
 date_published          | timestamp without time zone |
 moderation_status       | character varying(50)       |
 publication_status      | character varying(30)       |

And indexes (there are some other indexes too):
    "contents_id_pkey" PRIMARY KEY, btree (id)
    "contents_date_published_idx" btree (date_published)
    "contents_moderation_status_idx" btree (moderation_status)
    "contents_publication_status_idx" btree (publication_status)

I tried also creating following indexes:
    "contents_date_published_publication_status_moderation_statu_idx" btree (date_published, publication_status, moderation_status)
    "contents_publication_status_idx1" btree ((publication_status::text))
    "contents_moderation_status_idx1" btree ((moderation_status::text))

Then for this query (genrated by Hibernate):
explain (analyze, buffers) select count(*) as y0_ from content.contents this_ inner join content.content_categories cat1_ on this_.CONTENT_CATEGORY_ID=cat1_.ID where cat1_.name in ([...])
and this_.date_published<='2014-05-26 12:23:31.557000 +02:00:00'
and (this_.PUBLICATION_STATUS is null or this_.PUBLICATION_STATUS<>'DRAFT')
and (this_.moderation_status is null or this_.moderation_status<>'DANGEROUS')
and exists(select * from content.content_visibilities cv where cv.content_id = this_.ID and cv.user_group_id in (1,2));

Planner creates such plan:
                                                                                                                                  QUERY PLAN                                                                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=31706.84..106020.81 rows=21871 width=2076) (actual time=1197.658..6012.406 rows=430218 loops=1)
   Hash Cond: (this_.id = cv.content_id)
   Buffers: shared hit=5 read=59031 written=3, temp read=47611 written=47549
   ->  Hash Join  (cost=2.22..56618.11 rows=22881 width=2076) (actual time=0.163..1977.304 rows=430221 loops=1)
         Hash Cond: (this_.content_category_id = cat1_.id)
         Buffers: shared hit=1 read=46829 written=1
         ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
               Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))
               Rows Removed by Filter: 50
               Buffers: shared read=46829 written=1
         ->  Hash  (cost=2.17..2.17 rows=4 width=46) (actual time=0.089..0.089 rows=4 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               Buffers: shared hit=1
               ->  Seq Scan on content_categories cat1_  (cost=0.00..2.17 rows=4 width=46) (actual time=0.053..0.076 rows=4 loops=1)
                     Filter: ((name)::text = ANY ('{przeglad-prasy/rp,przeglad-prasy/parkiet,komunikat-z-rynku-pap-emitent,komunikat-z-rynku-pap-depesze}'::text[]))
                     Rows Removed by Filter: 74
                     Buffers: shared hit=1
   ->  Hash  (cost=24435.09..24435.09 rows=443083 width=8) (actual time=1197.146..1197.146 rows=447624 loops=1)
         Buckets: 4096  Batches: 32  Memory Usage: 560kB
         Buffers: shared hit=4 read=12202 written=2, temp written=1467
         ->  Bitmap Heap Scan on content_visibilities cv  (cost=7614.55..24435.09 rows=443083 width=8) (actual time=61.034..647.729 rows=447624 loops=1)
               Recheck Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
               Buffers: shared hit=4 read=12202 written=2
               ->  Bitmap Index Scan on content_visibilities_user_group_id_idx  (cost=0.00..7503.78 rows=443083 width=0) (actual time=58.680..58.680 rows=447626 loops=1)
                     Index Cond: (user_group_id = ANY ('{1,2}'::bigint[]))
                     Buffers: shared hit=3 read=1226
 Total runtime: 6364.689 ms
(27 wierszy)

The suspicious part is:
         ->  Seq Scan on contents this_  (cost=0.00..54713.92 rows=446176 width=2030) (actual time=0.048..915.724 rows=450517 loops=1)
               Filter: ((date_published <= '2014-05-26 12:23:31.557'::timestamp without time zone) AND ((publication_status IS NULL) OR ((publication_status)::text <> 'DRAFT'::text)) AND ((moderation_status IS NULL) OR ((moderation_status)::text <> 'DANGEROUS'::text)))

I don't understand why planner doesn't use indexes. The problem is there are about 0.5M rows satisfying condition (almost every row in the table). Could you please explain this behavior?

I'm using PostgreSQL 9.2.8 on Ubuntu 12.04 LTS x86_64

Best regards,
Grzegorz Olszewski

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: NFS, file system cache and shared_buffers
Следующее
От: AI Rumman
Дата:
Сообщение: Re: Planner doesn't take indexes into account