Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete

Поиск
Список
Период
Сортировка
От Marcin Sieńko
Тема Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete
Дата
Msg-id CAButoGGBgzkeOw8wm53XZuLH_VPKEhtyn+hbLsdKTEh7TD9tRA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete  (Feike Steenbergen <feikesteenbergen@gmail.com>)
Список pgsql-bugs
Hi,

Thanks for reply.

First about questions:
- Yes I'have ssd disk but my co-worker has the same base on traditional disk and behaviour is similar (but 2 times slower).
- It is always slow regardless of repetitions and disks.
- I think default. I didn't change it.  Yes it is 4.0.
- After setting random_page_cost to 1.5 - no difference.
- There are result with buffers on:

sequence scan on:

"Aggregate  (cost=118063.42..118063.43 rows=1 width=8) (actual time=2478.105..2478.105 rows=1 loops=1)"
"  Buffers: shared hit=401 read=49722"
"  ->  Hash Join  (cost=7.88..108082.13 rows=3992515 width=8) (actual time=2478.087..2478.091 rows=3 loops=1)"
"        Hash Cond: (this_.shipment_order_item_id = orderitem1_.id)"
"        Buffers: shared hit=401 read=49722"
"        ->  Seq Scan on shipment_order_sub_item this_  (cost=0.00..90031.15 rows=3992515 width=16) (actual time=0.081..1134.453 rows=3992110 loops=1)"
"              Buffers: shared hit=384 read=49722"
"        ->  Hash  (cost=7.87..7.87 rows=1 width=16) (actual time=0.151..0.151 rows=3 loops=1)"
"              Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"              Buffers: shared hit=17"
"              ->  Hash Semi Join  (cost=6.69..7.87 rows=1 width=16) (actual time=0.143..0.149 rows=3 loops=1)"
"                    Hash Cond: (orderitem1_.id = oi_.id)"
"                    Buffers: shared hit=17"
"                    ->  Seq Scan on shipment_order_item orderitem1_  (cost=0.00..1.13 rows=13 width=8) (actual time=0.005..0.006 rows=13 loops=1)"
"                          Buffers: shared hit=1"
"                    ->  Hash  (cost=6.68..6.68 rows=1 width=8) (actual time=0.122..0.122 rows=3 loops=1)"
"                          Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"                          Buffers: shared hit=16"
"                          ->  Nested Loop  (cost=0.28..6.68 rows=1 width=8) (actual time=0.059..0.116 rows=3 loops=1)"
"                                Join Filter: (order1_.user_id = user2_.users_id)"
"                                Rows Removed by Join Filter: 9"
"                                Buffers: shared hit=16"
"                                ->  Nested Loop  (cost=0.28..5.59 rows=1 width=16) (actual time=0.047..0.086 rows=3 loops=1)"
"                                      Buffers: shared hit=13"
"                                      ->  Nested Loop  (cost=0.00..2.29 rows=1 width=24) (actual time=0.026..0.051 rows=3 loops=1)"
"                                            Buffers: shared hit=4"
"                                            ->  Seq Scan on shipment_order_item oi_  (cost=0.00..1.16 rows=1 width=16) (actual time=0.007..0.011 rows=3 loops=1)"
"                                                  Filter: (order_id = 610)"
"                                                  Rows Removed by Filter: 10"
"                                                  Buffers: shared hit=1"
"                                            ->  Seq Scan on shipment_order order1_  (cost=0.00..1.11 rows=1 width=24) (actual time=0.006..0.006 rows=1 loops=3)"
"                                                  Filter: ((id = 610) AND (court_department_id = ANY ('{1,292,32768}'::bigint[])))"
"                                                  Rows Removed by Filter: 6"
"                                                  Buffers: shared hit=3"
"                                      ->  Index Only Scan using court_department_pkey on court_department courtdepar3_  (cost=0.28..3.29 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=3)"
"                                            Index Cond: (department_id = order1_.court_department_id)"
"                                            Heap Fetches: 3"
"                                            Buffers: shared hit=9"
"                                ->  Seq Scan on application_user user2_  (cost=0.00..1.04 rows=4 width=8) (actual time=0.002..0.003 rows=4 loops=3)"
"                                      Buffers: shared hit=3"
"Planning time: 1.547 ms"
"Execution time: 2478.306 ms"


sequence scan off:

"Aggregate  (cost=136819.78..136819.79 rows=1 width=8) (actual time=1.274..1.274 rows=1 loops=1)"
"  Buffers: shared hit=32 read=7"
"  ->  Nested Loop  (cost=1.24..126838.50 rows=3992515 width=8) (actual time=1.242..1.265 rows=3 loops=1)"
"        Buffers: shared hit=32 read=7"
"        ->  Nested Loop Semi Join  (cost=0.81..17.81 rows=1 width=16) (actual time=1.112..1.119 rows=3 loops=1)"
"              Join Filter: (orderitem1_.id = oi_.id)"
"              Rows Removed by Join Filter: 33"
"              Buffers: shared hit=24 read=4"
"              ->  Index Only Scan using shipment_order_item_pkey on shipment_order_item orderitem1_  (cost=0.14..4.83 rows=13 width=8) (actual time=0.508..0.515 rows=13 loops=1)"
"                    Heap Fetches: 13"
"                    Buffers: shared hit=1 read=1"
"              ->  Materialize  (cost=0.67..12.79 rows=1 width=8) (actual time=0.040..0.044 rows=3 loops=13)"
"                    Buffers: shared hit=23 read=3"
"                    ->  Nested Loop  (cost=0.67..12.78 rows=1 width=8) (actual time=0.509..0.556 rows=3 loops=1)"
"                          Buffers: shared hit=23 read=3"
"                          ->  Nested Loop  (cost=0.54..9.62 rows=1 width=16) (actual time=0.217..0.253 rows=3 loops=1)"
"                                Buffers: shared hit=18 read=2"
"                                ->  Nested Loop  (cost=0.27..6.32 rows=1 width=24) (actual time=0.198..0.221 rows=3 loops=1)"
"                                      Buffers: shared hit=9 read=2"
"                                      ->  Index Scan using fk_fk_mt3v1s9gl7rr0lo83il8gy00d_idx on shipment_order_item oi_  (cost=0.14..3.15 rows=1 width=16) (actual time=0.129..0.132 rows=3 loops=1)"
"                                            Index Cond: (order_id = 610)"
"                                            Buffers: shared hit=4 read=1"
"                                      ->  Index Scan using shipment_order_pkey on shipment_order order1_  (cost=0.13..3.15 rows=1 width=24) (actual time=0.026..0.027 rows=1 loops=3)"
"                                            Index Cond: (id = 610)"
"                                            Filter: (court_department_id = ANY ('{1,292,32768}'::bigint[]))"
"                                            Buffers: shared hit=5 read=1"
"                                ->  Index Only Scan using court_department_pkey on court_department courtdepar3_  (cost=0.28..3.29 rows=1 width=8) (actual time=0.007..0.008 rows=1 loops=3)"
"                                      Index Cond: (department_id = order1_.court_department_id)"
"                                      Heap Fetches: 3"
"                                      Buffers: shared hit=9"
"                          ->  Index Only Scan using application_user_pkey on application_user user2_  (cost=0.13..3.15 rows=1 width=8) (actual time=0.097..0.098 rows=1 loops=3)"
"                                Index Cond: (users_id = order1_.user_id)"
"                                Heap Fetches: 3"
"                                Buffers: shared hit=5 read=1"
"        ->  Index Scan using fk_fk_rr5k2n8n892ye3uposkh3xp6v_idx on shipment_order_sub_item this_  (cost=0.43..86895.54 rows=3992515 width=16) (actual time=0.045..0.046 rows=1 loops=3)"
"              Index Cond: (shipment_order_item_id = orderitem1_.id)"
"              Buffers: shared hit=8 read=3"
"Planning time: 1.684 ms"
"Execution time: 1.448 ms"


I see that calculated cost is less for seq scan but i don't get how it is possible if fk_fk_rr5k2n8n892ye3uposkh3xp6v_idx intable shipment_order_sub_item is on shipment_order_item_id.  If I need to pick 8 of 4 millions rows exactly by this index why it is not used and sequence scan instead? :|. Disabling seq scan works 2216/12=185 times faster.

If it could help I can sent my data base (about 75 MB) on ftp or whenever.

Thanks again.

Regards,
Marcin Sieńko



Pozdrawiam,
Marcin

2015-12-16 17:18 GMT+01:00 Feike Steenbergen <feikesteenbergen@gmail.com>:
Hi,

Looking at the explained plans, it makes sense the seq scan is preferred as it is expected to be cheaper.

Seq scan enabled:
Hash Join  (cost=12.88..108087.13 rows=3992515 width=177)

The main costs (83%) of this plan are:
Seq Scan on shipment_order_sub_item this_ (cost=0.00..90,031.15 rows=3,992,515 width=125)

Seq scan disabled:
Nested Loop  (cost=1.24..138607.34 rows=3992515 width=177)

The main costs (71%) of this plan are:
Index Scan using fk_fk_rr5k2n8n892ye3uposkh3xp6v_idx on  shipment_order_sub_item this_  (cost=0.43..98636.88 rows=3992515 width=125)

The expected costs for the seq scan enabled is 78% of that of the disable seq scan.



Questions:

- What kind of disks do you have (ssd's?)
- Is the seq scan slow if you repeat it immediately after the first run?
- What is your current random_page_cost
- Could you try to reissue the query after lowering the value of random_page_cost, like so:

    SET random_page_cost TO 1.5;

- Could you redo the explain with

    EXPLAIN (ANALYZE ON, BUFFERS ON)

regards,

Feike

В списке pgsql-bugs по дате отправления:

Предыдущее
От: george.wesington@gmail.com
Дата:
Сообщение: BUG #13823: Database crash - import requests in plpython3u
Следующее
От: "Millepied, Pascal (GE Healthcare)"
Дата:
Сообщение: Known issues on PostgreSQL server 8.1.19