Обсуждение: [PERFORM] Execution plan analysis

Поиск
Список
Период
Сортировка

[PERFORM] Execution plan analysis

От
Neto pr
Дата:
Dear all

Someone help me analyze the execution plans below, is the  query 12 of
TPC-H benchmark [1].
I need to find out why the query without index runs faster (7 times)
than with index, although the costs are smaller (see table).
I have other cases that happened in the same situation. The server
parameters have been set with PGTUNE. I use postgresql version 9.6.4
on Debian 8 OS with 4 GB memory.

Query|Index(yes/no) |Time Spend    |Cost Total
===================================
12       Yes               00:08:58          2710805.51
12        No                00:01:42         3365996.34


-----------------   Explain Analyze  Query 12  WITH INDEX
----------------------------
Sort  (cost=2710805.51..2710805.51 rows=1 width=27) (actual
time=537713.672..537713.672 rows=2 loops=1)
  Sort Key: lineitem.l_shipmode
    Sort Method:  quicksort  Memory: 25kB
      ->  HashAggregate  (cost=2710805.47..2710805.50 rows=1 width=27)
(actual time=537713.597..537713.598 rows=2 loops=1)
              ->  Merge Join  (cost=1994471.69..2708777.28 rows=270426
width=27) (actual time=510717.977..536818.802 rows=311208 loops=1)
                  Merge Cond: (orders.o_orderkey = lineitem.l_orderkey)
                    ->  Index Scan using orders_pkey on orders
(cost=0.00..672772.57 rows=15000045 width=20) (actual
time=0.019..20898.325 rows=14999972 loops=1)
                          ->  Sort  (cost=1994455.40..1995131.47
rows=270426 width=19) (actual time=510690.114..510915.678 rows=311208
loops=1)
                                 Sort Key: lineitem.l_orderkey
                                    Sort Method:  external sort  Disk: 11568kB
                                         ->  Bitmap Heap Scan on
lineitem  (cost=336295.10..1970056.39 rows=270426 width=19) (actual
time=419620.817..509685.421 rows=311208 loops=1)
                                               Recheck Cond:
(l_shipmode = ANY (_{TRUCK,AIR}_::bpchar[]))
                                                    Filter:
((l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND
(l_receiptdate >= _1997-01-01_::date) AND (l_receiptdate < _1998-01-01
00:00:00_::timestamp without time zone))
                                                        ->  Bitmap
Index Scan on idx_l_shipmodelineitem000  (cost=0.00..336227.49
rows=15942635 width=0) (actual time=419437.172..419437.172
rows=17133713 loops=1)
                                                              Index
Cond: (l_shipmode = ANY (_{TRUCK,AIR}_::bpchar[]))

Total runtime: 537728.848 ms


-----------------  Explain Analyze Query 12  WITHOUT INDEX
----------------------------
Sort  (cost=3365996.33..3365996.34 rows=1 width=27) (actual
time=101850.883..101850.884 rows=2 loops=1)
  Sort Key: lineitem.l_shipmode  Sort Method:  quicksort  Memory: 25kB
    ->  HashAggregate  (cost=3365996.30..3365996.32 rows=1 width=27)
(actual time=101850.798..101850.800 rows=2 loops=1)
            ->  Merge Join  (cost=2649608.28..3363936.68 rows=274616
width=27) (actual time=75497.181..100938.830 rows=311208 loops=1)
                 Merge Cond: (orders.o_orderkey = lineitem.l_orderkey)
                     ->  Index Scan using orders_pkey on orders
(cost=0.00..672771.90 rows=15000000 width=20) (actual
time=0.020..20272.828 rows=14999972 loops=1)
                              ->  Sort  (cost=2649545.68..2650232.22
rows=274616 width=19) (actual time=75364.450..75618.772 rows=311208
loops=1)
                                    Sort Key: lineitem.l_orderkey
                                        Sort Method:  external sort
Disk: 11568kB
                                           ->  Seq Scan on lineitem
(cost=0.00..2624738.17 rows=274616 width=19) (actual
time=0.839..74391.087 rows=311208 loops=1)
                                                 Filter: ((l_shipmode
= ANY (_{TRUCK,AIR}_::bpchar[])) AND (l_commitdate < l_receiptdate)
AND (l_shipdate < l_commitdate) AND (l_receiptdate >=
_1997-01-01_::date) AND (l_receiptdate < _1998-01-01
00:00:00_::timestamp without time zone))
                                                       Total runtime:
101865.253 ms

 -=========------ SQL query 12 ----------------------
  select
    l_shipmode,
    sum(case
        when o_orderpriority = '1-URGENT'
            or o_orderpriority = '2-HIGH'
            then 1
        else 0
    end) as high_line_count,
    sum(case
        when o_orderpriority <> '1-URGENT'
            and o_orderpriority <> '2-HIGH'
            then 1
        else 0
    end) as low_line_count
from
    orders,
    lineitem
where
    o_orderkey = l_orderkey
    and l_shipmode in ('TRUCK', 'AIR')
    and l_commitdate < l_receiptdate
    and l_shipdate < l_commitdate
    and l_receiptdate >= date '1997-01-01'
    and l_receiptdate < date '1997-01-01' + interval '1' year
group by
    l_shipmode
order by
    l_shipmode


Re: [PERFORM] Execution plan analysis

От
Neto pr
Дата:
2017-08-25 5:31 GMT-03:00 Neto pr <netopr9@gmail.com>:
> Dear all
>
> Someone help me analyze the execution plans below, is the  query 12 of
> TPC-H benchmark [1].
> I need to find out why the query without index runs faster (7 times)
> than with index, although the costs are smaller (see table).
> I have other cases that happened in the same situation. The server
> parameters have been set with PGTUNE. I use postgresql version 9.6.4
> on Debian 8 OS with 4 GB memory.
>
> Query|Index(yes/no) |Time Spend    |Cost Total
> ===================================
> 12       Yes               00:08:58          2710805.51
> 12        No                00:01:42         3365996.34
>
>
> -----------------   Explain Analyze  Query 12  WITH INDEX
> ----------------------------
> Sort  (cost=2710805.51..2710805.51 rows=1 width=27) (actual
> time=537713.672..537713.672 rows=2 loops=1)
>   Sort Key: lineitem.l_shipmode
>     Sort Method:  quicksort  Memory: 25kB
>       ->  HashAggregate  (cost=2710805.47..2710805.50 rows=1 width=27)
> (actual time=537713.597..537713.598 rows=2 loops=1)
>               ->  Merge Join  (cost=1994471.69..2708777.28 rows=270426
> width=27) (actual time=510717.977..536818.802 rows=311208 loops=1)
>                   Merge Cond: (orders.o_orderkey = lineitem.l_orderkey)
>                     ->  Index Scan using orders_pkey on orders
> (cost=0.00..672772.57 rows=15000045 width=20) (actual
> time=0.019..20898.325 rows=14999972 loops=1)
>                           ->  Sort  (cost=1994455.40..1995131.47
> rows=270426 width=19) (actual time=510690.114..510915.678 rows=311208
> loops=1)
>                                  Sort Key: lineitem.l_orderkey
>                                     Sort Method:  external sort  Disk: 11568kB
>                                          ->  Bitmap Heap Scan on
> lineitem  (cost=336295.10..1970056.39 rows=270426 width=19) (actual
> time=419620.817..509685.421 rows=311208 loops=1)
>                                                Recheck Cond:
> (l_shipmode = ANY (_{TRUCK,AIR}_::bpchar[]))
>                                                     Filter:
> ((l_commitdate < l_receiptdate) AND (l_shipdate < l_commitdate) AND
> (l_receiptdate >= _1997-01-01_::date) AND (l_receiptdate < _1998-01-01
> 00:00:00_::timestamp without time zone))
>                                                         ->  Bitmap
> Index Scan on idx_l_shipmodelineitem000  (cost=0.00..336227.49
> rows=15942635 width=0) (actual time=419437.172..419437.172
> rows=17133713 loops=1)
>                                                               Index
> Cond: (l_shipmode = ANY (_{TRUCK,AIR}_::bpchar[]))
>
> Total runtime: 537728.848 ms
>
>
> -----------------  Explain Analyze Query 12  WITHOUT INDEX
> ----------------------------
> Sort  (cost=3365996.33..3365996.34 rows=1 width=27) (actual
> time=101850.883..101850.884 rows=2 loops=1)
>   Sort Key: lineitem.l_shipmode  Sort Method:  quicksort  Memory: 25kB
>     ->  HashAggregate  (cost=3365996.30..3365996.32 rows=1 width=27)
> (actual time=101850.798..101850.800 rows=2 loops=1)
>             ->  Merge Join  (cost=2649608.28..3363936.68 rows=274616
> width=27) (actual time=75497.181..100938.830 rows=311208 loops=1)
>                  Merge Cond: (orders.o_orderkey = lineitem.l_orderkey)
>                      ->  Index Scan using orders_pkey on orders
> (cost=0.00..672771.90 rows=15000000 width=20) (actual
> time=0.020..20272.828 rows=14999972 loops=1)
>                               ->  Sort  (cost=2649545.68..2650232.22
> rows=274616 width=19) (actual time=75364.450..75618.772 rows=311208
> loops=1)
>                                     Sort Key: lineitem.l_orderkey
>                                         Sort Method:  external sort
> Disk: 11568kB
>                                            ->  Seq Scan on lineitem
> (cost=0.00..2624738.17 rows=274616 width=19) (actual
> time=0.839..74391.087 rows=311208 loops=1)
>                                                  Filter: ((l_shipmode
> = ANY (_{TRUCK,AIR}_::bpchar[])) AND (l_commitdate < l_receiptdate)
> AND (l_shipdate < l_commitdate) AND (l_receiptdate >=
> _1997-01-01_::date) AND (l_receiptdate < _1998-01-01
> 00:00:00_::timestamp without time zone))
>                                                        Total runtime:
> 101865.253 ms
>
>  -=========------ SQL query 12 ----------------------
>   select
>     l_shipmode,
>     sum(case
>         when o_orderpriority = '1-URGENT'
>             or o_orderpriority = '2-HIGH'
>             then 1
>         else 0
>     end) as high_line_count,
>     sum(case
>         when o_orderpriority <> '1-URGENT'
>             and o_orderpriority <> '2-HIGH'
>             then 1
>         else 0
>     end) as low_line_count
> from
>     orders,
>     lineitem
> where
>     o_orderkey = l_orderkey
>     and l_shipmode in ('TRUCK', 'AIR')
>     and l_commitdate < l_receiptdate
>     and l_shipdate < l_commitdate
>     and l_receiptdate >= date '1997-01-01'
>     and l_receiptdate < date '1997-01-01' + interval '1' year
> group by
>     l_shipmode
> order by
>     l_shipmode

Complementing the question I'm using a server HP proliant Ml110-G9:
Processador: (1) Intel Xeon E5-1603v3 (2.8GHz/4-core/10MB/140W)
Memória RAM: 4GB DDR4
Disco Rígido: SATA 1TB 7.2K rpm LFF
More specifications

here:https://www.hpe.com/us/en/product-catalog/servers/proliant-servers/pip.specifications.hpe-proliant-ml110-gen9-server.7796454.html
154/5000

See Below parameters presents in postgresql.conf. You would indicate
which value for example: cpu_index_tuple_cost and other CPU_*, based
on this
Server.

#seq_page_cost = 1.0
#random_page_cost = 4.0
#cpu_tuple_cost = 0.01
#cpu_index_tuple_cost = 0.005
#cpu_operator_cost = 0.0025
shared_buffers = 1GB
effective_cache_size = 3GB
work_mem = 26214kB
maintenance_work_mem = 512MB
checkpoint_segments = 128
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500

Best Regards
Neto Br


Re: [PERFORM] Execution plan analysis

От
Tomas Vondra
Дата:
Hi,

So looking at the plans, essentially the only part that is different is
the scan node at the very bottom - in one case it's a sequential scan,
in the other case (the slow one) it's the bitmap index scan.

Essentially it's this:

    ->  Seq Scan on lineitem
        (cost=0.00..2624738.17 ...)
        (actual time=0.839..74391.087 ...)

vs. this:

    ->  Bitmap Heap Scan on lineitem
        (cost=336295.10..1970056.39 ...)
        (actual time=419620.817..509685.421 ...)
        ->  Bitmap Index Scan on idx_l_shipmodelineitem000
            (cost=0.00..336227.49 ...)
            (actual time=419437.172..419437.172 ...)

All the nodes are the same and perform about the same in both cases, so
you can ignore them. This difference it the the root cause you need to
investigate.

The question is why is the sequential scan so much faster than bitmap
index scan? Ideally, the bitmap heap scan should scan the index (in a
mostly sequential way), build a bitmap, and then read just the matching
part of the table (sequentially, by skipping some of the pages).

Now, there are a few reasons why this might not work that well.

Perhaps the table fits into RAM, but table + index does not. That would
make the sequential scan much faster than the index path. Not sure if
this is the case, as you haven't mentioned which TPC-H scale are you
testing, but you only have 4GB of RAM which if fairly low.

Another bit is prefetching - with sequential scans, the OS is able to
prefetch the next bit of data automatically (read-ahead). With bitmap
index scans that's not the case, producing a lot of individual
synchronous I/O requests. See if increasing effective_cache_size (from
default 1 to 16 or 32) helps.

Try generating the plans with EXPLAIN (ANALYZE, BUFFERS), that should
tell us more about how many blocks are found in shared buffers, etc.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services