Re: Speed up the query

Поиск
Список
Период
Сортировка
От Hengky Lie
Тема Re: Speed up the query
Дата
Msg-id CADvRrxN6-=tqEiTvxn4wYcEvsLvQioLuU94BhGNAOOfYGEeEnw@mail.gmail.com
обсуждение исходный текст
Ответ на Speed up the query  (Hengky Liwandouw <hengkyliwandouw@gmail.com>)
Ответы Re: Speed up the query  (Hengky Lie <hengkyliwandouw@gmail.com>)
Список pgsql-performance
Dear Torsten and friends,

This is another good case to analyse why the query performance is not  the same :

There are 2 query :
(1)

with qry1 as (
select subkategori, kodebarang as produkid, namabarang, keluar, tbltransaksi.modal*keluar as ttlmodal,
    case
    when tbltransaksi.discount<=100 then
        keluar*(harga - (discount/100*harga))
        when tbltransaksi.discount>100
            then keluar*(harga-discount)
        end as jumlah
        from tblpenjualan
        join tbltransaksi on tblpenjualan.id=tbltransaksi.jualid
        join tblproduk on tbltransaksi.kodebarang=tblproduk.produkid
        join tblsubkategori on tblproduk.subkategoriid=tblsubkategori.tblsubkategoriid
        join tblkategori on tblkategori.kategoriid=tblsubkategori.kategoriid
        where tblpenjualan.tanggal between '01/01/13' and '31/10/13')


select subkategori,produkid, namabarang , sum(keluar) as ttlkeluar, sum(jumlah) as jumlah, sum(ttlmodal) as ttlmodal
from qry1
group by subkategori, produkid, namabarang

"QUERY PLAN"
"HashAggregate  (cost=99124.61..99780.94 rows=65633 width=334) (actual time=3422.786..3434.511 rows=24198 loops=1)"
"  Buffers: shared hit=14543"
"  CTE qry1"
"    ->  Hash Join  (cost=11676.07..76153.06 rows=656330 width=73) (actual time=181.683..2028.046 rows=657785 loops=1)"
"          Hash Cond: ((tbltransaksi.kodebarang)::text = (tblproduk.produkid)::text)"
"          Buffers: shared hit=14543"
"          ->  Hash Join  (cost=7247.75..44651.13 rows=656330 width=31) (actual time=84.885..787.029 rows=658438 loops=1)"
"                Hash Cond: (tbltransaksi.jualid = tblpenjualan.id)"
"                Buffers: shared hit=13204"
"                ->  Seq Scan on tbltransaksi  (cost=0.00..18730.83 rows=807283 width=35) (actual time=0.005..157.004 rows=807033 loops=1)"
"                      Buffers: shared hit=10658"
"                ->  Hash  (cost=5293.64..5293.64 rows=156329 width=4) (actual time=84.842..84.842 rows=154900 loops=1)"
"                      Buckets: 16384  Batches: 1  Memory Usage: 3631kB"
"                      Buffers: shared hit=2546"
"                      ->  Seq Scan on tblpenjualan  (cost=0.00..5293.64 rows=156329 width=4) (actual time=0.007..49.444 rows=154900 loops=1)"
"                            Filter: ((tanggal >= '2013-01-01'::date) AND (tanggal <= '2013-10-31'::date))"
"                            Rows Removed by Filter: 27928"
"                            Buffers: shared hit=2546"
"          ->  Hash  (cost=3364.19..3364.19 rows=85130 width=55) (actual time=96.736..96.736 rows=84701 loops=1)"
"                Buckets: 16384  Batches: 1  Memory Usage: 6323kB"
"                Buffers: shared hit=1339"
"                ->  Hash Join  (cost=5.35..3364.19 rows=85130 width=55) (actual time=0.241..62.038 rows=84701 loops=1)"
"                      Hash Cond: ((tblproduk.subkategoriid)::text = (tblsubkategori.tblsubkategoriid)::text)"
"                      Buffers: shared hit=1339"
"                      ->  Seq Scan on tblproduk  (cost=0.00..2188.30 rows=85130 width=45) (actual time=0.008..17.549 rows=85035 loops=1)"
"                            Buffers: shared hit=1337"
"                      ->  Hash  (cost=4.23..4.23 rows=90 width=17) (actual time=0.224..0.224 rows=90 loops=1)"
"                            Buckets: 1024  Batches: 1  Memory Usage: 4kB"
"                            Buffers: shared hit=2"
"                            ->  Hash Join  (cost=1.09..4.23 rows=90 width=17) (actual time=0.028..0.153 rows=90 loops=1)"
"                                  Hash Cond: ((tblsubkategori.kategoriid)::text = (tblkategori.kategoriid)::text)"
"                                  Buffers: shared hit=2"
"                                  ->  Seq Scan on tblsubkategori  (cost=0.00..1.90 rows=90 width=21) (actual time=0.005..0.029 rows=90 loops=1)"
"                                        Buffers: shared hit=1"
"                                  ->  Hash  (cost=1.04..1.04 rows=4 width=4) (actual time=0.011..0.011 rows=4 loops=1)"
"                                        Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"                                        Buffers: shared hit=1"
"                                        ->  Seq Scan on tblkategori  (cost=0.00..1.04 rows=4 width=4) (actual time=0.002..0.004 rows=4 loops=1)"
"                                              Buffers: shared hit=1"
"  ->  CTE Scan on qry1  (cost=0.00..13126.60 rows=656330 width=334) (actual time=181.687..2556.526 rows=657785 loops=1)"
"        Buffers: shared hit=14543"
"Total runtime: 3454.442 ms"

(2)this is exactly the same query with no.1 except it uses subquery

select subkategori,produkid, namabarang , sum(keluar) as ttlkeluar, sum(jumlah) as jumlah, sum(ttlmodal) as ttlmodal from
( select subkategori, kodebarang as produkid, namabarang, keluar, tbltransaksi.modal*keluar as ttlmodal,
case
when tbltransaksi.discount<=100 then
keluar*(harga - (discount/100*harga))
when tbltransaksi.discount>100
then keluar*(harga-discount)
end as jumlah
from tblpenjualan
join tbltransaksi on tblpenjualan.id=tbltransaksi.jualid
join tblproduk on tbltransaksi.kodebarang=tblproduk.produkid
join tblsubkategori on tblproduk.subkategoriid=tblsubkategori.tblsubkategoriid
join tblkategori on tblkategori.kategoriid=tblsubkategori.kategoriid
where tblpenjualan.tanggal between '01/01/13' and '31/10/13')
as dt group by subkategori, produkid, namabarang
The analyse result :

"QUERY PLAN"
"GroupAggregate  (cost=124800.44..157616.94 rows=656330 width=73) (actual time=13895.782..15236.212 rows=24198 loops=1)"
"  Buffers: shared hit=14543"
"  ->  Sort  (cost=124800.44..126441.26 rows=656330 width=73) (actual time=13895.750..14024.911 rows=657785 loops=1)"
"        Sort Key: tblsubkategori.subkategori, tbltransaksi.kodebarang, tblproduk.namabarang"
"        Sort Method: quicksort  Memory: 103431kB"
"        Buffers: shared hit=14543"
"        ->  Hash Join  (cost=11676.07..61385.63 rows=656330 width=73) (actual time=177.521..1264.431 rows=657785 loops=1)"
"              Hash Cond: ((tbltransaksi.kodebarang)::text = (tblproduk.produkid)::text)"
"              Buffers: shared hit=14543"
"              ->  Hash Join  (cost=7247.75..44651.13 rows=656330 width=31) (actual time=84.473..739.064 rows=658438 loops=1)"
"                    Hash Cond: (tbltransaksi.jualid = tblpenjualan.id)"
"                    Buffers: shared hit=13204"
"                    ->  Seq Scan on tbltransaksi  (cost=0.00..18730.83 rows=807283 width=35) (actual time=0.005..146.601 rows=807033 loops=1)"
"                          Buffers: shared hit=10658"
"                    ->  Hash  (cost=5293.64..5293.64 rows=156329 width=4) (actual time=84.429..84.429 rows=154900 loops=1)"
"                          Buckets: 16384  Batches: 1  Memory Usage: 3631kB"
"                          Buffers: shared hit=2546"
"                          ->  Seq Scan on tblpenjualan  (cost=0.00..5293.64 rows=156329 width=4) (actual time=0.008..48.968 rows=154900 loops=1)"
"                                Filter: ((tanggal >= '2013-01-01'::date) AND (tanggal <= '2013-10-31'::date))"
"                                Rows Removed by Filter: 27928"
"                                Buffers: shared hit=2546"
"              ->  Hash  (cost=3364.19..3364.19 rows=85130 width=55) (actual time=92.998..92.998 rows=84701 loops=1)"
"                    Buckets: 16384  Batches: 1  Memory Usage: 6323kB"
"                    Buffers: shared hit=1339"
"                    ->  Hash Join  (cost=5.35..3364.19 rows=85130 width=55) (actual time=0.240..59.587 rows=84701 loops=1)"
"                          Hash Cond: ((tblproduk.subkategoriid)::text = (tblsubkategori.tblsubkategoriid)::text)"
"                          Buffers: shared hit=1339"
"                          ->  Seq Scan on tblproduk  (cost=0.00..2188.30 rows=85130 width=45) (actual time=0.008..16.942 rows=85035 loops=1)"
"                                Buffers: shared hit=1337"
"                          ->  Hash  (cost=4.23..4.23 rows=90 width=17) (actual time=0.221..0.221 rows=90 loops=1)"
"                                Buckets: 1024  Batches: 1  Memory Usage: 4kB"
"                                Buffers: shared hit=2"
"                                ->  Hash Join  (cost=1.09..4.23 rows=90 width=17) (actual time=0.028..0.142 rows=90 loops=1)"
"                                      Hash Cond: ((tblsubkategori.kategoriid)::text = (tblkategori.kategoriid)::text)"
"                                      Buffers: shared hit=2"
"                                      ->  Seq Scan on tblsubkategori  (cost=0.00..1.90 rows=90 width=21) (actual time=0.006..0.046 rows=90 loops=1)"
"                                            Buffers: shared hit=1"
"                                      ->  Hash  (cost=1.04..1.04 rows=4 width=4) (actual time=0.012..0.012 rows=4 loops=1)"
"                                            Buckets: 1024  Batches: 1  Memory Usage: 1kB"
"                                            Buffers: shared hit=1"
"                                            ->  Seq Scan on tblkategori  (cost=0.00..1.04 rows=4 width=4) (actual time=0.002..0.005 rows=4 loops=1)"
"                                                  Buffers: shared hit=1"
"Total runtime: 15244.038 ms"

This is my  Postgresqlconf :
max_connections=50
shared_buffers=1024MB
wall_buffers=16MB
max_prepared_transactions=0
work_mem=50MB
maintenance_work_mem=256MB

Thanks


On Sun, Dec 1, 2013 at 9:39 PM, Torsten Förtsch <torsten.foertsch@gmx.net> wrote:
On 01/12/13 13:40, Hengky Liwandouw wrote:
> Torsten, your 2nd option works now. I dont know maybe copy and paste error. I just want to report that your 2nd option with work_mem=100MB required the same amount of time (about 58 seconds), while my query required 4.9 seconds.
>
> What make this two query so different ?
>
Without the "explain (analyze,buffers) ..." it's hard to say. A CTE is
currently a way to trick the query planner because it's planned
separately. A subquery on the other hand is integrated in the outer
query and planned/optimized as one thing.

If your planner parameters are correctly set up, the subquery should
almost always outrun the CTE. Often, though, not much.

Now, you may ask why CTE then exist at all? There are things that cannot
be expressed without them, in particular WITH RECURSIVE.

The fact that it performs so badly as a subquery indicates that either
your table statistics are suboptimal or more probably the planner
parameters or work_mem.

Another point I have just noticed, how does it perform if you change

  and extract(... from tanggal)='2013'

to

  and '2013-01-01'::date <= tanggal
  and tanggal < '2013-01-01'::date + '1 year'::interval

Also, I think it would be possible to even get rid of the subquery. At
least you can get rid of the tanggal and jumlah output from the subquery.

select s.id, s.nama, t.kodebarang, p.namabarang,
       sum(case when extract(month from t.tanggal) = 1
                then t.keluar else 0 end) as jan,
       sum(case when extract(month from t.tanggal) = 2
                then t.keluar else 0 end) as feb,
       ...,
       sum(t.keluar) as total
  from tbltransaksi t
  join tblproduk p on t.kodebarang=p.produkid
  join tblsupplier s on p.supplierid=s.id
 where (t.jualid is not null or t.returjualid is not null)
   and '2013-01-01'::date <= t.tanggal
   and t.tanggal < '2013-01-01'::date + '1 year'::interval
 group by s.id, s.nama, t.kodebarang, p.namabarang
 order by total desc
 limit 1000

would be interesting to see the "explain (analyze,buffers)" output for
the query above.

Please double-check the query. I think it should do exactly the same as
your query. But you know, shit happens.

BTW, am I right in assuming that you are from Malaysia or Indonesia? I
am trying to learn a bit of Malay. I am a complete beginner, though.

Selamat berjaya      (is that possible to wish you success?)
Torsten

В списке pgsql-performance по дате отправления:

Предыдущее
От: Hengky Lie
Дата:
Сообщение: Re: Speed up the query
Следующее
От: Hengky Lie
Дата:
Сообщение: Re: Speed up the query