Обсуждение: Optimizing IN queries

Поиск
Список
Период
Сортировка

Optimizing IN queries

От
"Andrus"
Дата:
IN queries have large number of int values. Sample below is query which has
only 10 values in IN list but takes more than one minute.

In real query IN list may contain up to 5000 integers.

There are indexes in both dok.dokumnr and bilkaib.dokumnr columns so it
should run fast.
How to speed up the sample query below and if its IN list contains 5000
integers ?
Larger list takes takes 700 seconds to run.
Should I use CREATE TEMP TABLE list ON COMMIT DROP  to pass this list
instead using inline list or other idea ?
Shoult I create index on temp table also when creating temp table ?

Or should I require server upgrade ? Should I require upgrading PostgreSql,
adding RAM, disk speed or what ?

Andrus.

"PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0,
pie-8.7.9)"

explain analyze select count(*)::INTEGER as cnt
       from dok
     WHERE dokumnr IN
(869906,869907,869910,869911,869914,869915,869916,869917,869918,869921  )
and
 dokumnr NOT IN (SELECT dokumnr FROM bilkaib WHERE
   alusdok='LY')

"Aggregate  (cost=186516.39..186516.40 rows=1 width=0) (actual
time=72370.224..72370.228 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on dok  (cost=154840.10..186516.37 rows=5 width=0)
(actual time=72370.195..72370.195 rows=0 loops=1)"
"        Recheck Cond: ((dokumnr = 869906) OR (dokumnr = 869907) OR (dokumnr
= 869910) OR (dokumnr = 869911) OR (dokumnr = 869914) OR (dokumnr = 869915)
OR (dokumnr = 869916) OR (dokumnr = 869917) OR (dokumnr = 869918) OR
(dokumnr = 869921))"
"        Filter: (NOT (subplan))"
"        ->  BitmapOr  (cost=20.03..20.03 rows=10 width=0) (actual
time=173.116..173.116 rows=0 loops=1)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=172.981..172.981 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869906)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.021..0.021 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869907)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869910)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.016..0.016 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869911)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869914)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.008..0.008 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869915)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869916)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869917)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869918)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1)"
"                    Index Cond: (dokumnr = 869921)"
"        SubPlan"
"          ->  Materialize  (cost=154820.07..160183.25 rows=385618 width=4)
(actual time=0.216..4400.739 rows=384914 loops=10)"
"                ->  Seq Scan on bilkaib  (cost=0.00..152927.45 rows=385618
width=4) (actual time=1.925..11707.045 rows=384930 loops=1)"
"                      Filter: (alusdok = 'LY'::bpchar)"
"Total runtime: 72374.562 ms"


Re: Optimizing IN queries

От
"Isak Hansen"
Дата:
On Mon, Nov 10, 2008 at 12:35 PM, Andrus <kobruleht2@hot.ee> wrote:

> explain analyze select count(*)::INTEGER as cnt
>      from dok
>    WHERE dokumnr IN
> (869906,869907,869910,869911,869914,869915,869916,869917,869918,869921  )
> and
> dokumnr NOT IN (SELECT dokumnr FROM bilkaib WHERE
>  alusdok='LY')
>

Looks to me like most of the time is spent doing "not in (select a ton
of rows from bilkaib)".

Try something like "not exists (select null from bilkaib b where
b.dokumnr = dok.dokumnr and alusdok = 'LY')".


HTH, Isak

Re: Optimizing IN queries

От
"Andrus"
Дата:
Isak,

> Looks to me like most of the time is spent doing "not in (select a ton
> of rows from bilkaib)".
>
> Try something like "not exists (select null from bilkaib b where
> b.dokumnr = dok.dokumnr and alusdok = 'LY')".

Thank you very much.
Query seems to return now immediately.

dok.dokumnr is not null
bilkaib.dokumnr can be null

I hope that modified query produces same results as original.

Andrus.

Re: Optimizing IN queries

От
"Andrus"
Дата:
> Try something like "not exists (select null from bilkaib b where
> b.dokumnr = dok.dokumnr and alusdok = 'LY')".

I tried to optimize another similar query but it is still slow.

This query has same clause

dok.dokumnr IN
(869906,869907,869910,869911,869914,869915,869916,869917,869918 )

duplicated in in two places.

In real query this list contains much more items and set clause sets a lot
of values using FROM tables
(in this test SET contains only dummy setter) but it seems to take roughly
the same time as test query below.

All join columns are indexed. How to make this faster ?
Log below shows that triggers are called.
UPDATE doesnt update any fk columns. No idea why explain shows trigger
calls.

Andrus.

EXPLAIN analyze UPDATE DOK set
       KALKLIIK = dok.KALKLIIK
      from ( SELECT
     dok.dokumnr,
     SUM(rid.hind) AS doksumma
     FROM dok JOIN rid USING(dokumnr)
     WHERE dok.dokumnr IN
     (869906,869907,869910,869911,869914,869915,869916,869917,869918 )
     group by 1
 ) doksumma right join dok x USING(dokumnr)
 left join bilkaib on bilkaib.dokumnr=doksumma.dokumnr and
bilkaib.alusdok='LY'
     WHERE dok.dokumnr IN
(869906,869907,869910,869911,869914,869915,869916,869917,869918)
      and
     dok.dokumnr=x.dokumnr

"Hash Join  (cost=540908.66..2312297.49 rows=650 width=1179) (actual
time=66045.802..84717.094 rows=33 loops=1)"
"  Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
"  ->  Merge Right Join  (cost=540854.51..1869873.52 rows=88472665 width=4)
(actual time=66045.432..80246.663 rows=1222376 loops=1)"
"        Merge Cond: ("outer".dokumnr = "inner".dokumnr)"
"        ->  Sort  (cost=194834.23..195798.74 rows=385806 width=4) (actual
time=7373.251..8003.360 rows=159000 loops=1)"
"              Sort Key: bilkaib.dokumnr"
"              ->  Seq Scan on bilkaib  (cost=0.00..153002.12 rows=385806
width=4) (actual time=1.698..4192.896 rows=406443 loops=1)"
"                    Filter: (alusdok = 'LY'::bpchar)"
"        ->  Sort  (cost=346020.28..349083.99 rows=1225481 width=8) (actual
time=57462.007..62545.451 rows=1222376 loops=1)"
"              Sort Key: doksumma.dokumnr"
"              ->  Hash Left Join  (cost=860.23..189634.54 rows=1225481
width=8) (actual time=68.117..50296.421 rows=1222352 loops=1)"
"                    Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
"                    ->  Seq Scan on dok x  (cost=0.00..182646.81
rows=1225481 width=4) (actual time=15.107..40573.750 rows=1222352 loops=1)"
"                    ->  Hash  (cost=860.21..860.21 rows=9 width=4) (actual
time=52.976..52.976 rows=9 loops=1)"
"                          ->  Subquery Scan doksumma  (cost=860.01..860.21
rows=9 width=4) (actual time=52.828..52.930 rows=9 loops=1)"
"                                ->  HashAggregate  (cost=860.01..860.12
rows=9 width=14) (actual time=52.819..52.857 rows=9 loops=1)"
"                                      ->  Nested Loop  (cost=18.03..859.89
rows=24 width=14) (actual time=42.854..52.659 rows=22 loops=1)"
"                                            ->  Bitmap Heap Scan on dok
(cost=18.03..54.13 rows=9 width=4) (actual time=0.246..0.364 rows=9
loops=1)"
"                                                  Recheck Cond: ((dokumnr =
869906) OR (dokumnr = 869907) OR (dokumnr = 869910) OR (dokumnr = 869911) OR
(dokumnr = 869914) OR (dokumnr = 869915) OR (dokumnr = 869916) OR (dokumnr =
869917) OR (dokumnr = 869918))"
"                                                  ->  BitmapOr
(cost=18.03..18.03 rows=9 width=0) (actual time=0.214..0.214 rows=0
loops=1)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.102..0.102 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869906)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.010..0.010 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869907)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869910)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.014..0.014 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869911)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869914)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869915)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869916)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869917)"
"                                                        ->  Bitmap Index
Scan on dok_dokumnr_idx  (cost=0.00..2.00 rows=1 width=0) (actual
time=0.008..0.008 rows=2 loops=1)"
"                                                              Index Cond:
(dokumnr = 869918)"
"                                            ->  Index Scan using
rid_dokumnr_idx on rid  (cost=0.00..87.95 rows=126 width=14) (actual
time=4.767..5.780 rows=2 loops=9)"
"                                                  Index Cond:
("outer".dokumnr = rid.dokumnr)"
"  ->  Hash  (cost=54.13..54.13 rows=9 width=1179) (actual time=0.328..0.328
rows=9 loops=1)"
"        ->  Bitmap Heap Scan on dok  (cost=18.03..54.13 rows=9 width=1179)
(actual time=0.164..0.248 rows=9 loops=1)"
"              Recheck Cond: ((dokumnr = 869906) OR (dokumnr = 869907) OR
(dokumnr = 869910) OR (dokumnr = 869911) OR (dokumnr = 869914) OR (dokumnr =
869915) OR (dokumnr = 869916) OR (dokumnr = 869917) OR (dokumnr = 869918))"
"              ->  BitmapOr  (cost=18.03..18.03 rows=9 width=0) (actual
time=0.132..0.132 rows=0 loops=1)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.025..0.025 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869906)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869907)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.007..0.007 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869910)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.013..0.013 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869911)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869914)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869915)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.008..0.008 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869916)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.007..0.007 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869917)"
"                    ->  Bitmap Index Scan on dok_dokumnr_idx
(cost=0.00..2.00 rows=1 width=0) (actual time=0.007..0.007 rows=2 loops=1)"
"                          Index Cond: (dokumnr = 869918)"
"Trigger for constraint dok_kalktoode_fkey: time=0.262 calls=9"
"Trigger for constraint dok_krdokumnr_fkey: time=0.069 calls=9"
"Trigger for constraint dok_liikmesrii_fkey: time=0.075 calls=9"
"Trigger for constraint dok_maksetin_fkey: time=0.072 calls=9"
"Trigger for constraint dok_pais5obj_fkey: time=0.077 calls=9"
"Trigger for constraint dok_pais6obj_fkey: time=0.074 calls=9"
"Trigger for constraint dok_pais7obj_fkey: time=0.074 calls=9"
"Trigger for constraint dok_pais8obj_fkey: time=0.075 calls=9"
"Trigger for constraint dok_pais9obj_fkey: time=0.076 calls=9"
"Trigger for constraint dok_saaja_fkey: time=0.072 calls=9"
"Trigger for constraint dok_statprots_fkey: time=0.086 calls=9"
"Trigger for constraint dok_tarneklaus_fkey: time=0.087 calls=9"
"Trigger for constraint dok_tehingulii_fkey: time=0.082 calls=9"
"Trigger for constraint dok_username_fkey: time=0.073 calls=9"
"Trigger for constraint dok_vmnr_fkey: time=0.080 calls=9"
"Trigger for constraint dok_volitaisik_fkey: time=0.085 calls=9"
"Total runtime: 84815.547 ms"

"PostgreSQL 8.1.4 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0,
pie-8.7.9)"