Re: Very slow query (3-4mn) on a table with 25millions rows

From: Abadie Lana
Subject: Re: Very slow query (3-4mn) on a table with 25millions rows
Date:
Msg-id: E544BB9A64ABD24DA201745FD316D94551124F74@XCH2.iter.org
In reply to: Re: Very slow query (3-4mn) on a table with 25millions rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Very slow query (3-4mn) on a table with 25millions rows
List: pgsql-performance
Hi Tom,
Thanks for the hints.

I made various tests with indexes. The best result I could get is with the following one:

create index vat_funcvaratt_multi_idx on functionalvarattributes
    (split_part(split_part(attvalue, ' ', 1), '.', 1), tag_id, atttype_id);
analyze functionalvarattributes;
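
A quick way to check whether the planner will pick the new expression index at all, independent of the join, is a standalone probe on the split_part expression (sketch only; 'SOME_NAME' is a placeholder literal, not a real variable name):

explain analyze
select count(*)
from functionalvarattributes s
where split_part(split_part(s.attvalue, ' ', 1), '.', 1) = 'SOME_NAME';

With the index created and the table re-analyzed, the full query plans as follows: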

explain analyze
select s.attvalue
from functionalvarattributes s, tags t, variableattributetypes vat
where t.id = s.tag_id
  and t.status != 'Internal'
  and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
  and vat.id = s.atttype_id
  and split_part(split_part(s.attvalue, ' ', 1), '.', 1) in
      (select e.name from functionalvariables e, usertemplatevariable ut
       where e.usertemplatevar_id = ut.id and ut.usertempl_id = 15)
except
select s.attvalue
from functionalvarattributes s, tags t, usertemplvarattribute utva,
     usertemplatevariable utv, variableattributetypes vat
where vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
  and vat.id = s.atttype_id
  and utv.id = utva.usertempvariable_fk
  and utv.usertempl_id = 15
  and t.id = s.tag_id
  and t.status != 'Internal'
  and split_part(split_part(s.attvalue, ' ', 1), '.', 1) in
      (select e.name from functionalvariables e, usertemplatevariable ut
       where e.usertemplatevar_id = ut.id and ut.usertempl_id = 15);
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 HashSetOp Except  (cost=171505.51..2361978.74 rows=1116 width=8) (actual time=66476.682..66476.682 rows=0 loops=1)
   ->  Append  (cost=171505.51..2251949.02 rows=44011889 width=8) (actual time=12511.639..66476.544 rows=320 loops=1)
         ->  Subquery Scan on "*SELECT* 1"  (cost=171505.51..907368.77 rows=310121 width=8) (actual time=12511.638..31775.404 rows=2 loops=1)
               ->  Hash Join  (cost=171505.51..904267.56 rows=310121 width=8) (actual time=12511.636..31775.401 rows=2 loops=1)
                     Hash Cond: (split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.227..24083.777 rows=308287 loops=1)
                           Hash Cond: (s.tag_id = t.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.157..23810.490 rows=651155 loops=1)
                                 Hash Cond: (s.atttype_id = vat.id)
                                 ->  Seq Scan on functionalvarattributes s  (cost=0.00..604688.60 rows=25429960 width=24) (actual time=0.002..15719.449 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.116..1.116 rows=388 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8) (actual time=0.005..0.987 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.064..0.064 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.008..0.055 rows=36 loops=1)
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=171250.07..171250.07 rows=4923 width=24) (actual time=7377.344..7377.344 rows=16 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=7377.310..7377.329 rows=16 loops=1)
                                 ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=3.178..7377.271 rows=48 loops=1)
                                       Hash Cond: (e.usertemplatevar_id = ut.id)
                                       ->  Seq Scan on functionalvariables e  (cost=0.00..155513.07 rows=4164607 width=32) (actual time=1.271..5246.277 rows=4164350 loops=1)
                                       ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.026..0.026 rows=16 loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut  (cost=0.29..8.75 rows=16 width=8) (actual time=0.011..0.020 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
         ->  Subquery Scan on "*SELECT* 2"  (cost=172514.13..1344580.25 rows=43701768 width=8) (actual time=11551.477..34701.030 rows=318 loops=1)
               ->  Hash Join  (cost=172514.13..907562.57 rows=43701768 width=8) (actual time=11551.475..34700.876 rows=318 loops=1)
                     Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e_1.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.281..27733.991 rows=308287 loops=1)
                           Hash Cond: (s_1.tag_id = t_1.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.194..27391.475 rows=651155 loops=1)
                                 Hash Cond: (s_1.atttype_id = vat_1.id)
                                 ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604688.60 rows=25429960 width=24) (actual time=0.001..17189.172 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.153..1.153 rows=388 loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat_1  (cost=0.00..183.18 rows=388 width=8) (actual time=0.007..1.015 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.065..0.065 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t_1  (cost=0.00..5.43 rows=36 width=8) (actual time=0.010..0.053 rows=36 loops=1)
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=172318.46..172318.46 rows=141 width=24) (actual time=6553.620..6553.620 rows=2544 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 134kB
                           ->  Nested Loop  (cost=171201.54..172318.46 rows=141 width=24) (actual time=6550.096..6552.789 rows=2544 loops=1)
                                 ->  Nested Loop  (cost=171201.12..172243.46 rows=16 width=32) (actual time=6550.077..6550.305 rows=256 loops=1)
                                       ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=6542.508..6542.535 rows=16 loops=1)
                                             ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=12.705..6542.472 rows=48 loops=1)
                                                   Hash Cond: (e_1.usertemplatevar_id = ut_1.id)
                                                   ->  Seq Scan on functionalvariables e_1  (cost=0.00..155513.07 rows=4164607 width=32) (actual time=7.324..5008.051 rows=4164350 loops=1)
                                                   ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.033..0.033 rows=16 loops=1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut_1  (cost=0.29..8.75 rows=16 width=8) (actual time=0.018..0.026 rows=16 loops=1)
                                                               Index Cond: (usertempl_id = 15)
                                       ->  Materialize  (cost=0.29..8.83 rows=16 width=8) (actual time=0.473..0.478 rows=16 loops=16)
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv  (cost=0.29..8.75 rows=16 width=8) (actual time=0.032..0.041 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
                                 ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.42..4.60 rows=9 width=8) (actual time=0.002..0.004 rows=10 loops=256)
                                       Index Cond: (usertempvariable_fk = utv.id)
                                       Heap Fetches: 0
 Total runtime: 66476.942 ms
(67 rows)
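
Looking at the plan, both arms of the EXCEPT repeat the same join against the 25-million-row functionalvarattributes table, so each arm pays for its own sequential scan. One variant I could also try (untested sketch, same tables and columns as in the query above) is to compute that shared join once in a CTE and run the EXCEPT over the much smaller intermediate result:

with base as (
    -- shared part of both arms: link-type attributes on non-internal tags
    select s.attvalue,
           split_part(split_part(s.attvalue, ' ', 1), '.', 1) as varname
    from functionalvarattributes s
    join tags t on t.id = s.tag_id and t.status != 'Internal'
    join variableattributetypes vat on vat.id = s.atttype_id
    where vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
),
tmpl_vars as (
    -- variable names belonging to user template 15
    select e.name
    from functionalvariables e
    join usertemplatevariable ut on ut.id = e.usertemplatevar_id
    where ut.usertempl_id = 15
)
select b.attvalue from base b
where b.varname in (select name from tmpl_vars)
except
select b.attvalue
from base b, usertemplvarattribute utva, usertemplatevariable utv
where utv.id = utva.usertempvariable_fk
  and utv.usertempl_id = 15
  and b.varname in (select name from tmpl_vars);

Since WITH queries are materialized on this release, functionalvarattributes would only be scanned once; I have not measured whether that actually beats the 66 s above.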

Is this acceptable, or can I get better results?
Thanks
Lana

>>-----Original Message-----
>>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>>Sent: 25 July 2016 20:07
>>To: Abadie Lana
>>Cc: pgsql-performance@postgresql.org
>>Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions
>>rows
>>
>>Abadie Lana <Lana.Abadie@iter.org> writes:
>>> I'm having a problem with a slow query - I tried several things to optimize the
>>queries but didn't really help. The output of explain analyse shows sequential
>>scan on a table of 25 million rows. Even though it is indexed and (I put a multi-
>>column index on the fields used in the query), the explain utility shows no usage
>>of the scan...
>>
>>That index looks pretty useless judging from the rowcounts, so I'm not surprised
>>that the planner didn't use it.  You might have better luck with an index on the
>>split_part expression
>>
>>split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1)
>>
>>since it's the join of that to e.name that seems to be actually selective.
>>(The planner doesn't appear to realize that it is, but ANALYZE'ing after creating
>>the index should fix that.)
>>
>>            regards, tom lane

