Обсуждение: Very slow query (3-4mn) on a table with 25millions rows

Поиск
Список
Период
Сортировка

Very slow query (3-4mn) on a table with 25millions rows

От
Abadie Lana
Дата:

Hi all

I’m having a problem with a slow query – I tried several things to optimize the queries but didn’t really help. The output of explain analyse shows sequential scan on a table of 25 million rows. Even though it is indexed and (I put a multi-column index on the fields used in the query), the explain utility shows no usage of the scan…

Query takes around 200 sec…

Before considering a design change…I wanted to make sure that there is no way to optimize the query….

explain analyze select s.attvalue from functionalvarattributes s, tags t, variableattributetypes vat where t.id=s.tag_id and t.status!='Internal'and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15) except select s.attvalue from functionalvarattributes s, tags t, usertemplvarattribute utva, usertemplatevariable utv, variableattributetypes vat where vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and utv.id=utva.usertempvariable_fk and utv.usertempl_id=15 and t.id=s.tag_id and t.status!='Internal'and split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15);

                                                                                                 QUERY PLAN                           

                                                                     

---------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------

HashSetOp Except  (cost=171505.51..2086914.68 rows=1103 width=8) (actual time=186584.977..186584.977 rows=0 loops=1)

   ->  Append  (cost=171505.51..2031899.30 rows=22006150 width=8) (actual time=36550.214..186584.539 rows=320 loops=1)

         ->  Subquery Scan on "*SELECT* 1"  (cost=171505.51..905822.16 rows=155062 width=8) (actual time=36550.213..87210.878 rows=2 lo

ops=1)

               ->  Hash Join  (cost=171505.51..904271.54 rows=155062 width=8) (actual time=36550.212..87210.874 rows=2 loops=1)

                     Hash Cond: (split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e.name)::text)

                     ->  Hash Join  (cost=193.91..726328.81 rows=310124 width=8) (actual time=42.242..63701.027 rows=308287 loops=1)

                           Hash Cond: (s.tag_id = t.id)

                           ->  Hash Join  (cost=188.03..716954.60 rows=1671226 width=16) (actual time=42.154..63387.723 rows=651155 loo

ps=1)

                                 Hash Cond: (s.atttype_id = vat.id)

                                 ->  Seq Scan on functionalvarattributes s  (cost=0.00..604691.04 rows=25430204 width=24) (actual time=

0.007..53954.210 rows=25429808 loops=1)

                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=42.113..42.113 rows=388 loops=1)

                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB

                                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8) (actual time=0.

003..41.984 rows=388 loops=1)

                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))

                                             Rows Removed by Filter: 5516

                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.064..0.064 rows=36 loops=1)

                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB

                                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.012..0.052 rows=36 loops=1)

                                       Filter: ((status)::text <> 'Internal'::text)

                                       Rows Removed by Filter: 158

                     ->  Hash  (cost=171250.07..171250.07 rows=4923 width=24) (actual time=23162.533..23162.533 rows=16 loops=1)

                           Buckets: 1024  Batches: 1  Memory Usage: 1kB

                           ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=23162.498..23162.518 rows=16

loops=1)

                                 ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=17.642..23162.464 rows=48 loops=

1)

                                       Hash Cond: (e.usertemplatevar_id = ut.id)

                                       ->  Seq Scan on functionalvariables e  (cost=0.00..155513.07 rows=4164607 width=32) (actual time

=0.008..21674.864 rows=4164350 loops=1)

                                       ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.058..0.058 rows=16 loops=1)

                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB

                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut  (cost=0.29..8.75 rows=16

width=8) (actual time=0.043..0.052 rows=16 loops=1)

                                                   Index Cond: (usertempl_id = 15)

         ->  Subquery Scan on "*SELECT* 2"  (cost=172514.13..1126077.14 rows=21851088 width=8) (actual time=43579.873..99373.299 rows=3

18 loops=1)

               ->  Hash Join  (cost=172514.13..907566.26 rows=21851088 width=8) (actual time=43579.870..99372.820 rows=318 loops=1)

                     Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e_1.name)::text)

                     ->  Hash Join  (cost=193.91..726328.81 rows=310124 width=8) (actual time=2.724..71226.183 rows=308287 loops=1)

                           Hash Cond: (s_1.tag_id = t_1.id)

                           ->  Hash Join  (cost=188.03..716954.60 rows=1671226 width=16) (actual time=2.548..70764.941 rows=651155 loop

s=1)

                                 Hash Cond: (s_1.atttype_id = vat_1.id)

                                 ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604691.04 rows=25430204 width=24) (actual tim

e=0.003..57363.539 rows=25429808 loops=1)

                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=2.450..2.450 rows=388 loops=1)

                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB

                                       ->  Seq Scan on variableattributetypes vat_1  (cost=0.00..183.18 rows=388 width=8) (actual time=

0.014..2.153 rows=388 loops=1)

                                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))

                                             Rows Removed by Filter: 5516

                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.131..0.131 rows=36 loops=1)

                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB

                                 ->  Seq Scan on tags t_1  (cost=0.00..5.43 rows=36 width=8) (actual time=0.015..0.100 rows=36 loops=1)

                                       Filter: ((status)::text <> 'Internal'::text)

                                       Rows Removed by Filter: 158

                     ->  Hash  (cost=172318.46..172318.46 rows=141 width=24) (actual time=27594.115..27594.115 rows=2544 loops=1)

                           Buckets: 1024  Batches: 1  Memory Usage: 134kB

                           ->  Nested Loop  (cost=171201.54..172318.46 rows=141 width=24) (actual time=27586.058..27592.012 rows=2544 l

oops=1)

                                 ->  Nested Loop  (cost=171201.12..172243.46 rows=16 width=32) (actual time=27585.957..27586.510 rows=2

56 loops=1)

                                       ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual time=27572.535..27572.

595 rows=16 loops=1)

                                             ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=27.159..27572.439 ro

ws=48 loops=1)

                                                   Hash Cond: (e_1.usertemplatevar_id = ut_1.id)

                                                   ->  Seq Scan on functionalvariables e_1  (cost=0.00..155513.07 rows=4164607 width=32

) (actual time=0.163..23959.820 rows=4164350 loops=1)

                                                   ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.070..0.070 rows=16 loops=

1)

                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB

                                                         ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut_1  (cost=0.29

..8.75 rows=16 width=8) (actual time=0.040..0.057 rows=16 loops=1)

                                                               Index Cond: (usertempl_id = 15)

                                       ->  Materialize  (cost=0.29..8.83 rows=16 width=8) (actual time=0.839..0.851 rows=16 loops=16)

                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv  (cost=0.29..8.75 rows=1

6 width=8) (actual time=0.039..0.080 rows=16 loops=1)

                                                   Index Cond: (usertempl_id = 15)

                                 ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.4

2..4.60 rows=9 width=8) (actual time=0.004..0.011 rows=10 loops=256)

                                       Index Cond: (usertempvariable_fk = utv.id)

                                       Heap Fetches: 0

Total runtime: 186585.376 ms

(67 rows)

 

 

\d functionalvarattributes;

                                          Table "public.functionalvarattributes"

       Column        |            Type             |                              Modifiers                              

---------------------+-----------------------------+----------------------------------------------------------------------

id                  | bigint                      | not null default nextval('functionalvarattributes_id_seq'::regclass)

attvalue            | character varying(4000)     | not null

createdat           | timestamp without time zone |

 description         | character varying(500)      |

 updatedat           | timestamp without time zone |

 autosaved           | boolean                     | not null

atttype_id          | bigint                      |

 codactemplvaratt_fk | bigint                      |

 funcvar_fk          | bigint                      | not null

tag_id              | bigint                      |

 usertemplvaratt_fk  | bigint                      |

 useratttype_id      | bigint                      |

 keyattvalue         | character varying(255)      |

Indexes:

    "functionalvarattributes_pkey" PRIMARY KEY, btree (id)

    "functionalvarattributes_funcvar_fk_tag_id_atttype_id_key" UNIQUE CONSTRAINT, btree (funcvar_fk, tag_id, atttype_id)

    "usertemplvaratt_funcvaratt_idx" btree (usertemplvaratt_fk)

    "vat_funcvaratt_multi_idx" btree (atttype_id, attvalue, tag_id)

Foreign-key constraints:

    "fk6b514a7b1929df33" FOREIGN KEY (useratttype_id) REFERENCES userattributetypes(id)

    "fk6b514a7b19d38f01" FOREIGN KEY (codactemplvaratt_fk) REFERENCES codactemplvarattribute(id)

    "fk6b514a7b2080a717" FOREIGN KEY (atttype_id) REFERENCES variableattributetypes(id)

    "fk6b514a7ba4d2f942" FOREIGN KEY (funcvar_fk) REFERENCES functionalvariables(id)

    "fk6b514a7bc81d711d" FOREIGN KEY (usertemplvaratt_fk) REFERENCES usertemplvarattribute(id)

    "fk6b514a7bcbbfa8b8" FOREIGN KEY (tag_id) REFERENCES tags(id)

 

Version of postgresql is 9.3 on linux RHEL

 

uname -a

Linux 4504DS-SRV-0043.codac.iter.org 2.6.32-431.20.3.el6.x86_64 #1 SMP Fri Jun 6 18:30:54 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Thanks for your help

Lana

 

Re: Very slow query (3-4mn) on a table with 25millions rows

От
Tom Lane
Дата:
Abadie Lana <Lana.Abadie@iter.org> writes:
> I'm having a problem with a slow query - I tried several things to optimize the queries but didn't really help. The
outputof explain analyse shows sequential scan on a table of 25 million rows. Even though it is indexed and (I put a
multi-columnindex on the fields used in the query), the explain utility shows no usage of the scan... 

That index looks pretty useless judging from the rowcounts, so I'm not
surprised that the planner didn't use it.  You might have better luck with
an index on the split_part expression

split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1)

since it's the join of that to e.name that seems to be actually selective.
(The planner doesn't appear to realize that it is, but ANALYZE'ing after
creating the index should fix that.)

            regards, tom lane


Re: Very slow query (3-4mn) on a table with 25millions rows

От
Abadie Lana
Дата:
Hi Tom,
Thanks for the hints..

I made various tests for index
The best I could get is the following one with
create index vat_funcvaratt_multi_idx on functionalvarattributes(split_part(split_part(attvalue,' ',1),'.',1), tag_id,
atttype_id);
analyze functionalvarattributes;

explain analyze select s.attvalue from functionalvarattributes s, tags t, variableattributetypes vat where
t.id=s.tag_idand t.status!='Internal'and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and
vat.id=s.atttype_idand split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e,
usertemplatevariableut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15) except select s.attvalue from
functionalvarattributess, tags t, usertemplvarattribute utva, usertemplatevariable utv, variableattributetypes vat
wherevat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and
utv.id=utva.usertempvariable_fkand utv.usertempl_id=15 and t.id=s.tag_id and t.status!='Internal'and
split_part(split_part(s.attvalue,'',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut
wheree.usertemplatevar_id=ut.id and ut.usertempl_id=15); 
                                                                                                 QUERY PLAN
               


---------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------
 HashSetOp Except  (cost=171505.51..2361978.74 rows=1116 width=8) (actual time=66476.682..66476.682 rows=0 loops=1)
   ->  Append  (cost=171505.51..2251949.02 rows=44011889 width=8) (actual time=12511.639..66476.544 rows=320 loops=1)
         ->  Subquery Scan on "*SELECT* 1"  (cost=171505.51..907368.77 rows=310121 width=8) (actual
time=12511.638..31775.404rows=2 lo 
ops=1)
               ->  Hash Join  (cost=171505.51..904267.56 rows=310121 width=8) (actual time=12511.636..31775.401 rows=2
loops=1)
                     Hash Cond: (split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1) =
(e.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.227..24083.777
rows=308287loops=1) 
                           Hash Cond: (s.tag_id = t.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.157..23810.490
rows=651155loop 
s=1)
                                 Hash Cond: (s.atttype_id = vat.id)
                                 ->  Seq Scan on functionalvarattributes s  (cost=0.00..604688.60 rows=25429960
width=24)(actual time= 
0.002..15719.449 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.116..1.116 rows=388
loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8)
(actualtime=0. 
005..0.987 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY
('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.064..0.064 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.008..0.055
rows=36loops=1) 
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=171250.07..171250.07 rows=4923 width=24) (actual time=7377.344..7377.344 rows=16
loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual
time=7377.310..7377.329rows=16 lo 
ops=1)
                                 ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual time=3.178..7377.271
rows=48loops=1) 
                                       Hash Cond: (e.usertemplatevar_id = ut.id)
                                       ->  Seq Scan on functionalvariables e  (cost=0.00..155513.07 rows=4164607
width=32)(actual time 
=1.271..5246.277 rows=4164350 loops=1)
                                       ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.026..0.026 rows=16
loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut
(cost=0.29..8.75rows=16 
 width=8) (actual time=0.011..0.020 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
         ->  Subquery Scan on "*SELECT* 2"  (cost=172514.13..1344580.25 rows=43701768 width=8) (actual
time=11551.477..34701.030rows=3 
18 loops=1)
               ->  Hash Join  (cost=172514.13..907562.57 rows=43701768 width=8) (actual time=11551.475..34700.876
rows=318loops=1) 
                     Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) =
(e_1.name)::text)
                     ->  Hash Join  (cost=193.91..726325.20 rows=310121 width=8) (actual time=1.281..27733.991
rows=308287loops=1) 
                           Hash Cond: (s_1.tag_id = t_1.id)
                           ->  Hash Join  (cost=188.03..716951.08 rows=1671210 width=16) (actual time=1.194..27391.475
rows=651155loop 
s=1)
                                 Hash Cond: (s_1.atttype_id = vat_1.id)
                                 ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604688.60 rows=25429960
width=24)(actual tim 
e=0.001..17189.172 rows=25429808 loops=1)
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=1.153..1.153 rows=388
loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       ->  Seq Scan on variableattributetypes vat_1  (cost=0.00..183.18 rows=388
width=8)(actual time= 
0.007..1.015 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY
('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.065..0.065 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 ->  Seq Scan on tags t_1  (cost=0.00..5.43 rows=36 width=8) (actual time=0.010..0.053
rows=36loops=1) 
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                     ->  Hash  (cost=172318.46..172318.46 rows=141 width=24) (actual time=6553.620..6553.620 rows=2544
loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 134kB
                           ->  Nested Loop  (cost=171201.54..172318.46 rows=141 width=24) (actual
time=6550.096..6552.789rows=2544 loo 
ps=1)
                                 ->  Nested Loop  (cost=171201.12..172243.46 rows=16 width=32) (actual
time=6550.077..6550.305rows=256 
 loops=1)
                                       ->  HashAggregate  (cost=171200.84..171250.07 rows=4923 width=24) (actual
time=6542.508..6542.53
5 rows=16 loops=1)
                                             ->  Hash Join  (cost=8.95..171188.53 rows=4923 width=24) (actual
time=12.705..6542.472row 
s=48 loops=1)
                                                   Hash Cond: (e_1.usertemplatevar_id = ut_1.id)
                                                   ->  Seq Scan on functionalvariables e_1  (cost=0.00..155513.07
rows=4164607width=32 
) (actual time=7.324..5008.051 rows=4164350 loops=1)
                                                   ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual
time=0.033..0.033rows=16 loops= 
1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         ->  Index Scan using usertemp_utv_idx on usertemplatevariable
ut_1 (cost=0.29 
..8.75 rows=16 width=8) (actual time=0.018..0.026 rows=16 loops=1)
                                                               Index Cond: (usertempl_id = 15)
                                       ->  Materialize  (cost=0.29..8.83 rows=16 width=8) (actual time=0.473..0.478
rows=16loops=16) 
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv
(cost=0.29..8.75rows=1 
6 width=8) (actual time=0.032..0.041 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
                                 ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute
utva (cost=0.4 
2..4.60 rows=9 width=8) (actual time=0.002..0.004 rows=10 loops=256)
                                       Index Cond: (usertempvariable_fk = utv.id)
                                       Heap Fetches: 0
 Total runtime: 66476.942 ms
(67 rows)

Is this acceptable or can I get better results?
Thanks
Lana

>>-----Original Message-----
>>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>>Sent: 25 July 2016 20:07
>>To: Abadie Lana
>>Cc: pgsql-performance@postgresql.org
>>Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions
>>rows
>>
>>Abadie Lana <Lana.Abadie@iter.org> writes:
>>> I'm having a problem with a slow query - I tried several things to optimize the
>>queries but didn't really help. The output of explain analyse shows sequential
>>scan on a table of 25 million rows. Even though it is indexed and (I put a multi-
>>column index on the fields used in the query), the explain utility shows no usage
>>of the scan...
>>
>>That index looks pretty useless judging from the rowcounts, so I'm not surprised
>>that the planner didn't use it.  You might have better luck with an index on the
>>split_part expression
>>
>>split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1)
>>
>>since it's the join of that to e.name that seems to be actually selective.
>>(The planner doesn't appear to realize that it is, but ANALYZE'ing after creating
>>the index should fix that.)
>>
>>            regards, tom lane


Re: Very slow query (3-4mn) on a table with 25millions rows

От
Martín Marqués
Дата:
El 26/07/16 a las 06:01, Abadie Lana escribió:
> Hi Tom,
> Thanks for the hints..
>
> I made various tests for index
> The best I could get is the following one with
> create index vat_funcvaratt_multi_idx on functionalvarattributes(split_part(split_part(attvalue,' ',1),'.',1),
tag_id,atttype_id); 
> analyze functionalvarattributes;

I suggest running analyze over the other tables involved in the query
(or over the whole DB) and then sending back the explain analyze, or
even better EXPLAIN (ANALYZE,BUFFERS).

Some estimates are close and others are really wrong.

I'm not saying that's going to give you a big bust but we'll be able to
see the planner with fresh stats

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: Very slow query (3-4mn) on a table with 25millions rows

От
Abadie Lana
Дата:
Dear Martin
I run an analyse on the whole database + explicit analyse on tables involved in the query.
Here the result of explain (analyse, buffer). Thanks for your help and let me know if you need more information.

explain (analyze, buffers) select s.attvalue from functionalvarattributes s, tags t, variableattributetypes vat where
t.id=s.tag_idand t.status!='Internal'and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and
vat.id=s.atttype_idand split_part(split_part(s.attvalue,' ',1),'.',1) in (select e.name from functionalvariables e,
usertemplatevariableut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15) except select s.attvalue from
functionalvarattributess, tags t, usertemplvarattribute utva, usertemplatevariable utv, variableattributetypes vat
wherevat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK') and vat.id=s.atttype_id and
utv.id=utva.usertempvariable_fkand utv.usertempl_id=15 and t.id=s.tag_id and t.status!='Internal'and
split_part(split_part(s.attvalue,'',1),'.',1) in (select e.name from functionalvariables e, usertemplatevariable ut
wheree.usertemplatevar_id=ut.id and ut.usertempl_id=15);
 
                                                                                                 QUERY PLAN
              
 
                                                                     

---------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------
 HashSetOp Except  (cost=171506.51..2361929.77 rows=1102 width=8) (actual time=75622.307..75622.307 rows=0 loops=1)
   Buffers: shared hit=4423 read=925096
   ->  Append  (cost=171506.51..2251904.08 rows=44010276 width=8) (actual time=13510.950..75622.159 rows=320 loops=1)
         Buffers: shared hit=4423 read=925096
         ->  Subquery Scan on "*SELECT* 1"  (cost=171506.51..907352.41 rows=310110 width=8) (actual
time=13510.950..41131.939rows=2 lo
 
ops=1)
               Buffers: shared hit=1785 read=462580
               ->  Hash Join  (cost=171506.51..904251.31 rows=310110 width=8) (actual time=13510.947..41131.932 rows=2
loops=1)
                     Hash Cond: (split_part(split_part((s.attvalue)::text, ' '::text, 1), '.'::text, 1) =
(e.name)::text)
                     Buffers: shared hit=1785 read=462580
                     ->  Hash Join  (cost=193.91..726311.49 rows=310110 width=8) (actual time=1.016..33826.718
rows=308287loops=1)
 
                           Hash Cond: (s.tag_id = t.id)
                           Buffers: shared hit=1070 read=349424
                           ->  Hash Join  (cost=188.03..716937.71 rows=1671149 width=16) (actual time=0.941..33398.776
rows=651155loop
 
s=1)
                                 Hash Cond: (s.atttype_id = vat.id)
                                 Buffers: shared hit=1067 read=349424
                                 ->  Seq Scan on functionalvarattributes s  (cost=0.00..604679.32 rows=25429032
width=24)(actual time=
 
0.002..20099.045 rows=25429808 loops=1)
                                       Buffers: shared hit=965 read=349424
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=0.900..0.900 rows=388
loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       Buffers: shared hit=102
                                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8)
(actualtime=0.
 
005..0.803 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY
('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                                             Buffers: shared hit=102
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.070..0.070 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 Buffers: shared hit=3
                                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.007..0.057
rows=36loops=1)
 
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                                       Buffers: shared hit=3
                     ->  Hash  (cost=171251.03..171251.03 rows=4926 width=24) (actual time=6801.452..6801.452 rows=16
loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           Buffers: shared hit=715 read=113156
                           ->  HashAggregate  (cost=171201.77..171251.03 rows=4926 width=24) (actual
time=6801.417..6801.435rows=16 lo
 
ops=1)
                                 Buffers: shared hit=715 read=113156
                                 ->  Hash Join  (cost=8.95..171189.45 rows=4926 width=24) (actual time=12.812..6801.387
rows=48loops=1
 
)
                                       Hash Cond: (e.usertemplatevar_id = ut.id)
                                       Buffers: shared hit=715 read=113156
                                       ->  Seq Scan on functionalvariables e  (cost=0.00..155513.72 rows=4164672
width=32)(actual time
 
=5.244..4924.135 rows=4164350 loops=1)
                                             Buffers: shared hit=711 read=113156
                                       ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.030..0.030 rows=16
loops=1)
                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                             Buffers: shared hit=4
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut
(cost=0.29..8.75rows=16
 
 width=8) (actual time=0.012..0.023 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
                                                   Buffers: shared hit=4
         ->  Subquery Scan on "*SELECT* 2"  (cost=172515.69..1344551.67 rows=43700166 width=8) (actual
time=12639.042..34490.098rows=3
 
18 loops=1)
               Buffers: shared hit=2638 read=462516
               ->  Hash Join  (cost=172515.69..907550.01 rows=43700166 width=8) (actual time=12639.040..34489.953
rows=318loops=1)
 
                     Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) =
(e_1.name)::text)
                     Buffers: shared hit=2638 read=462516
                     ->  Hash Join  (cost=193.91..726311.49 rows=310110 width=8) (actual time=2.354..26734.043
rows=308287loops=1)
 
                           Hash Cond: (s_1.tag_id = t_1.id)
                           Buffers: shared hit=1102 read=349392
                           ->  Hash Join  (cost=188.03..716937.71 rows=1671149 width=16) (actual time=2.176..26421.280
rows=651155loop
 
s=1)
                                 Hash Cond: (s_1.atttype_id = vat_1.id)
                                 Buffers: shared hit=1099 read=349392
                                 ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604679.32 rows=25429032
width=24)(actual tim
 
e=0.003..16949.841 rows=25429808 loops=1)
                                       Buffers: shared hit=997 read=349392
                                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=2.092..2.092 rows=388
loops=1)
                                       Buckets: 1024  Batches: 1  Memory Usage: 16kB
                                       Buffers: shared hit=102
                                       ->  Seq Scan on variableattributetypes vat_1  (cost=0.00..183.18 rows=388
width=8)(actual time=
 
0.014..1.852 rows=388 loops=1)
                                             Filter: ((fieldtype)::text = ANY
('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))
                                             Rows Removed by Filter: 5516
                                             Buffers: shared hit=102
                           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.138..0.138 rows=36 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 2kB
                                 Buffers: shared hit=3
                                 ->  Seq Scan on tags t_1  (cost=0.00..5.43 rows=36 width=8) (actual time=0.016..0.088
rows=36loops=1)
 
                                       Filter: ((status)::text <> 'Internal'::text)
                                       Rows Removed by Filter: 158
                                       Buffers: shared hit=3
                     ->  Hash  (cost=172320.02..172320.02 rows=141 width=24) (actual time=7386.827..7386.827 rows=2544
loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 134kB
                           Buffers: shared hit=1536 read=113124
                           ->  Nested Loop  (cost=171202.47..172320.02 rows=141 width=24) (actual
time=7378.869..7384.698rows=2544 loo
 
ps=1)
                                 Buffers: shared hit=1536 read=113124
                                 ->  Nested Loop  (cost=171202.05..172245.02 rows=16 width=32) (actual
time=7378.835..7379.342rows=256
 
 loops=1)
                                       Buffers: shared hit=751 read=113124
                                       ->  HashAggregate  (cost=171201.77..171251.03 rows=4926 width=24) (actual
time=7368.551..7368.62
0 rows=16 loops=1)
                                             Buffers: shared hit=747 read=113124
                                             ->  Hash Join  (cost=8.95..171189.45 rows=4926 width=24) (actual
time=13.272..7368.471row
 
s=48 loops=1)
                                                   Hash Cond: (e_1.usertemplatevar_id = ut_1.id)
                                                   Buffers: shared hit=747 read=113124
                                                   ->  Seq Scan on functionalvariables e_1  (cost=0.00..155513.72
rows=4164672width=32
 
) (actual time=9.412..5383.223 rows=4164350 loops=1)
                                                         Buffers: shared hit=743 read=113124
                                                   ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual
time=0.061..0.061rows=16 loops=
 
1)
                                                         Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                         Buffers: shared hit=4
                                                         ->  Index Scan using usertemp_utv_idx on usertemplatevariable
ut_1 (cost=0.29
 
..8.75 rows=16 width=8) (actual time=0.032..0.052 rows=16 loops=1)
                                                               Index Cond: (usertempl_id = 15)
                                                               Buffers: shared hit=4
                                       ->  Materialize  (cost=0.29..8.83 rows=16 width=8) (actual time=0.643..0.654
rows=16loops=16)
 
                                             Buffers: shared hit=4
                                             ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv
(cost=0.29..8.75rows=1
 
6 width=8) (actual time=0.052..0.075 rows=16 loops=1)
                                                   Index Cond: (usertempl_id = 15)
                                                   Buffers: shared hit=4
                                 ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute
utva (cost=0.4
 
2..4.60 rows=9 width=8) (actual time=0.004..0.010 rows=10 loops=256)
                                       Index Cond: (usertempvariable_fk = utv.id)
                                       Heap Fetches: 0
                                       Buffers: shared hit=785
 Total runtime: 75622.559 ms
(104 rows)


Lana ABADIE
Database Engineer
CODAC Section

ITER Organization, Building 72/4108, SCOD, Control System Division
Route de Vinon-sur-Verdon - CS 90 046 - 13067 St Paul Lez Durance Cedex - France
Phone: +33 4 42 17 84 02
Get the latest ITER news on http://www.iter.org/whatsnew
>>-----Original Message-----
>>From: Martín Marqués [mailto:martin@2ndquadrant.com]
>>Sent: 26 July 2016 12:34
>>To: Abadie Lana; Tom Lane
>>Cc: pgsql-performance@postgresql.org
>>Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions
>>rows
>>
>>El 26/07/16 a las 06:01, Abadie Lana escribió:
>>> Hi Tom,
>>> Thanks for the hints..
>>>
>>> I made various tests for index
>>> The best I could get is the following one with
>>> create index vat_funcvaratt_multi_idx on
>>functionalvarattributes(split_part(split_part(attvalue,' ',1),'.',1), tag_id,
>>atttype_id);
>>> analyze functionalvarattributes;
>>
>>I suggest running analyze over the other tables involved in the query
>>(or over the whole DB) and then sending back the explain analyze, or
>>even better EXPLAIN (ANALYZE,BUFFERS).
>>
>>Some estimates are close and others are really wrong.
>>
>>I'm not saying that's going to give you a big bust but we'll be able to
>>see the planner with fresh stats
>>
>>--
>>Martín Marqués                http://www.2ndQuadrant.com/
>>PostgreSQL Development, 24x7 Support, Training & Services

Re: Very slow query (3-4mn) on a table with 25millions rows

От
Félix GERZAGUET
Дата:
Hello Lana,

On Wed, Jul 27, 2016 at 8:03 AM, Abadie Lana <Lana.Abadie@iter.org> wrote:
Here the result of explain (analyse, buffer). Thanks for your help and let me know if you need more information.

I noticed 3 things in your query:

1. In the second part (after the except), the 2 tables utva and utv are not joined against the others table. Is there a missing join somewhere ?

Let that snipset:

select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )

be called A

Let that snipset:

select *
  from usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15

be called B

Then you query is:

A
except
A CROSS JOIN B

If B is not the empty set, than the above query is guaranteed to always have 0 row.

2. Assuming your query is right (even if I failed to understand its point), we could only do the A snipset once instead of twice using a with clause as in:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15
;

This rewritten query should run about 2x. faster.

3. The planner believe that the e.name subselect will give 4926 rows (instead of 16 in reality), due to this wrong estimate it will consider the vat_funcvaratt_multi_idx index as not usefull. I don't know how to give the planner more accurate info ...

--
Félix

Re: Very slow query (3-4mn) on a table with 25millions rows

От
Félix GERZAGUET
Дата:

On Wed, Jul 27, 2016 at 11:15 AM, Félix GERZAGUET <felix.gerzaguet@gmail.com> wrote:
 I don't know how to give the planner more accurate info ...
 
Could you try to materialize the e.name subquery in another table. As in

create table func_var_name_for_tpl_15 as
select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
;

Then analyse that table
Then try the rewritten query:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from
func_var_name_for_tpl_15 e
                                                                )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15
;

Does it use the vat_funcvaratt_multi_idx index now ?

--
Félix

Re: Very slow query (3-4mn) on a table with 25millions rows

От
Abadie Lana
Дата:

Hello Felix

Thanks indeed the new query is much faster…The query itself is complicated to explain basically you can view it as graph and want to make sure that there is no dependencies if I remove a set of points….

 

explain analyze with filtered_s as ( select s.attvalue  from functionalvarattributes s, tags t, variableattributetypes vat where t.id=s.tag_id and t.status!='Internal' and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')  and vat.id=s.atttype_id and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15) ) select s.attvalue from filtered_s s except select s.attvalue from filtered_s s , usertemplvarattribute utva, usertemplatevariable utv where utv.id=utva.usertempvariable_fk and  utv.usertempl_id=15;

                                                                                                QUERY PLAN                            

                                                                    

---------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------

HashSetOp Except  (cost=904251.31..2013436.93 rows=200 width=516) (actual time=40007.482..40007.482 rows=0 loops=1)

   CTE filtered_s

     ->  Hash Join  (cost=171506.51..904251.31 rows=310110 width=8) (actual time=13986.554..40005.687 rows=2 loops=1)

           Hash Cond: (split_part(split_part((s_2.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e.name)::text)

           ->  Hash Join  (cost=193.91..726311.49 rows=310110 width=8) (actual time=2.675..30633.916 rows=308287 loops=1)

                 Hash Cond: (s_2.tag_id = t.id)

                 ->  Hash Join  (cost=188.03..716937.71 rows=1671149 width=16) (actual time=2.518..30249.987 rows=651155 loops=1)

                       Hash Cond: (s_2.atttype_id = vat.id)

                       ->  Seq Scan on functionalvarattributes s_2  (cost=0.00..604679.32 rows=25429032 width=24) (actual time=0.005..1

9229.473 rows=25429808 loops=1)

                       ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=2.433..2.433 rows=388 loops=1)

                             Buckets: 1024  Batches: 1  Memory Usage: 16kB

                             ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8) (actual time=0.010..2.171

rows=388 loops=1)

                                   Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))

                                   Rows Removed by Filter: 5516

                 ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.147..0.147 rows=36 loops=1)

                       Buckets: 1024  Batches: 1  Memory Usage: 2kB

                       ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.015..0.119 rows=36 loops=1)

                             Filter: ((status)::text <> 'Internal'::text)

                             Rows Removed by Filter: 158

           ->  Hash  (cost=171251.03..171251.03 rows=4926 width=24) (actual time=8939.073..8939.073 rows=16 loops=1)

                 Buckets: 1024  Batches: 1  Memory Usage: 1kB

                 ->  HashAggregate  (cost=171201.77..171251.03 rows=4926 width=24) (actual time=8939.039..8939.058 rows=16 loops=1)

                       ->  Hash Join  (cost=8.95..171189.45 rows=4926 width=24) (actual time=3188.453..8938.943 rows=48 loops=1)

                             Hash Cond: (e.usertemplatevar_id = ut.id)

                             ->  Seq Scan on functionalvariables e  (cost=0.00..155513.72 rows=4164672 width=32) (actual time=0.004..65

54.351 rows=4164350 loops=1)

                             ->  Hash  (cost=8.75..8.75 rows=16 width=8) (actual time=0.042..0.042 rows=16 loops=1)

                                   Buckets: 1024  Batches: 1  Memory Usage: 1kB

                                   ->  Index Scan using usertemp_utv_idx on usertemplatevariable ut  (cost=0.29..8.75 rows=16 width=8)

(actual time=0.015..0.029 rows=16 loops=1)

                                         Index Cond: (usertempl_id = 15)

   ->  Append  (cost=0.00..999159.97 rows=44010259 width=516) (actual time=13986.564..40007.199 rows=320 loops=1)

         ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..9303.30 rows=310110 width=516) (actual time=13986.563..40005.703 rows=2 loops=1

)

               ->  CTE Scan on filtered_s s  (cost=0.00..6202.20 rows=310110 width=516) (actual time=13986.561..40005.699 rows=2 loops=

1)

         ->  Subquery Scan on "*SELECT* 2"  (cost=0.70..989856.67 rows=43700149 width=516) (actual time=0.071..1.242 rows=318 loops=1)

               ->  Nested Loop  (cost=0.70..552855.18 rows=43700149 width=516) (actual time=0.069..0.941 rows=318 loops=1)

                     ->  CTE Scan on filtered_s s_1  (cost=0.00..6202.20 rows=310110 width=516) (actual time=0.003..0.005 rows=2 loops=

1)

                     ->  Materialize  (cost=0.70..84.46 rows=141 width=0) (actual time=0.032..0.331 rows=159 loops=2)

                           ->  Nested Loop  (cost=0.70..83.75 rows=141 width=0) (actual time=0.053..0.426 rows=159 loops=1)

                                 ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv  (cost=0.29..8.75 rows=16 width=8) (

actual time=0.030..0.052 rows=16 loops=1)

                                       Index Cond: (usertempl_id = 15)

                                ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.4

2..4.60 rows=9 width=8) (actual time=0.005..0.011 rows=10 loops=16)

                                       Index Cond: (usertempvariable_fk = utv.id)

                                       Heap Fetches: 0

Total runtime: 40007.716 ms

 

 

Lana

From: Félix GERZAGUET [mailto:felix.gerzaguet@gmail.com]
Sent: 27 July 2016 11:16
To: Abadie Lana
Cc: Martín Marqués; Tom Lane; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions rows

 

Hello Lana,

 

On Wed, Jul 27, 2016 at 8:03 AM, Abadie Lana <Lana.Abadie@iter.org> wrote:

Here the result of explain (analyse, buffer). Thanks for your help and let me know if you need more information.

 

I noticed 3 things in your query:

1. In the second part (after the except), the 2 tables utva and utv are not joined against the others table. Is there a missing join somewhere ?


Let that snipset:

select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )

be called A

Let that snipset:

select *
  from usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15

be called B

Then you query is:

A

except

A CROSS JOIN B

If B is not the empty set, than the above query is guaranteed to always have 0 row.

 

2. Assuming your query is right (even if I failed to understand its point), we could only do the A snipset once instead of twice using a with clause as in:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
                                                           )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15
;

This rewritten query should run about 2x. faster.

3. The planner believe that the e.name subselect will give 4926 rows (instead of 16 in reality), due to this wrong estimate it will consider the vat_funcvaratt_multi_idx index as not usefull. I don't know how to give the planner more accurate info ...


--

Félix

 

Re: Very slow query (3-4mn) on a table with 25millions rows

От
Abadie Lana
Дата:

Sorry for the delay

Still no use of the index

create table func_var_name_for_tpl_15 as select e.name from functionalvariables e, usertemplatevariable ut where e.usertemplatevar_id=ut.id and ut.usertempl_id=15;

SELECT 48

=# analyze func_var_name_for_tpl_15;

ANALYZE

=# explain analyze with filtered_s as ( select s.attvalue from functionalvarattributes s , tags t, variableattributetypes vat where t.id=s.tag_id and t.status!='Internal' and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')    and vat.id=s.atttype_id and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name from func_var_name_for_tpl_15 e)) select s.attvalue from filtered_s s, usertemplvarattribute utva, usertemplatevariable utv where utv.id=utva.usertempvariable_fk and utv.usertempl_id=15;

                                                                                       QUERY PLAN                                     

                                                  

---------------------------------------------------------------------------------------------------------------------------------------

--------------------------------------------------

Nested Loop  (cost=689051.63..698514.55 rows=741512 width=516) (actual time=11043.744..47958.871 rows=318 loops=1)

   CTE filtered_s

     ->  Hash Join  (cost=195.99..689050.93 rows=5262 width=8) (actual time=11043.680..47957.962 rows=2 loops=1)

           Hash Cond: (s_1.tag_id = t.id)

           ->  Hash Join  (cost=190.11..688886.10 rows=28355 width=16) (actual time=11043.499..47957.774 rows=6 loops=1)

                 Hash Cond: (s_1.atttype_id = vat.id)

                 ->  Hash Semi Join  (cost=2.08..686796.55 rows=431458 width=24) (actual time=11040.920..47955.181 rows=6 loops=1)

                       Hash Cond: (split_part(split_part((s_1.attvalue)::text, ' '::text, 1), '.'::text, 1) = (e.name)::text)

                       ->  Seq Scan on functionalvarattributes s_1  (cost=0.00..604679.32 rows=25429032 width=24) (actual time=0.006..2

2378.636 rows=25429808 loops=1)

                       ->  Hash  (cost=1.48..1.48 rows=48 width=21) (actual time=0.063..0.063 rows=48 loops=1)

                             Buckets: 1024  Batches: 1  Memory Usage: 3kB

                             ->  Seq Scan on func_var_name_for_tpl_15 e  (cost=0.00..1.48 rows=48 width=21) (actual time=0.006..0.032 r

ows=48 loops=1)

                 ->  Hash  (cost=183.18..183.18 rows=388 width=8) (actual time=2.480..2.480 rows=388 loops=1)

                       Buckets: 1024  Batches: 1  Memory Usage: 16kB

                       ->  Seq Scan on variableattributetypes vat  (cost=0.00..183.18 rows=388 width=8) (actual time=0.021..2.220 rows=

388 loops=1)

                             Filter: ((fieldtype)::text = ANY ('{DBF_INLINK,DBF_OUTLINK,DBF_FWDLINK}'::text[]))

                             Rows Removed by Filter: 5516

           ->  Hash  (cost=5.43..5.43 rows=36 width=8) (actual time=0.166..0.166 rows=36 loops=1)

                 Buckets: 1024  Batches: 1  Memory Usage: 2kB

                 ->  Seq Scan on tags t  (cost=0.00..5.43 rows=36 width=8) (actual time=0.015..0.137 rows=36 loops=1)

                       Filter: ((status)::text <> 'Internal'::text)

                       Rows Removed by Filter: 158

   ->  CTE Scan on filtered_s s  (cost=0.00..105.24 rows=5262 width=516) (actual time=11043.686..47957.977 rows=2 loops=1)

   ->  Materialize  (cost=0.70..84.46 rows=141 width=0) (actual time=0.027..0.307 rows=159 loops=2)

         ->  Nested Loop  (cost=0.70..83.75 rows=141 width=0) (actual time=0.049..0.394 rows=159 loops=1)

               ->  Index Scan using usertemp_utv_idx on usertemplatevariable utv  (cost=0.29..8.75 rows=16 width=8) (actual time=0.025.

.0.040 rows=16 loops=1)

                     Index Cond: (usertempl_id = 15)

               ->  Index Only Scan using usertemplvarattribute_atttypeid_key on usertemplvarattribute utva  (cost=0.42..4.60 rows=9 wid

th=8) (actual time=0.005..0.013 rows=10 loops=16)

                     Index Cond: (usertempvariable_fk = utv.id)

                     Heap Fetches: 0

Total runtime: 47959.180 ms

(31 rows)

 

sddcryo=#

 

iterlogo 
Lana ABADIE
Database Engineer
CODAC Section

 
ITER Organization, Building 72/4108, SCOD, Control System Division
Route de Vinon-sur-Verdon - CS 90 046 - 13067 St Paul Lez Durance Cedex – France
Phone: +33 4 42 17 84 02
Get the latest ITER news on http://www.iter.org/whatsnew

From: Félix GERZAGUET [mailto:felix.gerzaguet@gmail.com]
Sent: 27 July 2016 11:37
To: Abadie Lana
Cc: Martín Marqués; Tom Lane; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very slow query (3-4mn) on a table with 25millions rows

 

 

On Wed, Jul 27, 2016 at 11:15 AM, Félix GERZAGUET <felix.gerzaguet@gmail.com> wrote:

 I don't know how to give the planner more accurate info ...

 

Could you try to materialize the e.name subquery in another table. As in

create table func_var_name_for_tpl_15 as
select e.name
                                                             from functionalvariables e
                                                                , usertemplatevariable ut
                                                            where e.usertemplatevar_id=ut.id
                                                              and ut.usertempl_id=15
;

Then analyse that table
Then try the rewritten query:

with filtered_s as (
select s.attvalue
  from functionalvarattributes s
     , tags t
     , variableattributetypes vat
 where t.id=s.tag_id
   and t.status!='Internal'
   and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
   and vat.id=s.atttype_id
   and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
                                                             from func_var_name_for_tpl_15 e
                                                                )
)
select s.attvalue
  from filtered_s s
except
select s.attvalue
  from filtered_s s
     , usertemplvarattribute utva
     , usertemplatevariable utv
  where utv.id=utva.usertempvariable_fk
    and utv.usertempl_id=15
;

 

Does it use the vat_funcvaratt_multi_idx index now ?


--

Félix

 

Вложения