Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE
От | ShengQiang Shu |
---|---|
Тема | Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE |
Дата | |
Msg-id | 3E4D59C5.304@fruitfly.org обсуждение исходный текст |
Ответ на | performace problem after VACUUM ANALYZE (Scott Cain <cain@cshl.org>) |
Список | pgsql-performance |
PG really does not do the right thing as I mentioned earlier for joining tables. To force not to use seqscan, it still does not use right index (srcfeature_id, nbeg, nend) and performance is even worse. c_gadfly3=# \d fl_src_b_e_key; Index "public.fl_src_b_e_key" Column | Type ---------------+--------- srcfeature_id | integer nbeg | integer nend | integer btree, for table "public.featureloc" c_gadfly3=# explain analyze select * from featureloc fl, feature f where f.feature_id = fl.feature_id and srcfeature_id=1 and (nbeg >= 1000 and nend <= 2000); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=115700.27..232516.85 rows=69007 width=445) (actual time=12342.97..12461.23 rows=32 loops=1) Merge Cond: ("outer".feature_id = "inner".feature_id) -> Index Scan using feature_pkey on feature f (cost=0.00..110535.53 rows=2060653 width=361) (actual time=17.85..490.86 rows=28341 loops=1) -> Sort (cost=115700.27..115872.79 rows=69007 width=84) (actual time=11944.23..11944.25 rows=32 loops=1) Sort Key: fl.feature_id -> Seq Scan on featureloc fl (cost=0.00..107580.43 rows=69007 width=84) (actual time=375.85..11944.10 rows=32 loops=1) Filter: ((srcfeature_id = 1) AND (nbeg >= 1000) AND (nend <= 2000)) Total runtime: 12461.37 msec (8 rows) c_gadfly3=# c_gadfly3=# set enable_seqscan=0; SET c_gadfly3=# explain analyze select * from featureloc fl, feature f where f.feature_id = fl.feature_id and srcfeature_id=1 and (nbeg >= 1000 and nend <= 2000); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=0.00..236345.49 rows=69007 width=445) (actual time=721.75..26078.64 rows=32 loops=1) Merge Cond: ("outer".feature_id = "inner".feature_id) -> Index Scan using fl_feature_id_key on featureloc fl (cost=0.00..119701.43 rows=69007 width=84) (actual time=549.14..25854.12 rows=32 loops=1) Filter: ((srcfeature_id = 1) AND (nbeg >= 1000) AND (nend <= 2000)) -> Index Scan using feature_pkey on feature f (cost=0.00..110535.53 rows=2060653 width=361) (actual time=50.95..200.37 rows=28342 loops=1) Total runtime: 26078.80 msec (6 rows) Scott Cain wrote: > An update: I ran alter table as suggested, ie, > > alter table featureloc alter srcfeature_id set statistics 100; > > on each column in the table, running vacuum analyze and explain analyze > on the query in between each alter to see if it made any difference. It > did not. Postgres still instists on doing a seq scan on featureloc: > > Unique (cost=336831.46..337179.45 rows=2320 width=47) (actual > time=27219.62..27220.30 rows=179 loops=1) > -> Sort (cost=336831.46..336831.46 rows=23200 width=47) (actual > time=27219.61..27219.80 rows=186 loops=1) > -> Nested Loop (cost=0.00..334732.77 rows=23200 width=47) > (actual time=1003.04..27217.99 rows=186 loops=1) > -> Seq Scan on featureloc fl (cost=0.00..261709.31 > rows=23200 width=14) (actual time=814.68..26094.18 rows=186 loops=1) > -> Index Scan using feature_pkey on feature f > (cost=0.00..3.14 rows=1 width=33) (actual time=6.03..6.03 rows=1 > loops=186) > Total runtime: 27220.63 msec > > > On Fri, 2003-02-14 at 12:29, Scott Cain wrote: > > >Tom, > > > >Sorry about that: I'll try to briefly give the information you are > >looking for. I've read the docs on ALTER TABLE, but it is not clear to > >me what columns I should change STATISTICS on, or should I just do it on > >all of the columns for which indexes exist? > > > >Here's the query again: > > > >select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id > > from feature f, featureloc fl > > where > > fl.srcfeature_id = 1 and > > ((fl.strand=1 and fl.nbeg <= 393164 and fl.nend >= 390956) OR > > (fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and > > f.feature_id = fl.feature_id > > > >-------------------------------------------------------------------------- > > > >Naive database: > > > >Unique (cost=75513.46..75513.48 rows=1 width=167) (actual > >time=22815.25..22815.93 rows=179 loops=1) > > -> Sort (cost=75513.46..75513.46 rows=1 width=167) (actual > >time=22815.24..22815.43 rows=186 loops=1) > > -> Nested Loop (cost=0.00..75513.45 rows=1 width=167) (actual > >time=2471.25..22814.01 rows=186 loops=1) > > -> Index Scan using featureloc_idx2 on featureloc fl > >(cost=0.00..75508.43 rows=1 width=14) (actual time=2463.83..22796.50 > >rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..5.01 rows=1 width=153) (actual time=0.08..0.08 rows=1 > >loops=186) > >Total runtime: 22816.63 msec > >-------------------------------------------------------------------------- > > > >Naive database after featureloc_idx2 dropped: > > > >Unique (cost=75545.46..75545.48 rows=1 width=167) (actual > >time=5232.36..5234.51 rows=179 loops=1) > > -> Sort (cost=75545.46..75545.46 rows=1 width=167) (actual > >time=5232.35..5232.54 rows=186 loops=1) > > -> Nested Loop (cost=0.00..75545.45 rows=1 width=167) (actual > >time=291.46..5220.69 rows=186 loops=1) > > -> Index Scan using featureloc_src_strand_beg_end on > >featureloc fl (cost=0.00..75540.43 rows=1 width=14) (actual > >time=291.30..5214.46 rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..5.01 rows=1 width=153) (actual time=0.02..0.03 rows=1 > >loops=186) > >Total runtime: 5234.89 msec > >-------------------------------------------------------------------------- > > > >Database after VACUUM ANALYZE was run: > > > >Unique (cost=344377.70..344759.85 rows=2548 width=47) (actual > >time=26466.82..26467.51 rows=179 loops=1) > > -> Sort (cost=344377.70..344377.70 rows=25477 width=47) (actual > >time=26466.82..26467.01 rows=186 loops=1) > > -> Nested Loop (cost=0.00..342053.97 rows=25477 width=47) > >(actual time=262.66..26465.63 rows=186 loops=1) > > -> Seq Scan on featureloc fl (cost=0.00..261709.31 > >rows=25477 width=14) (actual time=118.62..26006.05 rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..3.14 rows=1 width=33) (actual time=2.45..2.46 rows=1 > >loops=186) > >Total runtime: 26467.85 msec > >-------------------------------------------------------------------------- > > > >After disallowing seqscans (set enable_seqscan=0): > > > >Unique (cost=356513.46..356895.61 rows=2548 width=47) (actual > >time=27494.62..27495.34 rows=179 loops=1) > > -> Sort (cost=356513.46..356513.46 rows=25477 width=47) (actual > >time=27494.61..27494.83 rows=186 loops=1) > > -> Nested Loop (cost=0.00..354189.73 rows=25477 width=47) > >(actual time=198.88..27493.48 rows=186 loops=1) > > -> Index Scan using featureloc_idx1 on featureloc fl > >(cost=0.00..273845.08 rows=25477 width=14) (actual time=129.30..27280.95 > >rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..3.14 rows=1 width=33) (actual time=1.13..1.13 rows=1 > >loops=186) > >Total runtime: 27495.66 msec > >-------------------------------------------------------------------------- > > > >After dropping featureloc_idx1: > > > >Unique (cost=1310195.21..1310577.36 rows=2548 width=47) (actual > >time=21692.69..21693.37 rows=179 loops=1) > > -> Sort (cost=1310195.21..1310195.21 rows=25477 width=47) (actual > >time=21692.69..21692.88 rows=186 loops=1) > > -> Nested Loop (cost=0.00..1307871.48 rows=25477 width=47) > >(actual time=2197.65..21691.39 rows=186 loops=1) > > -> Index Scan using featureloc_idx2 on featureloc fl > >(cost=0.00..1227526.82 rows=25477 width=14) (actual > >time=2197.49..21618.89 rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..3.14 rows=1 width=33) (actual time=0.37..0.38 rows=1 > >loops=186) > >Total runtime: 21693.72 msec > >-------------------------------------------------------------------------- > > > >After dropping featureloc_idx2: > > > >Unique (cost=1414516.98..1414899.13 rows=2548 width=47) (actual > >time=1669.17..1669.86 rows=179 loops=1) > > -> Sort (cost=1414516.98..1414516.98 rows=25477 width=47) (actual > >time=1669.17..1669.36 rows=186 loops=1) > > -> Nested Loop (cost=0.00..1412193.25 rows=25477 width=47) > >(actual time=122.69..1668.08 rows=186 loops=1) > > -> Index Scan using featureloc_src_strand_beg_end on > >featureloc fl (cost=0.00..1331848.60 rows=25477 width=14) (actual > >time=122.51..1661.81 rows=186 loops=1) > > -> Index Scan using feature_pkey on feature f > >(cost=0.00..3.14 rows=1 width=33) (actual time=0.02..0.03 rows=1 > >loops=186) > >Total runtime: 1670.20 msec > > > > > >On Fri, 2003-02-14 at 12:00, Tom Lane wrote: > > > >>Scott Cain writes: > >> > >>>[ much stuff ] > >> > >>Could we see EXPLAIN ANALYZE, not just EXPLAIN, output for all these > >>alternatives? Your question boils down to "why is the planner > >>misestimating these queries" ... which is a difficult question to > >>answer when given only the estimates and not the reality. > >> > >>A suggestion though is that you might need to raise the statistics > >>target on the indexed columns, so that ANALYZE will collect > >>finer-grained statistics. (See ALTER TABLE ... SET STATISTICS.) > >>Try booting it up to 100 (from the default 10), re-analyze, and > >>then see if/how the plans change. > >> > >> regards, tom lane
В списке pgsql-performance по дате отправления: