Обсуждение: Slow query but can't see whats wrong
Hello list, I'm using pgsql 8.2.4 and I have this query which is sooo slow but I can seem to figure out why. It is using the appropriate indexes but there is something wrong with the nested loops. I'm running auto vacuum so the statistics should be up to date. I've increase the statistics on most foreign keys. Below is the query and the explain analyze. If you need more info please let me know. Thanks, henke EXPLAIN ANALYZE (SELECT COUNT(*) FROM tbl_file_structure JOIN tbl_file ON pk_file_id = fk_file_id JOIN tbl_structure ON pk_structure_id = fk_structure_id JOIN tbl_archive ON pk_archive_id = fk_archive_id JOIN tbl_share ON pk_share_id = fk_share_id JOIN tbl_computer ON pk_computer_id = fk_computer_id JOIN tbl_filetype ON pk_filetype_id = fk_filetype_id JOIN tbl_acl ON fk_file_structure_id = pk_file_structure_id WHERE ( LOWER(file_name) LIKE LOWER('awstats%') AND archive_complete = true) GROUP BY file_name, file_ctime, structure_path, pk_computer_id, filetype_icon, computer_name, share_name, share_path) "HashAggregate (cost=180.28..180.54 rows=21 width=125) (actual time=144071.947..144180.311 rows=7794 loops=1)" " -> Nested Loop (cost=11.29..179.81 rows=21 width=125) (actual time=1.519..139469.148 rows=224606 loops=1)" " -> Hash Join (cost=11.29..144.82 rows=4 width=133) (actual time=1.356..81072.074 rows=40836 loops=1)" " Hash Cond: (tbl_share.fk_computer_id = tbl_computer.pk_computer_id)" " -> Hash Join (cost=9.15..142.63 rows=4 width=122) (actual time=1.117..80482.340 rows=40836 loops=1)" " Hash Cond: (tbl_archive.fk_share_id = tbl_share.pk_share_id)" " -> Nested Loop (cost=8.02..141.43 rows=6 width=108) (actual time=0.979..79847.699 rows=40836 loops=1)" " -> Nested Loop (cost=4.73..97.56 rows=6 width=108) (actual time=0.812..69005.311 rows=43946 loops=1)" " -> Nested Loop (cost=4.73..58.09 rows=6 width=50) (actual time=0.662..50552.977 rows=43946 loops=1)" " -> Nested Loop (cost=0.01..9.84 rows=1 width=42) (actual time=0.409..7371.172 rows=25039 loops=1)" " Join Filter: (tbl_filetype.pk_filetype_id = tbl_file.fk_filetype_id)" " -> Index Scan using tbl_file_idx on tbl_file (cost=0.01..8.34 rows=1 width=39) (actual time=0.283..3339.003 rows=25039 loops=1)" " Index Cond: ((lower ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower ((file_name)::text) ~<~ 'awstatt'::character varying))" " Filter: (lower ((file_name)::text) ~~ 'awstats%'::text)" " -> Seq Scan on tbl_filetype (cost=0.00..1.22 rows=22 width=19) (actual time=0.013..0.072 rows=22 loops=25039)" " -> Bitmap Heap Scan on tbl_file_structure (cost=4.72..48.12 rows=11 width=24) (actual time=0.964..1.685 rows=2 loops=25039)" " Recheck Cond: (tbl_file.pk_file_id = tbl_file_structure.fk_file_id)" " -> Bitmap Index Scan on tbl_file_structure_idx (cost=0.00..4.72 rows=11 width=0) (actual time=0.322..0.322 rows=2 loops=25039)" " Index Cond: (tbl_file.pk_file_id = tbl_file_structure.fk_file_id)" " -> Index Scan using tbl_structure_pkey on tbl_structure (cost=0.00..6.56 rows=1 width=74) (actual time=0.391..0.401 rows=1 loops=43946)" " Index Cond: (tbl_structure.pk_structure_id = tbl_file_structure.fk_structure_id)" " -> Bitmap Heap Scan on tbl_archive (cost=3.29..7.30 rows=1 width=16) (actual time=0.217..0.221 rows=1 loops=43946)" " Recheck Cond: (tbl_archive.pk_archive_id = tbl_structure.fk_archive_id)" " Filter: archive_complete" " -> Bitmap Index Scan on tbl_archive_pkey (cost=0.00..3.29 rows=1 width=0) (actual time=0.133..0.133 rows=64 loops=43946)" " Index Cond: (tbl_archive.pk_archive_id = tbl_structure.fk_archive_id)" " -> Hash (cost=1.06..1.06 rows=6 width=30) (actual time=0.094..0.094 rows=6 loops=1)" " -> Seq Scan on tbl_share (cost=0.00..1.06 rows=6 width=30) (actual time=0.053..0.067 rows=6 loops=1)" " -> Hash (cost=2.06..2.06 rows=6 width=19) (actual time=0.199..0.199 rows=6 loops=1)" " -> Seq Scan on tbl_computer (cost=0.00..2.06 rows=6 width=19) (actual time=0.109..0.167 rows=6 loops=1)" " -> Index Scan using tbl_acl_idx on tbl_acl (cost=0.00..8.54 rows=17 width=8) (actual time=1.205..1.380 rows=6 loops=40836)" " Index Cond: (tbl_acl.fk_file_structure_id = tbl_file_structure.pk_file_structure_id)" "Total runtime: 144215.020 ms"
On Tue, Jul 24, 2007 at 10:40:16AM +0200, Henrik Zagerholm wrote: > I'm using pgsql 8.2.4 and I have this query which is sooo slow but I > can seem to figure out why. > It is using the appropriate indexes but there is something wrong with > the nested loops. > > I'm running auto vacuum so the statistics should be up to date. I've > increase the statistics on most foreign keys. Have you examined the last_vacuum, last_autovacuum, last_analyze, and last_autoanalyze columns in pg_stat_user_tables to find out when the tables were last vacuumed and analyzed? The estimate on this index scan is a problem: > -> Index Scan using tbl_file_idx on tbl_file (cost=0.01..8.34 rows=1 width=39) (actual time=0.283..3339.003 rows=25039loops=1) > Index Cond: ((lower ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower ((file_name)::text) ~<~ 'awstatt'::charactervarying)) > Filter: (lower ((file_name)::text) ~~ 'awstats%'::text) Is tbl_file_idx an index on the expression lower(file_name)? If so then I don't think increasing the statistics on tbl_file.file_name will help, but increasing the statistics on tbl_file_idx.pg_expression_1 might. You'll have to experiment to find a good value. ALTER TABLE tbl_file_idx ALTER pg_expression_1 SET STATISTICS 100; ANALYZE tbl_file; Unfortunately the statistics settings on index expressions don't survive a pg_dump. Fixing this has been discussed a few times but I don't think anybody has worked on it. The developers' TODO list has the following item: * Allow accurate statistics to be collected on indexes with more than one column or expression indexes, perhaps using per-index statistics -- Michael Fuhr
Henrik Zagerholm <henke@mac.se> writes: > I'm using pgsql 8.2.4 and I have this query which is sooo slow but I > can seem to figure out why. Seems the core of the problem is the misestimation here: > " -> Index Scan using > tbl_file_idx on tbl_file (cost=0.01..8.34 rows=1 width=39) (actual > time=0.283..3339.003 rows=25039 loops=1)" > " Index Cond: ((lower > ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower > ((file_name)::text) ~<~ 'awstatt'::character varying))" > " Filter: (lower > ((file_name)::text) ~~ 'awstats%'::text)" Looks like either your stats are not up to date on this table, or you need to increase the stats target for file_name. Anytime a scan row estimate is off by a factor of 25000, you're going to get a bad plan :-( regards, tom lane
24 jul 2007 kl. 16:10 skrev Tom Lane: > Henrik Zagerholm <henke@mac.se> writes: >> I'm using pgsql 8.2.4 and I have this query which is sooo slow but I >> can seem to figure out why. > > Seems the core of the problem is the misestimation here: > >> " -> Index Scan using >> tbl_file_idx on tbl_file (cost=0.01..8.34 rows=1 width=39) (actual >> time=0.283..3339.003 rows=25039 loops=1)" >> " Index Cond: >> ((lower >> ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower >> ((file_name)::text) ~<~ 'awstatt'::character varying))" >> " Filter: (lower >> ((file_name)::text) ~~ 'awstats%'::text)" > > Looks like either your stats are not up to date on this table, or you > need to increase the stats target for file_name. Anytime a scan row > estimate is off by a factor of 25000, you're going to get a bad > plan :-( > Thanks for the tips on getting this query faster. What I did was. Changed statistics on both file_name and tbl_file_idx.pg_expression_1 which made the stats look much better. A simple query test on tbl_file showed that the the new stats worked. EXPLAIN ANALYZE SELECT file_name FROM tbl_file WHERE lower(file_name) LIKE lower('awstats%'); "Bitmap Heap Scan on tbl_file (cost=1157.12..11999.34 rows=39868 width=19) (actual time=49.751..258.550 rows=39959 loops=1)" " Filter: (lower((file_name)::text) ~~ 'awstats%'::text)" " -> Bitmap Index Scan on tbl_file_idx (cost=0.00..1147.15 rows=35881 width=0) (actual time=47.791..47.791 rows=39959 loops=1)" " Index Cond: ((lower((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower((file_name)::text) ~<~ 'awstatt'::character varying))" "Total runtime: 308.610 ms" I then added another table which as a many to one relationship with tbl_file called tbl_file_structure. This is the one I want to do the count on. Surprisingly this query is about 100 times slower due to the SEQ scan on tbl_file_structure. The foreign key fk_file_id is indexed and it has statistics set to 200. How come it makes an seq scan on this table? The count retrieved is 75 000 so its way lower than the total 3 834 059 rows. Should I raise the statistics more? Is there a rule of thumb how much the statistics should be reagards to the number of rows in the table? Can I make my database adjust the statistics dynamically? I don't want to go around to my customers changing statistics every time the tables starts to fill up. Anyway here is the slow query. EXPLAIN ANALYZE SELECT COUNT(pk_file_structure_id) FROM tbl_file_structure INNER JOIN tbl_file ON fk_file_id = pk_file_id WHERE lower(file_name) like lower('awstats%'); "Aggregate (cost=172512.17..172512.18 rows=1 width=8) (actual time=30316.316..30316.317 rows=1 loops=1)" " -> Hash Join (cost=12673.69..171634.39 rows=351110 width=8) (actual time=1927.730..30191.260 rows=75262 loops=1)" " Hash Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id)" " -> Seq Scan on tbl_file_structure (cost=0.00..80537.59 rows=3834059 width=16) (actual time=10.056..14419.662 rows=3834059 loops=1)" " -> Hash (cost=11999.34..11999.34 rows=39868 width=8) (actual time=1896.859..1896.859 rows=39959 loops=1)" " -> Bitmap Heap Scan on tbl_file (cost=1157.12..11999.34 rows=39868 width=8) (actual time=457.867..1779.792 rows=39959 loops=1)" " Filter: (lower((file_name)::text) ~~ 'awstats %'::text)" " -> Bitmap Index Scan on tbl_file_idx (cost=0.00..1147.15 rows=35881 width=0) (actual time=450.469..450.469 rows=39959 loops=1)" " Index Cond: ((lower((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower((file_name)::text) ~<~ 'awstatt'::character varying))" "Total runtime: 30316.739 ms" Thanks for your help, Henrik > regards, tom lane > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq