Обсуждение: Slow query but can't see whats wrong

Поиск
Список
Период
Сортировка

Slow query but can't see whats wrong

От
Henrik Zagerholm
Дата:
Hello list,

I'm using pgsql 8.2.4 and I have this query which is sooo slow but I
can seem to figure out why.
It is using the appropriate indexes but there is something wrong with
the nested loops.

I'm running auto vacuum so the statistics should be up to date. I've
increase the statistics on most foreign keys.

Below is the query and the explain analyze.
If you need more info please let me know.

Thanks,
henke


EXPLAIN ANALYZE (SELECT COUNT(*) FROM tbl_file_structure
JOIN tbl_file ON pk_file_id = fk_file_id
JOIN tbl_structure ON pk_structure_id = fk_structure_id
JOIN tbl_archive ON pk_archive_id =     fk_archive_id
JOIN tbl_share ON pk_share_id = fk_share_id
JOIN tbl_computer ON pk_computer_id = fk_computer_id
JOIN tbl_filetype ON pk_filetype_id = fk_filetype_id
JOIN tbl_acl ON fk_file_structure_id = pk_file_structure_id
WHERE ( LOWER(file_name) LIKE LOWER('awstats%') AND archive_complete
= true) GROUP BY file_name, file_ctime, structure_path,
    pk_computer_id, filetype_icon, computer_name, share_name, share_path)

"HashAggregate  (cost=180.28..180.54 rows=21 width=125) (actual
time=144071.947..144180.311 rows=7794 loops=1)"
"  ->  Nested Loop  (cost=11.29..179.81 rows=21 width=125) (actual
time=1.519..139469.148 rows=224606 loops=1)"
"        ->  Hash Join  (cost=11.29..144.82 rows=4 width=133) (actual
time=1.356..81072.074 rows=40836 loops=1)"
"              Hash Cond: (tbl_share.fk_computer_id =
tbl_computer.pk_computer_id)"
"              ->  Hash Join  (cost=9.15..142.63 rows=4 width=122)
(actual time=1.117..80482.340 rows=40836 loops=1)"
"                    Hash Cond: (tbl_archive.fk_share_id =
tbl_share.pk_share_id)"
"                    ->  Nested Loop  (cost=8.02..141.43 rows=6
width=108) (actual time=0.979..79847.699 rows=40836 loops=1)"
"                          ->  Nested Loop  (cost=4.73..97.56 rows=6
width=108) (actual time=0.812..69005.311 rows=43946 loops=1)"
"                                ->  Nested Loop  (cost=4.73..58.09
rows=6 width=50) (actual time=0.662..50552.977 rows=43946 loops=1)"
"                                      ->  Nested Loop
(cost=0.01..9.84 rows=1 width=42) (actual time=0.409..7371.172
rows=25039 loops=1)"
"                                            Join Filter:
(tbl_filetype.pk_filetype_id = tbl_file.fk_filetype_id)"
"                                            ->  Index Scan using
tbl_file_idx on tbl_file  (cost=0.01..8.34 rows=1 width=39) (actual
time=0.283..3339.003 rows=25039 loops=1)"
"                                                  Index Cond: ((lower
((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower
((file_name)::text) ~<~ 'awstatt'::character varying))"
"                                                  Filter: (lower
((file_name)::text) ~~ 'awstats%'::text)"
"                                            ->  Seq Scan on
tbl_filetype  (cost=0.00..1.22 rows=22 width=19) (actual
time=0.013..0.072 rows=22 loops=25039)"
"                                      ->  Bitmap Heap Scan on
tbl_file_structure  (cost=4.72..48.12 rows=11 width=24) (actual
time=0.964..1.685 rows=2 loops=25039)"
"                                            Recheck Cond:
(tbl_file.pk_file_id = tbl_file_structure.fk_file_id)"
"                                            ->  Bitmap Index Scan on
tbl_file_structure_idx  (cost=0.00..4.72 rows=11 width=0) (actual
time=0.322..0.322 rows=2 loops=25039)"
"                                                  Index Cond:
(tbl_file.pk_file_id = tbl_file_structure.fk_file_id)"
"                                ->  Index Scan using
tbl_structure_pkey on tbl_structure  (cost=0.00..6.56 rows=1
width=74) (actual time=0.391..0.401 rows=1 loops=43946)"
"                                      Index Cond:
(tbl_structure.pk_structure_id = tbl_file_structure.fk_structure_id)"
"                          ->  Bitmap Heap Scan on tbl_archive
(cost=3.29..7.30 rows=1 width=16) (actual time=0.217..0.221 rows=1
loops=43946)"
"                                Recheck Cond:
(tbl_archive.pk_archive_id = tbl_structure.fk_archive_id)"
"                                Filter: archive_complete"
"                                ->  Bitmap Index Scan on
tbl_archive_pkey  (cost=0.00..3.29 rows=1 width=0) (actual
time=0.133..0.133 rows=64 loops=43946)"
"                                      Index Cond:
(tbl_archive.pk_archive_id = tbl_structure.fk_archive_id)"
"                    ->  Hash  (cost=1.06..1.06 rows=6 width=30)
(actual time=0.094..0.094 rows=6 loops=1)"
"                          ->  Seq Scan on tbl_share
(cost=0.00..1.06 rows=6 width=30) (actual time=0.053..0.067 rows=6
loops=1)"
"              ->  Hash  (cost=2.06..2.06 rows=6 width=19) (actual
time=0.199..0.199 rows=6 loops=1)"
"                    ->  Seq Scan on tbl_computer  (cost=0.00..2.06
rows=6 width=19) (actual time=0.109..0.167 rows=6 loops=1)"
"        ->  Index Scan using tbl_acl_idx on tbl_acl
(cost=0.00..8.54 rows=17 width=8) (actual time=1.205..1.380 rows=6
loops=40836)"
"              Index Cond: (tbl_acl.fk_file_structure_id =
tbl_file_structure.pk_file_structure_id)"
"Total runtime: 144215.020 ms"


Re: Slow query but can't see whats wrong

От
Michael Fuhr
Дата:
On Tue, Jul 24, 2007 at 10:40:16AM +0200, Henrik Zagerholm wrote:
> I'm using pgsql 8.2.4 and I have this query which is sooo slow but I
> can seem to figure out why.
> It is using the appropriate indexes but there is something wrong with
> the nested loops.
>
> I'm running auto vacuum so the statistics should be up to date. I've
> increase the statistics on most foreign keys.

Have you examined the last_vacuum, last_autovacuum, last_analyze,
and last_autoanalyze columns in pg_stat_user_tables to find out
when the tables were last vacuumed and analyzed?

The estimate on this index scan is a problem:

> ->  Index Scan using tbl_file_idx on tbl_file  (cost=0.01..8.34 rows=1 width=39) (actual time=0.283..3339.003
rows=25039loops=1) 
>       Index Cond: ((lower ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower ((file_name)::text) ~<~
'awstatt'::charactervarying)) 
>       Filter: (lower ((file_name)::text) ~~ 'awstats%'::text)

Is tbl_file_idx an index on the expression lower(file_name)?  If
so then I don't think increasing the statistics on tbl_file.file_name
will help, but increasing the statistics on tbl_file_idx.pg_expression_1
might.  You'll have to experiment to find a good value.

ALTER TABLE tbl_file_idx ALTER pg_expression_1 SET STATISTICS 100;
ANALYZE tbl_file;

Unfortunately the statistics settings on index expressions don't
survive a pg_dump.  Fixing this has been discussed a few times but
I don't think anybody has worked on it.  The developers' TODO list
has the following item:

* Allow accurate statistics to be collected on indexes with more
  than one column or expression indexes, perhaps using per-index
  statistics

--
Michael Fuhr

Re: Slow query but can't see whats wrong

От
Tom Lane
Дата:
Henrik Zagerholm <henke@mac.se> writes:
> I'm using pgsql 8.2.4 and I have this query which is sooo slow but I
> can seem to figure out why.

Seems the core of the problem is the misestimation here:

> "                                            ->  Index Scan using
> tbl_file_idx on tbl_file  (cost=0.01..8.34 rows=1 width=39) (actual
> time=0.283..3339.003 rows=25039 loops=1)"
> "                                                  Index Cond: ((lower
> ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower
> ((file_name)::text) ~<~ 'awstatt'::character varying))"
> "                                                  Filter: (lower
> ((file_name)::text) ~~ 'awstats%'::text)"

Looks like either your stats are not up to date on this table, or you
need to increase the stats target for file_name.  Anytime a scan row
estimate is off by a factor of 25000, you're going to get a bad plan :-(

            regards, tom lane

Re: Slow query but can't see whats wrong

От
Henrik Zagerholm
Дата:
24 jul 2007 kl. 16:10 skrev Tom Lane:

> Henrik Zagerholm <henke@mac.se> writes:
>> I'm using pgsql 8.2.4 and I have this query which is sooo slow but I
>> can seem to figure out why.
>
> Seems the core of the problem is the misestimation here:
>
>> "                                            ->  Index Scan using
>> tbl_file_idx on tbl_file  (cost=0.01..8.34 rows=1 width=39) (actual
>> time=0.283..3339.003 rows=25039 loops=1)"
>> "                                                  Index Cond:
>> ((lower
>> ((file_name)::text) ~>=~ 'awstats'::character varying) AND (lower
>> ((file_name)::text) ~<~ 'awstatt'::character varying))"
>> "                                                  Filter: (lower
>> ((file_name)::text) ~~ 'awstats%'::text)"
>
> Looks like either your stats are not up to date on this table, or you
> need to increase the stats target for file_name.  Anytime a scan row
> estimate is off by a factor of 25000, you're going to get a bad
> plan :-(
>
Thanks for the tips on getting this query faster.
What I did was.

Changed statistics on both file_name and tbl_file_idx.pg_expression_1
which made the stats look much better. A simple query test on
tbl_file showed that the the new stats worked.

EXPLAIN ANALYZE SELECT file_name FROM tbl_file WHERE lower(file_name)
LIKE lower('awstats%');

"Bitmap Heap Scan on tbl_file  (cost=1157.12..11999.34 rows=39868
width=19) (actual time=49.751..258.550 rows=39959 loops=1)"
"  Filter: (lower((file_name)::text) ~~ 'awstats%'::text)"
"  ->  Bitmap Index Scan on tbl_file_idx  (cost=0.00..1147.15
rows=35881 width=0) (actual time=47.791..47.791 rows=39959 loops=1)"
"        Index Cond: ((lower((file_name)::text) ~>=~
'awstats'::character varying) AND (lower((file_name)::text) ~<~
'awstatt'::character varying))"
"Total runtime: 308.610 ms"


I then added another table which as a many to one relationship with
tbl_file called tbl_file_structure. This is the one I want to do the
count on.
Surprisingly this query is about 100 times slower due to the SEQ scan
on tbl_file_structure. The foreign key fk_file_id is indexed and it
has statistics set to 200.

How come it makes an seq scan on this table? The count retrieved is
75 000 so its way lower than the total 3 834 059 rows.
Should I raise the statistics more? Is there a rule of thumb how much
the statistics should be reagards to the number of rows in the table?
Can I make my database adjust the statistics dynamically? I don't
want to go around to my customers changing statistics every time the
tables starts to fill up.

Anyway here is the slow query.

EXPLAIN ANALYZE SELECT COUNT(pk_file_structure_id) FROM
tbl_file_structure INNER JOIN tbl_file ON fk_file_id = pk_file_id
WHERE lower(file_name) like lower('awstats%');

"Aggregate  (cost=172512.17..172512.18 rows=1 width=8) (actual
time=30316.316..30316.317 rows=1 loops=1)"
"  ->  Hash Join  (cost=12673.69..171634.39 rows=351110 width=8)
(actual time=1927.730..30191.260 rows=75262 loops=1)"
"        Hash Cond: (tbl_file_structure.fk_file_id =
tbl_file.pk_file_id)"
"        ->  Seq Scan on tbl_file_structure  (cost=0.00..80537.59
rows=3834059 width=16) (actual time=10.056..14419.662 rows=3834059
loops=1)"
"        ->  Hash  (cost=11999.34..11999.34 rows=39868 width=8)
(actual time=1896.859..1896.859 rows=39959 loops=1)"
"              ->  Bitmap Heap Scan on tbl_file
(cost=1157.12..11999.34 rows=39868 width=8) (actual
time=457.867..1779.792 rows=39959 loops=1)"
"                    Filter: (lower((file_name)::text) ~~ 'awstats
%'::text)"
"                    ->  Bitmap Index Scan on tbl_file_idx
(cost=0.00..1147.15 rows=35881 width=0) (actual time=450.469..450.469
rows=39959 loops=1)"
"                          Index Cond: ((lower((file_name)::text)
~>=~ 'awstats'::character varying) AND (lower((file_name)::text) ~<~
'awstatt'::character varying))"
"Total runtime: 30316.739 ms"

Thanks for your help,
Henrik



>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq