Seq scan on join table despite index and high statistics

От: Henrik Zagerholm
Тема: Seq scan on join table despite index and high statistics
Дата: ,
Msg-id: CCDAD82F-2714-496B-8D4E-63A39E8D2C76@mac.se
(см: обсуждение, исходный текст)
Список: pgsql-performance

Hello list,

I have a problem with a simple count query on a pgsql 8.2.3 server.

SELECT COUNT(pk_file_structure_id) FROM tbl_file_structure INNER JOIN
tbl_file ON fk_file_id = pk_file_id WHERE lower(file_name) like lower
('awstats%');
Using Explain analyze I've noticed that it makes a seq scan on
tbl_file_structure but I have an index on fk_file_id and its
statistics is set to 200. I  ran an analyze on both tbl_file and
tbl_file_structure.
The count retrieved is 75 000 so its way lower than the total 3 834
059 rows.

Should I raise the statistics more? Is there a rule of thumb how much
the statistics should be reagards to the number of rows in the table?
Can I make my database adjust the statistics dynamically? I don't
want to go around to my customers changing statistics every time the
tables starts to fill up.

Anyway here is the explain analyze on the slow query.

EXPLAIN ANALYZE SELECT COUNT(pk_file_structure_id) FROM
tbl_file_structure INNER JOIN tbl_file ON fk_file_id = pk_file_id
WHERE lower(file_name) like lower('awstats%');

"Aggregate  (cost=172512.17..172512.18 rows=1 width=8) (actual
time=30316.316..30316.317 rows=1 loops=1)"
"  ->  Hash Join  (cost=12673.69..171634.39 rows=351110 width=8)
(actual time=1927.730..30191.260 rows=75262 loops=1)"
"        Hash Cond: (tbl_file_structure.fk_file_id =
tbl_file.pk_file_id)"
"        ->  Seq Scan on tbl_file_structure  (cost=0.00..80537.59
rows=3834059 width=16) (actual time=10.056..14419.662 rows=3834059
loops=1)"
"        ->  Hash  (cost=11999.34..11999.34 rows=39868 width=8)
(actual time=1896.859..1896.859 rows=39959 loops=1)"
"              ->  Bitmap Heap Scan on tbl_file
(cost=1157.12..11999.34 rows=39868 width=8) (actual
time=457.867..1779.792 rows=39959 loops=1)"
"                    Filter: (lower((file_name)::text) ~~ 'awstats
%'::text)"
"                    ->  Bitmap Index Scan on tbl_file_idx
(cost=0.00..1147.15 rows=35881 width=0) (actual time=450.469..450.469
rows=39959 loops=1)"
"                          Index Cond: ((lower((file_name)::text)
~>=~ 'awstats'::character varying) AND (lower((file_name)::text) ~<~
'awstatt'::character varying))"
"Total runtime: 30316.739 ms"

Could this have something to do with low settings in postgresql.conf?
I haven't tweaked any settings in postgresql.conf yet.

Please help,
Regards, henke


В списке pgsql-performance по дате сообщения:

От: Andreas Tille
Дата:
Сообщение: Using EXECUTE in a function
От: Camilo Porto
Дата:
Сообщение: Semantics of PostgreSQL Server Log Stats