Re: abnormally long time in performing a two-table join

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: abnormally long time in performing a two-table join
Дата
Msg-id 25080.1029109707@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: abnormally long time in performing a two-table join  (Chris Mungall <cjm@fruitfly.org>)
Ответы Re: abnormally long time in performing a two-table join  (Chris Mungall <cjm@fruitfly.org>)
Список pgsql-admin
Chris Mungall <cjm@fruitfly.org> writes:
> On Sun, 11 Aug 2002, Tom Lane wrote:
>> Odd ... I'm surprised it doesn't choose a hash join.  What do you get if
>> you try it with "set enable_nestloop = off" ?

> Much better!

> omicia29=# set enable_nestloop = off;
> SET VARIABLE
> omicia29=# explain analyze select seqfeature_id from seqfeature NATURAL
> JOIN sfqv where qualifier_value = 'BRCA1' and seqfeature.seqfeature_key_id
> = 15;
> NOTICE:  QUERY PLAN:

> Hash Join  (cost=55921.31..223860.67 rows=778 width=8) (actual
> time=4249.63..4259.47 rows=15 loops=1)
>   ->  Index Scan using sfqv_idx3 on sfqv  (cost=0.00..167411.84 rows=41423
> width=4) (actual time=38.05..44.50 rows=110 loops=1)
>   ->  Hash  (cost=51056.43..51056.43 rows=49453 width=4) (actual
> time=4211.15..4211.15 rows=0 loops=1)
>         ->  Seq Scan on seqfeature  (cost=0.00..51056.43 rows=49453
> width=4) (actual time=0.14..3974.12 rows=100030 loops=1)
> Total runtime: 4259.67 msec

Hmm.  It looks like the reason the planner doesn't like this plan is
that it's vastly overestimating the number of rows it will get from
indexscanning sfqv_idx3 for qualifier_value = 'BRCA1' (viz, 41423
rather than the true 110).  This is a statistical failure.

What do you get from
    select * from pg_stats where tablename = 'sfqv'
?

You might try increasing the statistics target for the qualifier_value
column --- I'm guessing that you need more resolution in the stats to
deal correctly with low-probability data.  Try
    ALTER TABLE sfqv ALTER COLUMN qualifier_value SET STATISTICS 100;
    ANALYZE sfqv;        -- to recompute stats
then see how the EXPLAIN results change.  (The default stats target is
10; 100 might be more than you need, or perhaps not.)

> however I'm not sure what the implications of turning nestloop off
> altogether are - maybe i can hardcode it just for this query

It'd be best not to.  I'd counsel seeing if more stats help, first.

            regards, tom lane

В списке pgsql-admin по дате отправления:

Предыдущее
От: Chris Mungall
Дата:
Сообщение: Re: abnormally long time in performing a two-table join
Следующее
От: Chris Mungall
Дата:
Сообщение: Re: abnormally long time in performing a two-table join