Re: Help with a seq scan on multi-million row table

Поиск
Список
Период
Сортировка
От
Тема Re: Help with a seq scan on multi-million row table
Дата
Msg-id 20060511034711.46266.qmail@web50305.mail.yahoo.com
обсуждение исходный текст
Ответ на Re: Help with a seq scan on multi-million row table  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Help with a seq scan on multi-million row table  (Markus Schaber <schabi@logix-tt.com>)
Список pgsql-sql
Not sure if I'm showing you what you asked for, but here it is:

select * from pg_stats  where tablename='user_url_tag' and attname='user_url_id';schemaname |  tablename   |   attname
| null_frac | avg_width | n_distinct |                            most_common_vals                           |
                                      most_common_freqs                   |
histogram_bounds                                | correlation
 

------------+--------------+-------------+-----------+-----------+------------+------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+-------------public
   | user_url_tag | user_url_id |         0 |         4 |      60825 |
{458321,1485346,16304,68027,125417,153465,182503,201175,202973,218423}|
{0.00133333,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}|
{195,195993,325311,480323,647778,782598,1014527,1201726,1424822,1614712,1853719}|    0.795521
 

You asked if the table has been analyzed recently.
I think so - I run ANALYZE on the whole DB every night, like this:

$ psql -U me -c "ANALYZE;" mydb

For a good measure, I just analyzed the table now: $ psql -U me -c "ANALYZE user_url_tag;" mydb
Then I set the enable_hashjoin back to ON and re-run the EXPLAIN ANALYZE.
I still get the sequential scan, even after analyzing the table :(

I'm not sure which numbers you are referring to when you said the estimate is off, but here are some numbers: The whole
tablehas 6-7 M rows. That query matches about 2500 rows.
 

If there are other things I can play with and help narrow this down, please let me know.

Thanks,
Otis


----- Original Message ----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: ogjunk-pgjedan@yahoo.com
Cc: pgsql-sql@postgresql.org
Sent: Wednesday, May 10, 2006 9:53:49 PM
Subject: Re: [SQL] Help with a seq scan on multi-million row table 

<ogjunk-pgjedan@yahoo.com> writes:
> Aha!  set hashjoin=off did the trick.

>                ->  Index Scan using ix_user_url_tag_user_url_id on user_url_tag userurltag0_  (cost=0.00..157.34
rows=103width=14) (actual time=1.223..1.281 rows=5 loops=1666)
 
>                      Index Cond: (userurltag0_.user_url_id = "outer".id)

This seems to be the problem right here: the estimate of matching rows
is off by a factor of 20, and that inflates the overall cost estimate
for this plan about the same, causing the planner to think the other way
is cheaper.

What does the pg_stats row for user_url_tag.user_url_id contain?
Have you analyzed that table recently?
           regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend





В списке pgsql-sql по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Help with a seq scan on multi-million row table
Следующее
От: "Christian Paul Cosinas"
Дата:
Сообщение: ORDER BY question