Re: query produces 1 GB temp file

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: query produces 1 GB temp file
Дата
Msg-id 87y8e2kb8j.fsf@stark.xeocode.com
обсуждение исходный текст
Ответ на query produces 1 GB temp file  (Dirk Lutzebaeck <lutzeb@aeccom.com>)
Ответы Re: query produces 1 GB temp file  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: query produces 1 GB temp file  (Dirk.Lutzebaeck@t-online.de (Dirk Lutzebaeck))
Список pgsql-performance
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:

> Below is the query and results for EXPLAIN and EXPLAIN ANALYZE. All
> tables have been analyzed before.

Really? A lot of the estimates are very far off. If you really just analyzed
these tables immediately prior to the query then perhaps you should try
raising the statistics target on spec and co. Or is the problem that there's a
correlation between those two columns?

>                ->  Nested Loop  (cost=0.00..8346.73 rows=3 width=1361) (actual time=34.104..18016.005 rows=703677
loops=1)
>                      ->  Nested Loop  (cost=0.00..5757.17 rows=17 width=51) (actual time=0.467..3216.342 rows=48563
loops=1)
>                            ->  Nested Loop  (cost=0.00..5606.39 rows=30 width=42) (actual time=0.381..1677.014
rows=48563loops=1) 
>                                  ->  Index Scan using es_sc_index on es  (cost=0.00..847.71 rows=301 width=8) (actual
time=0.184..46.519rows=5863 loops=1) 
>                                        Index Cond: ((spec = 122293729) AND (co = 117305223::oid))

The root of your problem,. The optimizer is off by a factor of 20. It thinks
these two columns are much more selective than they are.

>                                  ->  Index Scan using bi_env_index on bi  (cost=0.00..15.80 rows=1 width=42) (actual
time=0.052..0.218rows=8 loops=5863) 
>                                        Index Cond: ("outer".en = bi.en)
>                                        Filter: ((rc = 130170467::oid) AND (co = 117305223::oid) AND (hide = false))

It also thinks these three columns are much more selective than they are.

How accurate are its estimates if you just do these?

explain analyze select * from es where spec = 122293729
explain analyze select * from es where co = 117305223::oid
explain analyze select * from bi where rc = 130170467::oid
explain analyze select * from bi where co = 117305223
explain analyze select * from bi where hide = false

If they're individually accurate then you've run into the familiar problem of
needing cross-column statistics. If they're individually inaccurate then you
should try raising the targets on those columns with:

ALTER TABLE [ ONLY ] name [ * ]
    ALTER [ COLUMN ] column SET STATISTICS integer

and reanalyzing.


Dirk Lutzebaeck <lutzeb@aeccom.com> writes:

> Can some please explain why the temp file is so huge? I understand
> there are a lot of rows.

Well that I can't explain. 22k rows of width 1361 doesn't sound so big to me
either. The temporary table does need to store three copies of the records at
a given time, but still it sounds like an awful lot.


--
greg

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: GiST indexes and concurrency (tsearch2)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: query produces 1 GB temp file