Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

Поиск
Список
Период
Сортировка
От El-Lotso
Тема Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running
Дата
Msg-id 1189749101.22514.39.camel@neuromancer.home.net
обсуждение исходный текст
Ответ на Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
On Wed, 2007-09-12 at 10:41 -0400, Tom Lane wrote:
> El-Lotso <el.lotso@gmail.com> writes:
> > I'm really at my wits end here.
>
> Try to merge the multiple join keys into one, somehow.  I'm not sure why
> the planner is overestimating the selectivity of the combined join
> conditions, but that's basically where your problem is coming from.

I've tried merging them together.. what previously was

INNER JOIN  TS
ON TS.ID = TRH.ID AND
TS.TTYPE = TRH.TTYPE AND
TS.START_TIMESTAMP = TRH.START_TIMESTAMP

has become
inner join TS
on ts.id_ttype_startstamp = trh.id_ttype_startstamp

where id_ttype_startstamp = (id || '-'||ttype || '-' || start_timestamp)

It's working somewhat better but everything is not as rosy as it should
as the planner is still over/under estimating the # of rows.

FROM org :
Nested Loop Left Join  (cost=10612.48..24857.20 rows=1 width=61) (actual
time=1177.626..462856.007 rows=750 loops=1)

TO merge joined conditions :
Hash Join  (cost=41823.94..45889.49 rows=6101 width=61) (actual
time=3019.609..3037.692 rows=750 loops=1)
  Hash Cond: (trd.trd_join_key = ts.ts_join_key)

Merged Join using the Main table : 3 - 5 million rows
Hash Left Join  (cost=80846.38..121112.36 rows=25 width=244) (actual
time=5088.437..5457.269 rows=750 loops=1)

Note that it still doesn't really help that much, the estimated rows is
still way off the actual number of rows. On one of the querys there the
hid field has a subset of 8 values, it's even worst. And it seems like
the merge condition doesn't help at all.


I'm still trying to merge more join conditions to see if it helps.




> A truly brute-force solution would be "set enable_nestloop = off"
> but this is likely to screw performance for other queries.

I've also tried this... It's not helping much actually.
As mentioned previously, this is a one to many relationship and because
of that, somehow PG just doesn't take it into account.

I'm still not having much luck here. (playing with a subset of the main
table's data _does_ show some promise, but when querying on the main
table w/ 3 million data, everything grinds to a halt)





В списке pgsql-performance по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: When/if to Reindex
Следующее
От: Ow Mun Heng
Дата:
Сообщение: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running