Re: Suggestions wanted for 7.2.4 query

Поиск
Список
Период
Сортировка
От Josh Berkus
Тема Re: Suggestions wanted for 7.2.4 query
Дата
Msg-id 200305040907.03306.josh@agliodbs.com
обсуждение исходный текст
Ответ на Re: Suggestions wanted for 7.2.4 query  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Suggestions wanted for 7.2.4 query  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Tom,

> > SELECT sv_events.*, FALSE AS fuzzy_team FROM sv_events WHERE  EXISTS (
> > SELECT event_id FROM event_days
> >  WHERE event_days.event_id = sv_events.event_id AND  (event_day BETWEEN
> > ('2003-04-08'::TIMESTAMP WITHOUT TIME ZONE)
> >   AND ('2003-06-17 23:59'::TIMESTAMP WITHOUT TIME ZONE) ) );
>
> Is event_days.event_id unique?  If so, try

Regrettably, no.   Event_days is an iterative list of all of the days covered
by the event.   What's unique is event_days_pk, which is event_id, event_day.
If I did a direct join to event_days, multi-day events would appear on the
search results more than once .... which we *don't* want.

> This at least gives you some glimmer of a chance that the restriction on
> event_day can be used to avoid computing the entire join represented by
> sv_events.  With the exists() form, there's no chance...

Hmmm.  There are other ways I can get at the date limit for sv_events; I'll
try that.  Unfortunately, those ways require a seq scan on events, so I'm not
sure we have a net gain here (that is, I can't imagine that  a two-column
date calculation between two parameters could be indexed)

 However, by my reading, 75% of the cost of the query is the unindexed join
between "events" and "cases".    Are you saying that the planner being vague
about what will be returned from the EXISTS clause is what's triggering the
seq scan on "cases"?

--
Josh Berkus
Aglio Database Solutions
San Francisco


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Suggestions wanted for 7.2.4 query
Следующее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: More tablescanning fun