Re: Query planner cost estimate less than the sum of its parts?

Поиск
Список
Период
Сортировка
От Scott Carey
Тема Re: Query planner cost estimate less than the sum of its parts?
Дата
Msg-id a1ec7d000811051419m294b967akb6a7caf7aac5ead1@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Query planner cost estimate less than the sum of its parts?  (Gregory Stark <stark@enterprisedb.com>)
Список pgsql-performance
I'll have to think a bit about that given that the query had run for 20 hours of 250MB/sec-ish disk reads and wasn't done.  Luckily, thats not even 35% disk utilization on this system, and the 'right' query with fewer tables does things properly with a hash and takes seconds rather than hours (days?).

If it can short-circuit the search, then its probably extremely underestimating how much data it has to look through before finding a match, which I'd expect out of a partitioned table query since the planner assumptions around those are generally bad to really bad (as in, the aggregate statistics on a list of tables is essentially not used or calculated/estimated wrong).  I suppose the real problem is there, its going to have to look through most of this data to find a match, on every loop, and the planner has no clue.
If the nested loop was the other way around it would not have even pinned the disk and have been all in memory on the matching.  If it had hashed all of the estimated 128K values in the top -- which at 1GB for work_mem it should but does not -- it could have scanned once for matches and thrown out those in the hash that did not have a match.

Anyhow this isn't causing a problem at the moment, and it looks like the usual culprit with poor planner choices on partition tables and not a new one.

On Wed, Nov 5, 2008 at 1:22 PM, Gregory Stark <stark@enterprisedb.com> wrote:

"Scott Carey" <scott@richrelevance.com> writes:

> Certainly, a cost estimate that is ... LESS than one of the sub sections of
> the query is wrong.   This was one hell of a broken query, but it at least
> should have taken an approach that was not a nested loop, and I'm curious if
> that choice was due to a bad estimate here.
>
> Nested Loop IN Join  (cost=0.00..3850831.86 rows=128266 width=8)

Because it's an IN join it doesn't have to run the inner join to completion.
Once it finds a match it can return the outer tuple and continue to the next
outer tuple.

--
 Gregory Stark
 EnterpriseDB          http://www.enterprisedb.com
 Ask me about EnterpriseDB's 24x7 Postgres support!

В списке pgsql-performance по дате отправления:

Предыдущее
От: Gregory Stark
Дата:
Сообщение: Re: Query planner cost estimate less than the sum of its parts?
Следующее
От: "Kevin Grittner"
Дата:
Сообщение: Re: Create and drop temp table in 8.3.4