Обсуждение: Query planner cost estimate less than the sum of its parts?

Поиск
Список
Период
Сортировка

Query planner cost estimate less than the sum of its parts?

От
"Scott Carey"
Дата:
So, we had a query run accidentally without going through the right checks to ensure that it had the right limits in a where clause for our table partitioning, resulting in an attempt to scan TB's of data.

Obviously, we fixed the query, but the curious result is this explain plan (shortened, in full form its ~3500 lines long).  A true cost estimate of ~ 4 million isn't a big deal on this server.  It is plainly wrong...  wouldn't a nested loop of this sort estimate at least 128266*4100644 for the cost?  Or something on that order of magnitude? 
Certainly, a cost estimate that is ... LESS than one of the sub sections of the query is wrong.   This was one hell of a broken query, but it at least should have taken an approach that was not a nested loop, and I'm curious if that choice was due to a bad estimate here. 

Nested Loop IN Join  (cost=0.00..3850831.86 rows=128266 width=8)
   Join Filter: ((log.p_p_logs.s_id)::text = (log.s_r_logs.s_id)::text)
   ->  Append  (cost=0.00..6078.99 rows=128266 width=46)
         ->  Seq Scan on p_p_logs  (cost=0.00..1.01 rows=1 width=14)
               Filter: ((date >= '2008-10-27'::date) AND (sector = 12))
         ->  Seq Scan on p_p_logs_012_2008_10_27 p_p_logs  (cost=0.00..718.22 rows=15148 width=46)
               Filter: ((date >= '2008-10-27'::date) AND (sector = 12))
      [ Snipped ~ 10 more tables]

   ->  Append  (cost=0.00..4100644.78 rows=29850181 width=118)
         ->  Seq Scan on s_r_logs  (cost=0.00..1.01 rows=1 width=14)
               Filter: log.s_r_logs.source
         ->  Seq Scan on s_r_logs_002_2008_10_01 s_r_logs  (cost=0.00..91.00 rows=1050 width=33)
               Filter: p_log.s_r_logs.source
         ->  Seq Scan on s_r_logs_002_2008_10_02 s_r_logs  (cost=0.00..65.00 rows=750 width=33)
      [ Snipped ~1500 tables of various sizes ]

Re: Query planner cost estimate less than the sum of its parts?

От
Gregory Stark
Дата:
"Scott Carey" <scott@richrelevance.com> writes:

> Certainly, a cost estimate that is ... LESS than one of the sub sections of
> the query is wrong.   This was one hell of a broken query, but it at least
> should have taken an approach that was not a nested loop, and I'm curious if
> that choice was due to a bad estimate here.
>
> Nested Loop IN Join  (cost=0.00..3850831.86 rows=128266 width=8)

Because it's an IN join it doesn't have to run the inner join to completion.
Once it finds a match it can return the outer tuple and continue to the next
outer tuple.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Query planner cost estimate less than the sum of its parts?

От
"Scott Carey"
Дата:
I'll have to think a bit about that given that the query had run for 20 hours of 250MB/sec-ish disk reads and wasn't done.  Luckily, thats not even 35% disk utilization on this system, and the 'right' query with fewer tables does things properly with a hash and takes seconds rather than hours (days?).

If it can short-circuit the search, then its probably extremely underestimating how much data it has to look through before finding a match, which I'd expect out of a partitioned table query since the planner assumptions around those are generally bad to really bad (as in, the aggregate statistics on a list of tables is essentially not used or calculated/estimated wrong).  I suppose the real problem is there, its going to have to look through most of this data to find a match, on every loop, and the planner has no clue.
If the nested loop was the other way around it would not have even pinned the disk and have been all in memory on the matching.  If it had hashed all of the estimated 128K values in the top -- which at 1GB for work_mem it should but does not -- it could have scanned once for matches and thrown out those in the hash that did not have a match.

Anyhow this isn't causing a problem at the moment, and it looks like the usual culprit with poor planner choices on partition tables and not a new one.

On Wed, Nov 5, 2008 at 1:22 PM, Gregory Stark <stark@enterprisedb.com> wrote:

"Scott Carey" <scott@richrelevance.com> writes:

> Certainly, a cost estimate that is ... LESS than one of the sub sections of
> the query is wrong.   This was one hell of a broken query, but it at least
> should have taken an approach that was not a nested loop, and I'm curious if
> that choice was due to a bad estimate here.
>
> Nested Loop IN Join  (cost=0.00..3850831.86 rows=128266 width=8)

Because it's an IN join it doesn't have to run the inner join to completion.
Once it finds a match it can return the outer tuple and continue to the next
outer tuple.

--
 Gregory Stark
 EnterpriseDB          http://www.enterprisedb.com
 Ask me about EnterpriseDB's 24x7 Postgres support!