Обсуждение: Weird explain output

Поиск
Список
Период
Сортировка

Weird explain output

От
wd
Дата:
I have a long SQL with many join, the explain analyse output like

   ->  Hash Left Join  (cost=197749705.15..5320438982425.18
rows=5245599776015 width=7542) (actual time=674.634..2645.293 rows=4568
loops=1)
         Hash Cond: (p.id = ps.partner_id)
         ->  Merge Join  (cost=197747693.55..78881918575.29
rows=5245599776015 width=7486) (actual time=636.814..653.137 rows=4568
loops=1)
               Merge Cond: (a3.team_id = t.id)
               ->  Sort  (cost=3509574.57..3554578.91 rows=18001735
width=112) (actual time=518.318..519.221 rows=4569 loops=1)
                     Sort Key: a3.team_id
                     Sort Method: quicksort  Memory: 549kB
                      ..................... sniped
               ->  Materialize  (cost=194238118.98..194529512.87
rows=58278779 width=7390) (actual time=118.488..122.408 rows=4569 loops=1)
                     ->  Sort  (cost=194238118.98..194383815.92
rows=58278779 width=7390) (actual time=118.486..119.723 rows=3687 loops=1)
                           Sort Key: t.id
                           Sort Method: quicksort  Memory: 6951kB
                       ............................. sniped
         ->  Hash  (cost=1917.17..1917.17 rows=7555 width=68) (actual
time=37.578..37.578 rows=7555 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 958kB
                ............................ sniped

We can see the 'Hash Left Join node' spend 2645.293 ms, so I change the sql
and remove the left join node, and expect the SQL will end with 653.137 ms.
But the explain analyse output like

   ->  Merge Join  (cost=197548498.05..314933709300.46 rows=5245599776015
width=7478) (actual time=775.495..2744.759 rows=4568 loops=1)
         Merge Cond: (a3.team_id = t.id)
         ->  Sort  (cost=3509574.57..3554578.91 rows=18001735 width=112)
(actual time=540.636..541.392 rows=4569 loops=1)
               Sort Key: a3.team_id
               Sort Method: quicksort  Memory: 549kB
               .......................... sniped
         ->  Materialize  (cost=194038923.48..194330317.37 rows=58278779
width=7382) (actual time=234.632..238.065 rows=4569 loops=1)
               ->  Sort  (cost=194038923.48..194184620.42 rows=58278779
width=7382) (actual time=234.628..235.744 rows=3687 loops=1)
                     Sort Key: t.id
                     Sort Method: quicksort  Memory: 6899kB
                 ......................... sniped

We can see the 'Merge Join' spend 2744.759 ms. The sniped output under the
'Merge Join' node in the two explain output are similar, Why does this
happen?

Re: Weird explain output

От
Tom Lane
Дата:
wd <wd@wdicc.com> writes:
> I have a long SQL with many join, the explain analyse output like
>    ->  Hash Left Join  (cost=197749705.15..5320438982425.18
> rows=5245599776015 width=7542) (actual time=674.634..2645.293 rows=4568
> loops=1)
>          Hash Cond: (p.id = ps.partner_id)
>          ->  Merge Join  (cost=197747693.55..78881918575.29
> rows=5245599776015 width=7486) (actual time=636.814..653.137 rows=4568
> loops=1)
>                Merge Cond: (a3.team_id = t.id)
>                ->  Sort  (cost=3509574.57..3554578.91 rows=18001735
> width=112) (actual time=518.318..519.221 rows=4569 loops=1)
>                      Sort Key: a3.team_id
>                      Sort Method: quicksort  Memory: 549kB
>                       ..................... sniped
>                ->  Materialize  (cost=194238118.98..194529512.87
> rows=58278779 width=7390) (actual time=118.488..122.408 rows=4569 loops=1)
>                      ->  Sort  (cost=194238118.98..194383815.92
> rows=58278779 width=7390) (actual time=118.486..119.723 rows=3687 loops=1)
>                            Sort Key: t.id
>                            Sort Method: quicksort  Memory: 6951kB
>                        ............................. sniped
>          ->  Hash  (cost=1917.17..1917.17 rows=7555 width=68) (actual
> time=37.578..37.578 rows=7555 loops=1)
>                Buckets: 1024  Batches: 1  Memory Usage: 958kB
>                 ............................ sniped

> We can see the 'Hash Left Join node' spend 2645.293 ms, so I change the sql
> and remove the left join node, and expect the SQL will end with 653.137 ms.
> But the explain analyse output like

>    ->  Merge Join  (cost=197548498.05..314933709300.46 rows=5245599776015
> width=7478) (actual time=775.495..2744.759 rows=4568 loops=1)
>          Merge Cond: (a3.team_id = t.id)
>          ->  Sort  (cost=3509574.57..3554578.91 rows=18001735 width=112)
> (actual time=540.636..541.392 rows=4569 loops=1)
>                Sort Key: a3.team_id
>                Sort Method: quicksort  Memory: 549kB
>                .......................... sniped
>          ->  Materialize  (cost=194038923.48..194330317.37 rows=58278779
> width=7382) (actual time=234.632..238.065 rows=4569 loops=1)
>                ->  Sort  (cost=194038923.48..194184620.42 rows=58278779
> width=7382) (actual time=234.628..235.744 rows=3687 loops=1)
>                      Sort Key: t.id
>                      Sort Method: quicksort  Memory: 6899kB
>                  ......................... sniped

> We can see the 'Merge Join' spend 2744.759 ms. The sniped output under the
> 'Merge Join' node in the two explain output are similar, Why does this
> happen?

You're worrying about the wrong thing entirely.  The right thing to be
worrying about is why are some of those row estimates off by four orders
of magnitude, and what you can do to fix that.  The planner will never
deliver sane plans when its estimates are so far off from reality.

From my perspective, you snipped all the interesting parts of this
output, because the misestimation is evidently happening somewhere down
in there.

            regards, tom lane

Re: Weird explain output

От
wd
Дата:
Thanks for your reply.


On Mon, Feb 4, 2013 at 3:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> You're worrying about the wrong thing entirely.  The right thing to be
> worrying about is why are some of those row estimates off by four orders
> of magnitude, and what you can do to fix that.  The planner will never
> deliver sane plans when its estimates are so far off from reality.
>
>
You mean the rows from explain is bigger then the rows real returned?
It caused by a sub query with a filter like ( a and b ) or ( c and d ),
I've tried to create a temp table for it, the explained rows and the real
rows will similarly.

Here is the full output, http://explain.depesz.com/s/M7oo


> From my perspective, you snipped all the interesting parts of this
> output, because the misestimation is evidently happening somewhere down
> in there.
>
>                         regards, tom lane
>