Обсуждение: planner or statistical bug on 8.5

Поиск
Список
Период
Сортировка

planner or statistical bug on 8.5

От
Pavel Stehule
Дата:
Hello

I checked query and I was surprised with very strange plan:

postgres=# create table a(a int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"a_pkey" for table "a"
CREATE TABLE
postgres=# create table b(b int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"b_pkey" for table "b"
CREATE TABLE
postgres=# create table c(c int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"c_pkey" for table "c"
CREATE TABLE
postgres=# ANALYZE ;
ANALYZE
postgres=# explain select  a, b from a,b,c;                             QUERY PLAN
-----------------------------------------------------------------------Nested Loop  (cost=0.00..276595350.00
rows=13824000000width=8)  ->  Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)        ->  Seq Scan on a
(cost=0.00..34.00rows=2400 width=4)        ->  Materialize  (cost=0.00..82.00 rows=2400 width=4)              ->  Seq
Scanon b  (cost=0.00..34.00 rows=2400 width=4)  ->  Materialize  (cost=0.00..82.00 rows=2400 width=0)        ->  Seq
Scanon c  (cost=0.00..34.00 rows=2400 width=0)
 
(7 rows)

Regards
Pavel Stehule


Re: planner or statistical bug on 8.5

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Pavel Stehule
> Sent: Monday, January 11, 2010 11:55 PM
> To: PostgreSQL Hackers
> Cc: Tom Lane
> Subject: [HACKERS] planner or statistical bug on 8.5
> 
> Hello
> 
> I checked query and I was surprised with very strange plan:
> 
> postgres=# create table a(a int primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "a_pkey" for table "a"
> CREATE TABLE
> postgres=# create table b(b int primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "b_pkey" for table "b"
> CREATE TABLE
> postgres=# create table c(c int primary key);
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "c_pkey" for table "c"
> CREATE TABLE
> postgres=# ANALYZE ;
> ANALYZE
> postgres=# explain select  a, b from a,b,c;
>                               QUERY PLAN
> -----------------------------------------------------------------------
>  Nested Loop  (cost=0.00..276595350.00 rows=13824000000 width=8)
>    ->  Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)
>          ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
>          ->  Materialize  (cost=0.00..82.00 rows=2400 width=4)
>                ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
>    ->  Materialize  (cost=0.00..82.00 rows=2400 width=0)
>          ->  Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
> (7 rows)

You have no join columns, so it is a simple product.

Perhaps you meant something like this:
EXPLAIN SELECT a.a, b.b, c.c FROM a a, b b, c c WHERE a.a = b.b AND a.a = c.c



Re: planner or statistical bug on 8.5

От
Matteo Beccati
Дата:
Il 12/01/2010 08:55, Pavel Stehule ha scritto:
> I checked query and I was surprised with very strange plan:
>
> postgres=# explain select  a, b from a,b,c;
>                                QUERY PLAN
> -----------------------------------------------------------------------
>   Nested Loop  (cost=0.00..276595350.00 rows=13824000000 width=8)
>     ->   Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)
>           ->   Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
>           ->   Materialize  (cost=0.00..82.00 rows=2400 width=4)
>                 ->   Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
>     ->   Materialize  (cost=0.00..82.00 rows=2400 width=0)
>           ->   Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
> (7 rows)

It doesn't surprise me. Tables are empty, thus get a default non-0 row 
estimate, which happens to be 2400:

test=# create table a (a int);
CREATE TABLE
test=# ANALYZE a;
ANALYZE
test=# EXPLAIN SELECT * from a;                     QUERY PLAN
----------------------------------------------------- Seq Scan on a  (cost=0.00..14.80 rows=2400 width=4)
(1 row)


That said, 2400^3 (cross join of 3 tables) == 13824000000


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/


Re: planner or statistical bug on 8.5

От
Pavel Stehule
Дата:
2010/1/12 Matteo Beccati <php@beccati.com>:
> Il 12/01/2010 08:55, Pavel Stehule ha scritto:
>>
>> I checked query and I was surprised with very strange plan:
>>
>> postgres=# explain select  a, b from a,b,c;
>>                               QUERY PLAN
>> -----------------------------------------------------------------------
>>  Nested Loop  (cost=0.00..276595350.00 rows=13824000000 width=8)
>>    ->   Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)
>>          ->   Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
>>          ->   Materialize  (cost=0.00..82.00 rows=2400 width=4)
>>                ->   Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
>>    ->   Materialize  (cost=0.00..82.00 rows=2400 width=0)
>>          ->   Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
>> (7 rows)
>
> It doesn't surprise me. Tables are empty, thus get a default non-0 row
> estimate, which happens to be 2400:
>

I though so default estimate is used only when table wasn't analysed.
But you have a true. I am verifying it on 8.3 and the behave is same.

so all is ok.

Regards
Pavel Stehule

> test=# create table a (a int);
> CREATE TABLE
> test=# ANALYZE a;
> ANALYZE
> test=# EXPLAIN SELECT * from a;
>                     QUERY PLAN
> -----------------------------------------------------
>  Seq Scan on a  (cost=0.00..14.80 rows=2400 width=4)
> (1 row)
>
>
> That said, 2400^3 (cross join of 3 tables) == 13824000000
>
>
> Cheers
> --
> Matteo Beccati
>
> Development & Consulting - http://www.beccati.com/
>


Re: planner or statistical bug on 8.5

От
Robert Haas
Дата:
On Tue, Jan 12, 2010 at 3:08 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/1/12 Matteo Beccati <php@beccati.com>:
>> Il 12/01/2010 08:55, Pavel Stehule ha scritto:
>>>
>>> I checked query and I was surprised with very strange plan:
>>>
>>> postgres=# explain select  a, b from a,b,c;
>>>                               QUERY PLAN
>>> -----------------------------------------------------------------------
>>>  Nested Loop  (cost=0.00..276595350.00 rows=13824000000 width=8)
>>>    ->   Nested Loop  (cost=0.00..115292.00 rows=5760000 width=8)
>>>          ->   Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
>>>          ->   Materialize  (cost=0.00..82.00 rows=2400 width=4)
>>>                ->   Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4)
>>>    ->   Materialize  (cost=0.00..82.00 rows=2400 width=0)
>>>          ->   Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)
>>> (7 rows)
>>
>> It doesn't surprise me. Tables are empty, thus get a default non-0 row
>> estimate, which happens to be 2400:
>>
>
> I though so default estimate is used only when table wasn't analysed.
> But you have a true. I am verifying it on 8.3 and the behave is same.

Not quite totally the same. I got:
                           QUERY PLAN
------------------------------------------------------------------Nested Loop  (cost=36.40..276619270.40
rows=13824000000width=8)  ->  Nested Loop  (cost=0.00..139234.00 rows=5760000 width=4)        ->  Seq Scan on a
(cost=0.00..34.00rows=2400 width=4)        ->  Seq Scan on c  (cost=0.00..34.00 rows=2400 width=0)  ->  Materialize
(cost=36.40..60.40rows=2400 width=4)        ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=4) 

Tom made some changes for 8.5 that will result in materialization
being used in more places, and I think we're seeing that here.  The
planner thinks that materializing the inner side of the nestloop will
save it from going to disk for every iteration, but that's not really
true.  b will be fully cached anyway, but the planner doesn't know
that.  I think we need to think about this a little more before we let
this code out into the wild, or we'll get complaints about materialize
nodes being inserted in places where they only slow things down...

Mind you, it's not totally obvious to me what the solution is.

...Robert


Re: planner or statistical bug on 8.5

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> Tom made some changes for 8.5 that will result in materialization
> being used in more places, and I think we're seeing that here.  The
> planner thinks that materializing the inner side of the nestloop will
> save it from going to disk for every iteration, but that's not really
> true.  b will be fully cached anyway, but the planner doesn't know
> that.  I think we need to think about this a little more before we let
> this code out into the wild, or we'll get complaints about materialize
> nodes being inserted in places where they only slow things down...

I don't think it's a big deal.  The materialize node won't create much
of any slowdown unless its tuplestore gets big enough to spill to disk.
And at that point you're probably talking enough savings from avoided
visibility checks to make it worthwhile.

It's possible that the cost parameters in there need some fine-tuning,
but I don't think the costing model per se is wrong.  It's certainly
far less bogus than it was before.
        regards, tom lane