Re: Estimation row error

Поиск
Список
Период
Сортировка
От Pavel Stehule
Тема Re: Estimation row error
Дата
Msg-id CAFj8pRBv0Jhukw_z1s8TZUa8dSGDx3S9BDLb0w6uf_Ecxy5Rtg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Estimation row error  (Mathieu VINCENT <mathieu.vincent@pmsipilot.com>)
Список pgsql-performance
Hi

2015-12-18 16:21 GMT+01:00 Mathieu VINCENT <mathieu.vincent@pmsipilot.com>:
Hello,

No one to help me to understand this bad estimation rows ?
It's NOT caused by :
  • correlation between columns (cross-correlation)
  • bad statistics (i tried with  default_statistics_target to 10 000)
  • bad number of distinct values
  • complexe join conditions
I have no more ideas.

PostgreSQL has not cross tables statistics - so expect uniform distribution of foreign keys.  This expectation is broken in your example.

You can find some prototype solutions by Tomas Vondra in hackars mailing list.

Regards

Pavel
 

thank you for your help.
Mathieu VINCENT

2015-12-17 11:58 GMT+01:00 Mathieu VINCENT <mathieu.vincent@pmsipilot.com>:
Adding foreign key between on t2 and t3, does not change the plan.

drop table if exists t1;
drop table if exists t2;
drop table if exists t3;

create table t1 as select generate_Series(1,200000) as c1; 
create table t2 as select generate_Series(1,200000)%100+1 as c1; 
create table t3 as select generate_Series(1,1500)%750+1 as c1;

alter table t1 add PRIMARY KEY (c1);
create index on t2 (c1);
create index on t3 (c1);
ALTER TABLE t2  ADD CONSTRAINT t2_fk FOREIGN KEY (c1) REFERENCES t1(c1);
ALTER TABLE t3  ADD CONSTRAINT t3_fk FOREIGN KEY (c1) REFERENCES t1(c1);

analyze verbose t1;
analyze verbose t2;
analyze verbose t3;

EXPLAIN (analyze on, buffers on, verbose on)
select 
*
from 
t1 t1
inner join t2 on t1.c1=t2.c1
inner join t3 on t1.c1=t3.c1

Cordialement,
PSIH Décisionnel en santé
Mathieu VINCENT 
Data Analyst
PMSIpilot - 61 rue Sully - 69006 Lyon - France

2015-12-17 11:37 GMT+01:00 Mathieu VINCENT <mathieu.vincent@pmsipilot.com>:
Here, another issue with row estimate.
And, in this example, there is not correlation beetween columns in a same table.

drop table if exists t1;
drop table if exists t2;
drop table if exists t3;

create table t1 as select generate_Series(1,200000) as c1; 
create table t2 as select generate_Series(1,200000)%100 as c1; 
create table t3 as select generate_Series(1,1500)%750 as c1;

alter table t1 add PRIMARY KEY (c1);
create index on t2 (c1);
create index on t3 (c1);

analyze verbose t1;
analyze verbose t2;
analyze verbose t3;

EXPLAIN (analyze on, buffers on, verbose on)
select 
*
from 
t1 t1
inner join t2 on t1.c1=t2.c1
inner join t3 on t2.c1=t3.c1
Do you understand how postgresql calculate the row estimate ?

BR
Mathieu VINCENT

2015-12-17 10:14 GMT+01:00 Matteo Grolla <matteo.grolla@gmail.com>:
Thank you both for the help!
happy holidays

2015-12-17 10:10 GMT+01:00 Mathieu VINCENT <mathieu.vincent@pmsipilot.com>:
thks Gunnar,

I removed the correlation between t3.c1 and t3.c2 in this sql script :

drop table if exists t1;
drop table if exists t2;
drop table if exists t3;
drop table if exists t4;

create table t1 as select generate_Series(1,300000) as c1; 
create table t2 as select generate_Series(1,400) as c1; 
create table t3 as select floor(random()*100+1) as c1, c2 from generate_Series(1,200000) c2;
create table t4 as select generate_Series(1,200000) as c1;

alter table t1 add PRIMARY KEY (c1);
alter table t2 add PRIMARY KEY (c1);
alter table t3 add PRIMARY KEY (c1,c2);
create index on t3 (c1);
create index on t3 (c2);
alter table t4 add PRIMARY KEY (c1);

analyze verbose t1;
analyze verbose t2;
analyze verbose t3;
analyze verbose t4;

EXPLAIN (analyze on, buffers on, verbose on)
select 
*
from 
t1 t1
inner join t2 on t1.c1=t2.c1
inner join t3 on t2.c1=t3.c1
inner join t4 on t3.c2=t4.c1

Now, the estimate is good : http://explain.depesz.com/s/gCX

Have a good day

Mathieu VINCENT

2015-12-15 11:21 GMT+01:00 Gunnar "Nick" Bluth <gunnar.bluth.extern@elster.de>:
Am 15.12.2015 um 10:49 schrieb Andreas Kretschmer:
> Gunnar Nick Bluth <gunnar.bluth.extern@elster.de> wrote:
>
>> Am 15.12.2015 um 09:05 schrieb Mathieu VINCENT:
>>> Hello,
>>>
>>> No one to help me to understand this bad estimation rows ?
>>
>> Well,
>>
>> on a rather beefy machine, I'm getting quite a different plan:
>> http://explain.depesz.com/s/3y5r
>
> you are using 9.5, right? Got the same plan with 9.5.

Nope...:
                                                  version

------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.5 on x86_64-unknown-linux-gnu, compiled by gcc
(Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

So much for those correlation improvements then ;-/


> Btw.: Hi Gunnar ;-)

Hi :)

--
Gunnar "Nick" Bluth
DBA ELSTER

Tel:   +49 911/991-4665
Mobil: +49 172/8853339


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance






В списке pgsql-performance по дате отправления:

Предыдущее
От: Mathieu VINCENT
Дата:
Сообщение: Re: Estimation row error
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: Can't explain db size