Обсуждение: What am I missing? Explain row estimate wrong

Поиск
Список
Период
Сортировка

What am I missing? Explain row estimate wrong

От
Jorge Torralba
Дата:
Am I missing something really simple here?  If I run my calculations to determine how postgres determines the number of rows in the explain out put, my math using the method described in the documentation works just fine and I get a value which equals waht you see in the explain.

However, I have one table where I don't even come close. No matter what I set the statistics to and then vacuum analyze, I still get whacky numbers compared to the explain for the output. Am I just missing something here? Should I get some sleep and try again?

Here is what I am using for calculations.


set stats back to default of 100


alter table mytable alter id set statistics -1;

ALTER TABLE

vacuum analyze mytable;


explain select * from mytable where id < 12345;                

                                            QUERY PLAN                                             

---------------------------------------------------------------------------------------------------

 Bitmap Heap Scan on mytable  (cost=1444.81..73368.05 rows=76952 width=280)

   Recheck Cond: (id < 12345)

   ->  Bitmap Index Scan on index_mytable_on_id  (cost=0.00..1425.57 rows=76952 width=0)

         Index Cond: (id < 12345)

(4 rows)


select relname,  reltuples::int, relpages from pg_class where relname = 'mytable';                

  relname   | reltuples | relpages 

------------+-----------+----------

 mytable |   1721143 |    67711

(1 row)


select histogram_bounds from pg_stats where tablename ='mytable' and attname = 'id';                

                                                                                                                                                                                                                         

                                                                                               histogram_bounds                                                                                                          

                                                                                                                                                                                                               

 {12,2147,2365,2743,3811,5132,6775,7803,8968,10375,11707,11936,12899,14432,16179,18685,20394,22311,24273,26203,28511,30506,32012,33584,35527,37700,40837,43905,47046,49896,52464,54907,56477,58223,59807,61554,63450,6551

8,66818,68527,69938,71781,73462,74967,76427,78038,80525,82222,83690,85440,86522,88182,89681,90805,93176,95169,97513,99629,101172,103701,105274,107067,108432,109426,111668,114066,116641,118929,122604,125096,127514,1298

12,132095,133867,136484,137017,140121,142244,144270,145203,146996,149353,151230,153037,155255,157480,158277,161314,162013,162054,162298,162928,165418,168405,171089,173359,175749,178139,181864,183148,186855}

(1 row)


My Math .....

( ( 1 + ( 12345 - 11936 ) / ( 12899 - 11936 ) ) / 100 ) * 1721143 = 24521

24521 is not even close to rows=76952


Thanks for looking!




--
Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.

Re: What am I missing? Explain row estimate wrong

От
Feike Steenbergen
Дата:
> select histogram_bounds from pg_stats where tablename ='mytable' and attname = 'id';        

Do any of the < 12345 values appear in the most_common_vals (mcv) list? 
https://www.postgresql.org/docs/current/static/view-pg-stats.html

To my knowledge, mcv entries are calculated first, and then the histogram bounds are taken into account:
https://www.postgresql.org/docs/current/static/row-estimation-examples.html