Does "correlation" mislead the optimizer on large tables?

From: Ron Mayer
Subject: Does "correlation" mislead the optimizer on large tables?
Date:
Msg-id: Pine.LNX.4.44.0301231909070.986-100000@localhost.localdomain
In reply to: Re: Same query, same performance  ("alexandre :: aldeia digital" <alepaes@aldeiadigital.com.br>)
Responses: Re: Does "correlation" mislead the optimizer on large tables?  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-performance
Short summary:

  On large tables, I think the "correlation" pg_stats field as calculated
  by "vacuum analyze" or "analyze" can mislead the optimizer.

  By forcing index scans on the queries shown below, some queries
  in my database sped up from 197 seconds to under 30 seconds.

  I'd like feedback on whether or not a smarter "analyze"
  function (which I think I could write as a separate utility) would
  help in situations like this.

Longer:

  In particular, if I have a large table t with columns 'a','b','c', etc,
  and I cluster the table as follows:

    create table t_ordered as select * from t order by a,b;
    vacuum analyze t_ordered;

  Column "b" will (correctly) get a very low "correlation" in
  the pg_stats table -- but I think the optimizer would do better
  assuming a high correlation because similar 'b' values are still
  grouped closely on the same disk pages.
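
  For example (a minimal check, using the t_ordered table above),
  pg_stats will report the near-zero correlation for "b" even though
  consecutive "b" values share disk pages:

    select attname, n_distinct, correlation
      from pg_stats
     where tablename = 't_ordered';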



  Below is a real-world example of this issue.

  The table "fact" is a large one (reltuples = 1e8, relpages = 1082385)
  and contains about one year's worth of data.  The data was loaded
  sequentially (ordered by dat, tim).

    logs=# \d fact;
            Table "fact"
     Column |          Type          | Modifiers
    --------+------------------------+-----------
     dat    | date                   |
     tim    | time without time zone |
     ip_id  | integer                |
     bid_id | integer                |
     req_id | integer                |
     ref_id | integer                |
     uag_id | integer                |
    Indexes: i_fact_2__bid_id,
             i_fact_2__dat,
             i_fact_2__tim,
             i_fact_2__ip_id,
             i_fact_2__ref_id,
             i_fact_2__req_id


  With a table this large, each day's worth of data occupies
  about 3000 pages; conversely, each page covers only about
  a 30-second range of values for "tim".
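
  The arithmetic behind those figures, from the reltuples/relpages
  numbers above:

    1082385 pages / 365 days           ≈ 2965 pages per day
    86400 seconds per day / 2965 pages ≈ 29 seconds of "tim" per page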

  As the queries below show, the optimizer chose a sequential
  scan when looking at a 10-minute slice of the day.  However, as
  also shown, forcing an index scan did much better.

  I'm guessing this happened because the optimizer saw the
  near-zero correlation for "tim" and concluded that an index scan
  would have to read an enormous number of pages.  Note how close
  the two cost estimates below are (1949838 for the sequential scan
  vs. 2110978 for the index scan): even a modestly higher correlation
  estimate would have tipped the planner toward the index.

===========================================

logs=# select tablename,attname,n_distinct,correlation from pg_stats where tablename='fact';
 tablename | attname | n_distinct | correlation
-----------+---------+------------+-------------
 fact      | dat     |        365 |           1
 fact      | tim     |      80989 | -0.00281447
 fact      | ip_id   |      44996 |    0.660689
 fact      | bid_id  |     742850 |    0.969026
 fact      | req_id  |       2778 |     0.67896
 fact      | ref_id  |        595 |    0.258023
 fact      | uag_id  |        633 |    0.234216
(7 rows)


logs=# explain analyze select * from fact where tim<'00:10:00';
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on fact  (cost=0.00..1949838.40 rows=526340 width=32) (actual time=0.39..197447.50 rows=402929 loops=1)
   Filter: (tim < '00:10:00'::time without time zone)
 Total runtime: 197810.01 msec
(3 rows)

logs=# explain analyze select * from fact where tim<'00:10:00';
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Seq Scan on fact  (cost=0.00..1949838.40 rows=526340 width=32) (actual time=15.25..156705.76 rows=402929 loops=1)
   Filter: (tim < '00:10:00'::time without time zone)
 Total runtime: 157089.15 msec
(3 rows)

logs=# set enable_seqscan = off;
SET
logs=# explain analyze select * from fact where tim<'00:10:00';
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i__fact__tim on fact  (cost=0.00..2110978.39 rows=526340 width=32) (actual time=104.41..23307.84 rows=402929 loops=1)
   Index Cond: (tim < '00:10:00'::time without time zone)
 Total runtime: 23660.95 msec
(3 rows)

logs=# explain analyze select * from fact where tim<'00:10:00';
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i__fact__tim on fact  (cost=0.00..2110978.39 rows=526340 width=32) (actual time=0.03..1477.35 rows=402929 loops=1)
   Index Cond: (tim < '00:10:00'::time without time zone)
 Total runtime: 1827.94 msec
(3 rows)



logs=#
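
(For completeness: as I understand it, "enable_seqscan = off" does not
forbid sequential scans outright; it just makes them look so expensive
that the planner avoids them whenever an alternative exists.  The
default behavior comes back with:

    set enable_seqscan = on;
)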

*******************************************************************************
*******************************************************************************


So two questions:

  a) Am I on to something, or is something else the reason why
     the optimizer chose the much slower sequential scan?

  b) If I did write an "analyze" that tried to set "correlation" values
     that took into account such local grouping of data, would anyone
     be interested?  (A rough sketch of the measurement I have in mind
     follows below.)
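
  For what it's worth, here is the kind of measurement such a smarter
  "analyze" could take: count the distinct heap pages holding one
  narrow slice of "tim".  If that count is close to rows divided by
  rows-per-page, the data is locally grouped even though the global
  correlation is near zero.  (The ctid::text / split_part trick below
  is an assumption on my part and may not work on every server version.)

    -- count the distinct heap pages holding one 10-minute slice of "tim"
    select count(distinct split_part(ctid::text, ',', 1)) as pages_touched,
           count(*)                                       as rows_matched
      from fact
     where tim < '00:10:00';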


       Ron





