Обсуждение: After VACUUM, statistics become skewed

Поиск
Список
Период
Сортировка

After VACUUM, statistics become skewed

От
Robert.Farrugia@go.com.mt
Дата:

I have noticed the following problem on various large tables for certain queries.

Given a table around 4GB in size containing millions of records, before vacuuming the following query used the correct index (mo_200302_called_idx) which is built on answertime and callednumber_type.

Query is as follows:
select answertime::date, count(*), sum(callduration) from mobileorig_200302 where answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and callednumber_type in ('P', 'M') group by answertime::date

After doing a database wide vacuum full analyze, the above query is starting to use an incorrect index i.e. the callingnumber_type (which uses answertime and callingnumber_type fields).  This eventually slows down the system since the query takes more time to finish.

NOTICE:  QUERY PLAN:

Aggregate  (cost=9218923.08..9218935.56 rows=166 width=12)
  ->  Group  (cost=9218923.08..9218927.24 rows=1664 width=12)
        ->  Sort  (cost=9218923.08..9218923.08 rows=1664 width=12)
              ->  Index Scan using mo_200302_calling_idx on mobileorig_200302  (cost=0.00..9218834.06 rows=1664 width=12)

EXPLAIN


I managed to track down the problem to this.  If I used only one callednumber_type, i.e. the query becomes
select answertime::date, count(*), sum(callduration) from mobileorig_200302 where answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59' and normal(dialleddigits_value) = '50043992' and callednumber_type in ('M') group by answertime::date

the query planner uses the correct index, while using more than one type, it skews up.

NOTICE:  QUERY PLAN:
Aggregate  (cost=137870.25..137871.57 rows=18 width=12)
  ->  Group  (cost=137870.25..137870.69 rows=177 width=12)
        ->  Sort  (cost=137870.25..137870.25 rows=177 width=12)
              ->  Index Scan using mo_200302_called_idx on mobileorig_200302  (cost=0.00..137863.66 rows=177 width=12)
EXPLAIN

Anyone else encountered something similar ? Any ideas on what is happening and if is solvable ?  Usually by dropping the table and reloading it from disk may solve the problem (it may take more than one try), but this is becoming unpractical due to the amount of data to restore each time.  I'll try re-indexing the tables and see if something happens.  Unfortunately this also happens on tables which have not been changed for a very long time and have also been re-indexed so I'm not confident on this.

The system runs on postgres 7.2.3.

Regards
Robert

Re: After VACUUM, statistics become skewed

От
Tom Lane
Дата:
Robert.Farrugia@go.com.mt writes:
> Given a table around 4GB in size containing millions of records, before
> vacuuming the following query used the correct index
> (mo_200302_called_idx) which is built on answertime and callednumber_type.

Could we see EXPLAIN ANALYZE results for both the good plan and the bad
plan, for the same query?  You're not giving us comparable results.

It would also be useful to show the pg_stats entries for the table.
I suspect that the planner is being fooled by some odd statistics of
your data, but we have not got any info here with which to speculate.

> Usually by dropping the table and reloading it from
> disk may solve the problem (it may take more than one try), but this is
> becoming unpractical due to the amount of data to restore each time.

You can get back to the un-analyzed state by deleting the rows in
pg_statistic for this table, viz
    delete from pg_statistic where starelid = (select oid from
        pg_class where relname = 'mobileorig_200302');
Grotty, but it sure beats dropping and reloading a big table.

            regards, tom lane

Unsubscribe

От
"Q"
Дата:
Can anyone tell me how to unsubscribe?
I'm really sorry, I don't see any instructions anywhere.

Re: Unsubscribe

От
Gary Stainburn
Дата:
On Wednesday 21 May 2003 7:07 pm, Q wrote:
> Can anyone tell me how to unsubscribe?
> I'm really sorry, I don't see any instructions anywhere.

One option is to go to http://developer.postgresql.org/mailsub.php,
choose the admin list, choose unsubscribe and enter your email address.

HTH

Gary
--
Gary Stainburn

This email does not contain private or confidential material as it
may be snooped on by interested government parties for unknown
and undisclosed purposes - Regulation of Investigatory Powers Act, 2000


Re: After VACUUM, statistics become skewed

От
Robert.Farrugia@go.com.mt
Дата:

Hi Tom,

I delete the rows from the pg_statistics table and rerun the same query;  Here are the results.

NOTICE:  QUERY PLAN:
Aggregate  (cost=218105.86..218105.88 rows=1 width=12)
  ->  Group  (cost=218105.86..218105.87 rows=3 width=12)
        ->  Sort  (cost=218105.86..218105.86 rows=3 width=12)
              ->  Index Scan using mo_200302_calling_idx on mobileorig_200302  (cost=0.00..218105.84 rows=3 width=12)
EXPLAIN

As you can see it is still using the wrong index but the cost is less.

I am attaching the output of the pg_stats table for the table shown in the example.



Thanks
Robert



Tom Lane <tgl@sss.pgh.pa.us>

21/05/2003 18:51

       
        To:        Robert.Farrugia@go.com.mt
        cc:        pgsql-admin@postgresql.org
        Subject:        Re: [ADMIN] After VACUUM, statistics become skewed



Robert.Farrugia@go.com.mt writes:
> Given a table around 4GB in size containing millions of records, before
> vacuuming the following query used the correct index
> (mo_200302_called_idx) which is built on answertime and callednumber_type.

Could we see EXPLAIN ANALYZE results for both the good plan and the bad
plan, for the same query?  You're not giving us comparable results.

It would also be useful to show the pg_stats entries for the table.
I suspect that the planner is being fooled by some odd statistics of
your data, but we have not got any info here with which to speculate.

> Usually by dropping the table and reloading it from
> disk may solve the problem (it may take more than one try), but this is
> becoming unpractical due to the amount of data to restore each time.

You can get back to the un-analyzed state by deleting the rows in
pg_statistic for this table, viz
                delete from pg_statistic where starelid = (select oid from
                                 pg_class where relname = 'mobileorig_200302');
Grotty, but it sure beats dropping and reloading a big table.

                                                  regards, tom lane


Вложения

Re: After VACUUM, statistics become skewed

От
Tom Lane
Дата:
Robert.Farrugia@go.com.mt writes:
> Here are the results.

I asked for EXPLAIN ANALYZE for both cases.  When you're worried about
planner estimates diverging from reality, it does not help to not know
what reality is.

            regards, tom lane

Re: After VACUUM, statistics become skewed

От
Robert.Farrugia@go.com.mt
Дата:

Tom,

I deleted the statistics from the pg_stats table and ran the following.

CDR=# explain analyze verbose select answertime::date, count(*), sum(callduration) from mobileorig_200302 where answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59' and dialleddigits_value = '50043992' and callednumber_type in ('P', 'M') group by answertime::date;
NOTICE:  QUERY DUMP:

{ AGG :startup_cost 217970.15 :total_cost 217970.18 :rows 1 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1082 :restypmod -1 :resname answertime :reskey 0 :reskeyop 0 :ressortgroupref 1 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 20 :restypmod -1 :resname count :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname count :basetype 0 :aggtype 20 :target { CONST :consttype 23 :constlen 4 :constbyval true :constisnull false :constvalue  4 [ 1 0 0 0 ] } :aggstar true :aggdistinct false }} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 20 :restypmod -1 :resname sum :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname sum :basetype 23 :aggtype 20 :target { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :v arnoold 1 :varoattno 17} :aggstar false :aggdistinct false }}) :qpqual <> :lefttree { GRP :startup_cost 217970.15 :total_cost 217970.16 :rows 3 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}}) :qpqual <> :lefttree { SORT :startup_cost 217970.15 :total_cost 217970.15 :rows 3 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 1 :reskeyop 1095 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual <> :lefttree { INDEXSCAN :startup_cost 0.00 :total_cost 217970.14 :rows 3 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual ({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1062 :opid 1070 :opresulttype 16 } :args ( { VAR :varno 1 :varattno 13 :vartype 1043 :vartypmod 35  :varlevelsup 0 :varnoold 1 :varoattno 13} { CONST :consttype 1043 :constlen -1 :constbyval false :constisnull false :constvalue  12 [ 12 0 0 0 53 48 48 52 51 57 57 50 ] })} { EXPR :typeOid 16  :opType or :oper <> :args ({ EXPR :typeOid 16  :opType op :oper { OPER
:opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 20 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :constvalue  5 [ 5 0 0 0 80 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 20 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :constvalue  5 [ 5 0 0 0 77 ] })})}) :lefttree <> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1 :indxid ( 2591699101) :indxqual (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })})) :indxqualorig (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :cons tlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })})) :indxorderdir 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :keycount 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :numCols 1 :tuplePerGroup true } :righttree <> :extprm () :locprm () :initplan <> :nprm 0 }
NOTICE:  QUERY PLAN:

Aggregate  (cost=217970.15..217970.18 rows=1 width=12) (actual time=420082.30..420085.05 rows=7 loops=1)
  ->  Group  (cost=217970.15..217970.16 rows=3 width=12) (actual time=420082.24..420084.14 rows=417 loops=1)
        ->  Sort  (cost=217970.15..217970.15 rows=3 width=12) (actual time=420082.22..420082.60 rows=417 loops=1)
              ->  Index Scan using mo_200302_calling_idx on mobileorig_200302  (cost=0.00..217970.14 rows=3 width=12) (actual time=2965.92..420078.94 rows=417 loops=1)
Total runtime: 420085.34 msec

EXPLAIN


I then ran analyze  on the table and reran the same query.

CDR=# explain analyze verbose select answertime::date, count(*), sum(callduration) from mobileorig_200302 where answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59' and dialleddigits_value = '50043992' and callednumber_type in ('P', 'M') group by answertime::date;
NOTICE:  QUERY DUMP:

{ AGG :startup_cost 9265841.46 :total_cost 9265841.63 :rows 2 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1082 :restypmod -1 :resname answertime :reskey 0 :reskeyop 0 :ressortgroupref 1 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 20 :restypmod -1 :resname count :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname count :basetype 0 :aggtype 20 :target { CONST :consttype 23 :constlen 4 :constbyval true :constisnull false :constvalue  4 [ 1 0 0 0 ] } :aggstar true :aggdistinct false }} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 20 :restypmod -1 :resname sum :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname sum :basetype 23 :aggtype 20 :target { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17} :aggstar false :aggdistinct false }}) :qpqual <> :lefttree { GRP :startup_cost 9265841.46 :total_cost 9265841.52 :rows 24 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}}) :qpqual <> :lefttree { SORT :startup_cost 9265 841.46 :total_cost 9265841.46 :rows 24 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 1 :reskeyop 1095 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual <> :lefttree { INDEXSCAN :startup_cost 0.00 :total_cost 92 65840.92 :rows 24 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual ({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1062 :opid 1070 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 13 :vartype 1043 :vartypmod 35  :varlevelsup 0 :varnoold 1 :varoattno 13} { CONST :consttype 1043 :constlen -1 :constbyval false :constisnull false :constvalue  12 [ 12 0 0 0 53 48 48 52 51 57 57 50 ] })} { EXPR :typeOid 16  :opType or :oper <> :args ({ EXPR :typeOid 16  :opType op :op
er { OPER :opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 20 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :constvalue  5 [ 5 0 0 0 80 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 20 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :constvalue  5 [ 5 0 0 0 77 ] })})}) :lefttree <> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1 :indxid ( 2591699101) :indxqual (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 : constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })})) :indxqualorig (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })})) :indxorderdir 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :keycount 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :numCols 1 :tuplePerGroup true } :righttree <> :extprm () :locprm () :initplan <> :nprm 0 }
NOTICE:  QUERY PLAN:

Aggregate  (cost=9265841.46..9265841.63 rows=2 width=12) (actual time=340838.88..340841.70 rows=7 loops=1)
  ->  Group  (cost=9265841.46..9265841.52 rows=24 width=12) (actual time=340838.83..340840.75 rows=417 loops=1)
        ->  Sort  (cost=9265841.46..9265841.46 rows=24 width=12) (actual time=340838.81..340839.19 rows=417 loops=1)
              ->  Index Scan using mo_200302_calling_idx on mobileorig_200302  (cost=0.00..9265840.92 rows=24 width=12) (actual time=4156.92..340836.03 rows=417 loops=1)
Total runtime: 340842.06 msec

EXPLAIN


Running a slightly modified query i.e. removing the in and using an equality, the planner uses the "correct" index this time.  Here are the results.

CDR=# explain analyze verbose select answertime::date, count(*), sum(callduration) from mobileorig_200302 where answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59' and dialleddigits_value = '50043992' and callednumber_type = 'P' group by answertime::date;        
NOTICE:  QUERY DUMP:

{ AGG :startup_cost 121007.20 :total_cost 121007.22 :rows 1 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1082 :restypmod -1 :resname answertime :reskey 0 :reskeyop 0 :ressortgroupref 1 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 20 :restypmod -1 :resname count :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname count :basetype 0 :aggtype 20 :target { CONST :consttype 23 :constlen 4 :constbyval true :constisnull false :constvalue  4 [ 1 0 0 0 ] } :aggstar true :aggdistinct false }} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 20 :restypmod -1 :resname sum :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname sum :basetype 23 :aggtype 20 :target { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :v arnoold 1 :varoattno 17} :aggstar false :aggdistinct false }}) :qpqual <> :lefttree { GRP :startup_cost 121007.20 :total_cost 121007.21 :rows 2 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 2 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 0 :varattno 3 :vartype 1082 :vartypmod -1  :varlevelsup 0 :varnoold 0 :varoattno 0}}) :qpqual <> :lefttree { SORT :startup_cost 121007.20 :total_cost 121007.20 :rows 2 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 1 :reskeyop 1095 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual <> :lefttree { INDEXSCAN :startup_cost 0.00 :total_cost 121007.19 :rows 2 :width 12 :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 1184 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8}} { TARGETENTRY :resdom { RESDOM :resno 2 :restype 23 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 17 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 17}} { TARGETENTRY :resdom { RESDOM :resno 3 :restype 1082 :restypmod -1 :resname <> :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { EXPR :typeOid 1082  :opType func :oper { FUNC :funcid 1178 :functype 1082 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8})}}) :qpqual ({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1062 :opid 1070 :opresulttype 16 } :args ( { VAR :varno 1 :varattno 13 :vartype 1043 :vartypmod 35  :varlevelsup 0 :varnoold 1 :varoattno 13} { CONST :consttype 1043 :constlen -1 :constbyval false :constisnull false :constvalue  12 [ 12 0 0 0 53 48 48 52 51 57 57 50 ] })}) :lefttree <> :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1 :indxi
d ( 2591699100) :indxqual (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 2 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :c onstvalue  5 [ 5 0 0 0 80 ] })})) :indxqualorig (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 1325 :opid 2056 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -64 125 65 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1323 :opid 2055 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 8 :vartype 1184 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 8} { CONST :consttype 1184 :constlen 8 :constbyval false :constisnull false :constvalue  8 [ 0 0 0 -68 -59 -128 -105 65 ] })} { EXPR :typeOid 16  :opType op :oper { OPER :opno 1054 :opid 1048 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 20 :vartype 1042 :vartypmod 5  :varlevelsup 0 :varnoold 1 :varoattno 20} { CONST :consttype 1042 :constlen -1 :constbyval false :constisnull false :constva lue  5 [ 5 0 0 0 80 ] })})) :indxorderdir 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :keycount 1 } :righttree <> :extprm () :locprm () :initplan <> :nprm 0  :numCols 1 :tuplePerGroup true } :righttree <> :extprm () :locprm () :initplan <> :nprm 0 }
NOTICE:  QUERY PLAN:

Aggregate  (cost=121007.20..121007.22 rows=1 width=12) (actual time=10295.75..10298.57 rows=7 loops=1)
  ->  Group  (cost=121007.20..121007.21 rows=2 width=12) (actual time=10295.71..10297.63 rows=417 loops=1)
        ->  Sort  (cost=121007.20..121007.20 rows=2 width=12) (actual time=10295.69..10296.08 rows=417 loops=1)
              ->  Index Scan using mo_200302_called_idx on mobileorig_200302  (cost=0.00..121007.19 rows=2 width=12) (actual time=250.15..10293.56 rows=417 loops=1)
Total runtime: 10298.86 msec

EXPLAIN


Is this what you wanted or am I misunderstanding you ?

Regards
Robert



Tom Lane <tgl@sss.pgh.pa.us>

22/05/2003 17:51

       
        To:        Robert.Farrugia@go.com.mt
        cc:        pgsql-admin@postgresql.org
        Subject:        Re: [ADMIN] After VACUUM, statistics become skewed



Robert.Farrugia@go.com.mt writes:
> Here are the results.

I asked for EXPLAIN ANALYZE for both cases.  When you're worried about
planner estimates diverging from reality, it does not help to not know
what reality is.

                                                  regards, tom lane


Re: After VACUUM, statistics become skewed

От
Tom Lane
Дата:
Robert.Farrugia@go.com.mt writes:
> Running a slightly modified query i.e. removing the in and using an
> equality, the planner uses the "correct" index this time.

Hm.  What were the definitions of the two indexes, exactly?

There is an asymmetry in the planner's handling of AND/OR clauses that
can cause it to fail to recognize that a multicolumn index is usable,
depending on whether the OR clause (the IN part of your query) applies
to the first index column or a later one.  I'm not sure if this relates
to your problem or not, though...

            regards, tom lane

Re: After VACUUM, statistics become skewed

От
Robert.Farrugia@go.com.mt
Дата:

Sorry Tom for the late reply but I was off for the past few days.

The definitions of the indexes are:

mo_200302_calling_idx uses (answertime::timestamp, callingnumber_type::char(1), callingnumber_value::varchar) in this order
mo_200302_called_idx uses (answertime::timestamp, callednumber_type::char(1), callednumber_value::varchar) in this order

If I am understanding you well, the planner may ignore the index if the OR clause uses the second or third column of a multi-column index, even though the first column is used.  Is there a way to force the planner to use a particular index ?

Regards
Robert




Tom Lane <tgl@sss.pgh.pa.us>

25/05/2003 06:12

       
        To:        Robert.Farrugia@go.com.mt
        cc:        pgsql-admin@postgresql.org
        Subject:        Re: [ADMIN] After VACUUM, statistics become skewed



Robert.Farrugia@go.com.mt writes:
> Running a slightly modified query i.e. removing the in and using an
> equality, the planner uses the "correct" index this time.

Hm.  What were the definitions of the two indexes, exactly?

There is an asymmetry in the planner's handling of AND/OR clauses that
can cause it to fail to recognize that a multicolumn index is usable,
depending on whether the OR clause (the IN part of your query) applies
to the first index column or a later one.  I'm not sure if this relates
to your problem or not, though...

                                                  regards, tom lane


Re: After VACUUM, statistics become skewed

От
Tom Lane
Дата:
Robert.Farrugia@go.com.mt writes:
> The definitions of the indexes are:

> mo_200302_calling_idx uses (answertime::timestamp,
> callingnumber_type::char(1), callingnumber_value::varchar) in this order
> mo_200302_called_idx uses (answertime::timestamp,
> callednumber_type::char(1), callednumber_value::varchar) in this order

> If I am understanding you well, the planner may ignore the index if the OR
> clause uses the second or third column of a multi-column index, even
> though the first column is used.  Is there a way to force the planner to
> use a particular index ?

In this particular case the problem is that the planner does not believe
the index is applicable.  The only way to make it think that is to
change the query or the index definition so that the OR clause (i.e.,
the IN) contains references to the first index column.  For example, you
could rewrite your query from its current effective form:

select ... where
    answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59'
    and dialleddigits_value = '50043992'
    and (callednumber_type = 'P' or callednumber_type = 'M')

to something like

select ... where
    (answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59'
     and dialleddigits_value = '50043992'
     and callednumber_type = 'P')
   OR
    (answertime between '2003/2/3 00:00:00' and '2003/2/14 23:59:59'
     and dialleddigits_value = '50043992'
     and callednumber_type = 'M')

which can be used for two indexscans on the first two columns of
mo_200302_called_idx.  (BTW, if this is a frequent query, maybe you
should have an index that includes dialleddigits_value as the third
column?)  You could factor out "and dialleddigits_value = '50043992'"
or not here, that doesn't matter; but the OR'd-together subclauses
have to all mention the leading term of the index, ie, answertime.

Plan B would be to leave the query alone and change the index column
order so that callednumber_type is first.  This is probably not a great
idea though, since answertime is a much better first key for most
purposes.

It's moderately annoying that the planner is not bright enough to figure
out this transformation for itself.  I suspect what is needed is a whole
fresh look at the OR-index processing algorithm; it seems excessively
complicated and yet it still doesn't do everything one would want.

            regards, tom lane