От: Manfred Koizar
Тема: Re: Sort and index
Дата: ,
Msg-id: ssa481td7d7jpl31rqdf9m0c64iui6qqsf@email.aon.at
(см: обсуждение, исходный текст)
Ответ на: Re: Sort and index  ("Jim C. Nasby")
Ответы: Re: Sort and index  ("Jim C. Nasby")
Список: pgsql-performance

Скрыть дерево обсуждения

Sort and index  (Andrei Gaspar, )
 Re: Sort and index  ("Dave Held", )
  Re: Sort and index  (Andrei Gaspar, )
  Re: Sort and index  (Michael Fuhr, )
   Re: Sort and index  (Andrei Gaspar, )
  Re: Sort and index  ("Jim C. Nasby", )
   Re: Sort and index  (Tom Lane, )
    Re: Sort and index  ("Jim C. Nasby", )
     Re: Sort and index  ("Jim C. Nasby", )
      Re: Sort and index  (Tom Lane, )
       Re: Sort and index  ("Jim C. Nasby", )
        Re: Sort and index  (Tom Lane, )
         Re: Sort and index  ("Jim C. Nasby", )
          Re: Sort and index  (Manfred Koizar, )
           Re: Sort and index  ("Jim C. Nasby", )
            Re: Sort and index  (Manfred Koizar, )
             Re: Sort and index  ("Jim C. Nasby", )

On Sun, 24 Apr 2005 17:01:46 -0500, "Jim C. Nasby" <>
wrote:
>> >> Feel free to propose better cost equations.

I did.  More than once.

>estimated index scan cost for (project_id, id, date) is
>0.00..100117429.34 while the estimate for work_units is
>0.00..103168408.62; almost no difference,

~3%

> even though project_id correlation is .657

This is divided by the number of index columns, so the index correlation
is estimated to be 0.219.

> while work_units correlation is .116.

So csquared is 0.048 and 0.013, respectively, and you get a result not
far away from the upper bound in both cases.  The cost estimations
differ by only 3.5% of (max_IO_cost - min_IO_cost).

>you'll see that the cost of the index scan is way overestimated. Looking
>at the code, the runcost is calculated as
>
>    run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost);
>
>where csquared is indexCorrelation^2. Why is indexCorrelation squared?
>The comments say a linear interpolation between min_IO and max_IO is
>used, but ISTM that if it was linear then instead of csquared,
>indexCorrelation would just be used.

In my tests I got much more plausible results with

    1 - (1 - abs(correlation))^2

Jim, are you willing to experiment with one or two small patches of
mine?  What version of Postgres are you running?

Servus
 Manfred


В списке pgsql-performance по дате сообщения:

От: "Jim C. Nasby"
Дата:
Сообщение: Re: Partitioning / Clustering
От: PFC
Дата:
Сообщение: Re: Partitioning / Clustering