Re: Dynamlically updating the estimated cost of a transaction

Поиск
Список
Период
Сортировка
От Greg Smith
Тема Re: Dynamlically updating the estimated cost of a transaction
Дата
Msg-id 4B1DE793.6050908@2ndquadrant.com
обсуждение исходный текст
Ответ на Dynamlically updating the estimated cost of a transaction  (Hasini Gunasinghe <hasi7786@gmail.com>)
Список pgsql-performance
Hasini Gunasinghe wrote:
>
> I am looking for a way to let the user know what the estimated time
> for the current transaction he requested and while the transaction is
> in progress, how much time is elapsed for the transaction  as a
> fraction of the total estimated time at a particular instance, by
> dynamically estimating the time for the transaction at that instance.
I think this one needs to get added to the FAQ.  To re-use from when I
answered this last month:  this just isn't exposed in PostgreSQL yet.
Clients ask for queries to be run, eventually they get rows of results
back, but there's no notion of how many they're going to get in advance
or how far along they are in executing the query's execution plan.
There's a couple of academic projects that have started exposing more of
the query internals, but I'm not aware of anyone who's even started
moving in the direction of what you'd need to produce a progress bar or
estimate a total run-time.  It's a hard problem--you could easily spend
several years of your life on this alone and still not have even a
mediocre way to predict how much time is left to execute a generic query.

In practice, people tend to save query log files showing historical
information about how long queries took to run, and then use that to
predict future response times.  That's a much easier way to get
something useful for a lot of applications than expecting you can ever
estimate just based on an EXPLAIN plan.

> I got to know how Prostgre estimates the cost for a particular
> operation. Does estimated cost means the estimated time to evaluate an
> operation in the context of Postgre? And also may I know if there is
> any way to achieve the requirement I mentioned above, with the Postgre
> SQL?
Estimated costs are not ever used to predict an estimated time.  An
interesting research project would be trying to tie the two together
more tightly, by collecting a bunch of data measuring real EXPLAIN
ANALYZE execution times with their respective cost estimates.

Even after you collected it, actually using the data from such research
is quite tricky.  For example, people have tried to tie some of the
individual cost components to the real world--for example, measuring the
true amount of time it takes to do a sequential read vs. a seek and
adjusting random_page_cost accordingly.  But if you then set
random_page_cost to its real-world value based on that estimate, you get
a value well outside what seems to work for people in practice.  This
suggests the underlying cost estimate doesn't reflect the real-world
value it intends to that closely.  But improving on that situation
without going backwards in the quality of the plans the query optimizer
produces is a tricky problem.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


В списке pgsql-performance по дате отправления:

Предыдущее
От: Niu Yan
Дата:
Сообщение: error occured in dbt2 against with postgresql
Следующее
От: Ben Brehmer
Дата:
Сообщение: Re: Load experimentation