Re: generalizing the planner knobs

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: generalizing the planner knobs
Дата
Msg-id 87lkz0c9vg.fsf@stark.xeocode.com
обсуждение исходный текст
Ответ на Re: generalizing the planner knobs  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: generalizing the planner knobs  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Is it worth pointing out that using the same plan all the time is *no*
> recipe for guaranteeing response time?  There is no such thing as a plan
> that is good for every case --- outlying data values can make a
> usually-good plan blow out your performance guarantee anyway.  Disabling
> the planner is just a recipe for ensuring that that will happen, IMHO.

But outlying data is something the user has control over. The user when
approving plans needs to be aware not just that the plan is experimentally
good, but that it will perform reliably within the constraints based on his
knowledge of the application and the data.

My point is that I don't need a plan that is "good" for every case. I need a
plan I can trust to perform as expected. If my boss asks me what impact
doubling the number of users will have I need to be able to answer "it'll be
at worst twice as slow" (knowing that my queries and the plans I've seen are
all O(users)).

If twice as slow is still tolerable then that's fine, even if a faster plan
was possible. What I don't want to say is "well we'll have to try it and see"
which is all I can say if there's a risk the plans will change.

Actually I would expect the facility to only be useful if there was still a
way to update the plans. The DBA would periodically re-analyze the queries in
the system and check any changed plans to ensure they were still reasonable.
Effectively the same as my script except at the query level rather than at the
database statistics level.

The scenario where this is useful is not in a development environment where
things are changing dynamically. But rather in a mature application where the
data distribution is well established. Outlying data almost certainly
represents an application bug and should be signalled, not allowed to
spontaneously bring down the rest of the system.

As anecdotal evidence, in the last job where I worked, once we had 10 million
users and over a hundred web requests per second it would have been pretty
hard to believe any reasonable query could involve a sequential scan.
Certainly no query that the web application should be performing itself
without human intervention. If for whatever reason there was some outlying
data point where that would have been the "right" plan it would have
immediately brought down the web site.

-- 
greg




В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Andrew Dunstan"
Дата:
Сообщение: Re: [PATCHES] snprintf() argument reordering not working
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: generalizing the planner knobs