Re: Pointers needed on optimizing slow SQL statements

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Pointers needed on optimizing slow SQL statements
Дата
Msg-id 603c8f070906071328h121937a2sdc9ee3f22620015@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Pointers needed on optimizing slow SQL statements  (Simon Riggs <simon@2ndQuadrant.com>)
Ответы Re: Pointers needed on optimizing slow SQL statements
Список pgsql-performance
On Sat, Jun 6, 2009 at 4:50 AM, Simon Riggs<simon@2ndquadrant.com> wrote:
> The Function Index solution works, but it would be much better if we
> could get the planner to remember certain selectivities.

I agree.

> I'm thinking a command like
>
>        ANALYZE foo [WHERE .... ]
>
> which would specifically analyze the selectivity of the given WHERE
> clause for use in queries.

I think that's probably not the best syntax, because we don't want to
just do it once; we want to make it a persistent property of the table
so that every future ANALYZE run picks it up.  Maybe something like:

ALTER TABLE <table> ADD ANALYZE <name> (<clause>)
ALTER TABLE <table> DROP ANALYZE <name>

(I'm not in love with this so feel free to suggest improvements.)

One possible problem with this kind of thing is that it could be
inconvenient if the number of clauses that you need to analyze is
large.  For example, suppose you have a table called "object" with a
column called "type_id".  It's not unlikely that the histograms and
MCVs for many of the columns in that table will be totally different
depending on the value of type_id.  There might be enough different
WHERE clauses that capturing their selectivity individually wouldn't
be practical, or at least not convenient.

One possible alternative would be to change the meaning of the
<clause>, so that instead of just asking the planner to gather
selectivity on that one clause, it asks the planner to gather a whole
separate set of statistics for the case where that clause holds.  Then
when we plan a query, we set the theorem-prover to work on the clauses
(a la constraint exclusion) and see if any of them are implied by the
query.  If so, we can use that set of statistics in lieu of the global
table statistics.  There is the small matter of figuring out what to
do if we added multiple clauses and more than one is provable, but
<insert hand-waving here>.

It would also be good to do this automatically whenever a partial
index is present.

...Robert

В списке pgsql-performance по дате отправления:

Предыдущее
От: Dimitri Fontaine
Дата:
Сообщение: Re: Pointers needed on optimizing slow SQL statements
Следующее
От: Laszlo Nagy
Дата:
Сообщение: Re: Why is my stats collector so busy?