Re: Hints (Was: Index Tuning Features)

Поиск
Список
Период
Сортировка
От Casey Duncan
Тема Re: Hints (Was: Index Tuning Features)
Дата
Msg-id 6A42E0E6-163A-4A7B-B3A0-6D0569E51003@pandora.com
обсуждение исходный текст
Ответ на Hints (Was: Index Tuning Features)  (Andrew Sullivan <ajs@crankycanuck.ca>)
Ответы Re: Hints (Was: Index Tuning Features)
Список pgsql-hackers
On Oct 12, 2006, at 4:26 AM, Andrew Sullivan wrote:

> On Thu, Oct 12, 2006 at 08:34:45AM +0200, Florian Weimer wrote:
>>
>> Some statistics are very hard to gather from a sample, e.g. the  
>> number
>> of distinct values in a column.
>
> Then how can the DBA know it, either?  The problem with this sort of
> argument is always that people are claiming some special knowledge is
> available to the DBA.  If it's true that the DBA really _can_ know
> this stuff, then there must be some way to learn it.  Which means
> that you can, in principle, figure out ways to communicate that to
> the optimizer.

Yes, but it may be much more efficient for the human to tell the  
computer than for the computer to introspect things. Take, for  
example, ndisinct as data grows large. I, the database designer, may  
know (or simply see) that a certain foreign key column will have  
roughly a certain cardinality regardless of how big the table gets.  
It's a lot more efficient for me to tell the system that up front  
then have it need to do a full table scan or tens of millions of rows  
periodically to figure it out, or worse--as it is currently--to come  
up with an estimate that is multiple orders of magnitude off, even  
with the stats target turned all the way up.

I realize that this is a case that is possible to do manually now,  
sort of. I can tweak the stats table myself. But it would be nice if  
you could do it in such a way that it would override what analyze  
comes up with on a case-by-case basis.

We could have a perfect query planner, but feed it bad stats and it  
will still make poor decisions.

I'm of the strong opinion that hinting the data is much better than  
hinting the queries. There tends to be many fewer places you need to  
do that, and new queries can automatically take advantage.

> I like the suggestion, though, that there be ways to codify known
> relationships in the system in such a way that the optimizer can
> learn to use that information.  _That_ seems to me to be a big
> improvement, because it can be taken into consideration along with
> relationships that emerge from the statistics, that the DBA may not
> know about.

I'm all for things the computer can do for me automagically. It's  
just good to have the ability to tell the computer about things you  
know about the data that it either can't efficiently figure out or  
can't figure out at all.

-Casey



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: ./configure argument checking
Следующее
От: "Andrew Dunstan"
Дата:
Сообщение: Re: ./configure argument checking