Re: Maximum statistics target

Поиск
Список
Период
Сортировка
От Peter Eisentraut
Тема Re: Maximum statistics target
Дата
Msg-id 200803101926.43786.peter_e@gmx.net
обсуждение исходный текст
Ответ на Re: Maximum statistics target  (Gregory Stark <stark@enterprisedb.com>)
Ответы Re: Maximum statistics target  (Decibel! <decibel@decibel.org>)
Список pgsql-hackers
Am Montag, 10. März 2008 schrieb Gregory Stark:
> > It's not possible to believe that you'd not notice O(N^2) behavior for N
> > approaching 800000 ;-).  Perhaps your join columns were unique keys, and
> > thus didn't have any most-common-values?
>
> We could remove the hard limit on statistics target and impose the limit
> instead on the actual size of the arrays. Ie, allow people to specify
> larger sample sizes and discard unreasonably large excess data (possibly
> warning them when that happens).

I have run some more useful tests now with more distinct values.  The planning 
times do increase, but this is not the primary worry.  If you want to spend 
20 seconds of planning to speed up your query by 40 seconds, this could 
surely be a win in some scenarios, and not a catastrophic loss if not.  The 
practical problems lie with memory usage in ANALYZE, in two ways.  First, at 
some point it will try to construct pg_statistic rows that don't fit into the 
1GB limit, as mentioned upthread.  You get a funny error message and it 
aborts.  This is fixable with some cosmetics.  Second, ANALYZE appears to 
temporarily leak memory (it probably doesn't bother to free things along the 
way, as most of the code does), and so some not so large statistics targets 
(say, 40000) can get your system swapping like crazy.  A crafty user could 
probably kill the system that way, perhaps even with the restricted settings 
we have now.  I haven't inspected the code in detail yet, but I imagine a few 
pfree() calls and/or a counter that checks the current memory usage against 
maintenance_work_mem could provide additional safety.  If we could get 
ANALYZE under control, then I imagine this would provide a more natural upper 
bound for the statistics targets, and it would be controllable by the 
administrator.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Terminating a backend
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Detecting large-file support in configure