AW: Call for alpha testing: planner statistics revision s

Поиск
Список
Период
Сортировка
От Zeugswetter Andreas SB
Тема AW: Call for alpha testing: planner statistics revision s
Дата
Msg-id 11C1E6749A55D411A9670001FA68796336832B@sdexcsrv1.f000.d0188.sd.spardat.at
обсуждение исходный текст
Ответы Re: AW: Call for alpha testing: planner statistics revision s  (Alex Pilosov <alex@pilosoft.com>)
Re: AW: Call for alpha testing: planner statistics revision s  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
First of all thanks for the great effort, it will surely be appreciated :-)

> * On large tables, ANALYZE uses a random sample of rows rather than
> examining every row, so that it should take a reasonably short time
> even on very large tables.  Possible downside: inaccurate stats.
> We need to find out if the sample size is large enough.

Imho that is not optimal :-) ** ducks head, to evade flying hammer **
1. the random sample approach should be explicitly requested with some 
syntax extension
2. the sample size should also be tuneable with some analyze syntax 
extension (the dba chooses the tradeoff between accuracy and runtime)
3. if at all, an automatic analyze should do the samples on small tables,
and accurate stats on large tables

The reasoning behind this is, that when the optimizer does a "mistake"
on small tables the runtime penalty is small, and probably even beats
the cost of accurate statistics lookup. (3 page table --> no stats 
except table size needed)

When on the other hand the optimizer does a "mistake" on a huge table
the difference is easily a matter of hours, thus you want accurate stats.

Because we do not want the dba to decide which statistics are optimal,
there should probably be an analyze helper application that is invoked
with "vacuum analyze database optimal" or some such, that also decides 
whether a table was sufficiently altered to justify new stats gathering
or vacuum. The decision, what to do may also be based on a runtime limit, 
that the dba specifies ("do the most important stats/vacuums you can do 
within ~3 hours"). 

These points are also based on experience with huge SAP/R3 installations
and the way statistics are gathered there.

Andreas


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Zeugswetter Andreas SB
Дата:
Сообщение: AW: RE: Row Versioning, for jdbc updateable result sets
Следующее
От: Zeugswetter Andreas SB
Дата:
Сообщение: AW: timestamp with/without time zone