Re: ANALYZE sampling is too good
From | Fabien COELHO |
---|---|
Subject | Re: ANALYZE sampling is too good |
Date | |
Msg-id | alpine.DEB.2.10.1312070807350.6697@sto |
In reply to | Re: ANALYZE sampling is too good (Greg Stark <stark@mit.edu>) |
List | pgsql-hackers |
> http://en.wikipedia.org/wiki/Cluster_sampling
> http://en.wikipedia.org/wiki/Multistage_sampling
>
> I suspect the hard part will be characterising the nature of the
> non-uniformity in the sample generated by taking a whole block. Some
> of it may come from how the rows were loaded (e.g. older rows were
> loaded by pg_restore but newer rows were inserted retail) or from the
> way Postgres works (e.g. hotter rows are on blocks with fewer rows in
> them and colder rows are more densely packed).

I would have thought that as VACUUM reclaims space it levels that issue in the long run and on average, so that it could be simply ignored?

> I've felt for a long time that Postgres would make an excellent test
> bed for some aspiring statistics research group.

I would say "applied statistics" rather than "research". Nevertheless I can ask my research statistician colleagues next door about their opinion on this sampling question.

--
Fabien.