Re: ANALYZE sampling is too good

From	Greg Stark
Subject	Re: ANALYZE sampling is too good
Date	6 December 2013, 19:06:31
Msg-id	CAM-w4HPDaioC9epxviuNkD-8ZnYeBSb7Z=uHQUkTohMfdkgFVQ@mail.gmail.com
In reply to	Re: ANALYZE sampling is too good (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: ANALYZE sampling is too good
List	pgsql-hackers


It looks like this is a fairly well understood problem, because in the
real world it's also often cheaper to speak to people within a small
geographic area or time interval. These Wikipedia pages sound
interesting and have some external references:

http://en.wikipedia.org/wiki/Cluster_sampling
http://en.wikipedia.org/wiki/Multistage_sampling

I suspect the hard part will be characterising the nature of the
non-uniformity in the sample generated by taking a whole block. Some
of it may come from how the rows were loaded (e.g. older rows were
loaded by pg_restore but newer rows were inserted retail) or from the
way Postgres works (e.g. hotter rows are on blocks with fewer rows in
them and colder rows are more densely packed).
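
To make the concern concrete, here is a rough simulation sketch. It is not how ANALYZE works: the table layout, the 0.05/0.95 clustering of values per block, and all function names are assumptions made up for illustration. It compares a uniform row-level sample with a sample that takes every row from a few randomly chosen blocks, and measures how much the estimated fraction of 1s varies across repeated samples.

```python
import random
import statistics


def make_table(n_blocks=400, rows_per_block=25, seed=0):
    """Build a hypothetical table as a list of blocks of 0/1 values.

    Assumption for illustration: values are strongly clustered, so each
    block is either mostly 0s or mostly 1s.
    """
    rng = random.Random(seed)
    table = []
    for _ in range(n_blocks):
        p = rng.choice([0.05, 0.95])  # per-block probability of a 1
        table.append([1 if rng.random() < p else 0
                      for _ in range(rows_per_block)])
    return table


def estimate_from_rows(table, n_rows, rng):
    """Estimate the fraction of 1s from a uniform sample of individual rows."""
    all_rows = [v for block in table for v in block]
    return statistics.fmean(rng.sample(all_rows, n_rows))


def estimate_from_blocks(table, n_blocks, rng):
    """Estimate the fraction of 1s by reading every row of n_blocks blocks."""
    picked = rng.sample(table, n_blocks)
    return statistics.fmean(v for block in picked for v in block)


def spread(estimator, table, size, trials=200, seed=1):
    """Standard deviation of the estimate over repeated samples."""
    rng = random.Random(seed)
    estimates = [estimator(table, size, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    table = make_table()
    # Both strategies look at 500 rows: 500 single rows vs. 20 whole blocks.
    print("row-level spread:  ", spread(estimate_from_rows, table, 500))
    print("whole-block spread:", spread(estimate_from_blocks, table, 20))
```

Both strategies read the same number of rows, but under this assumed clustering the whole-block estimate should vary noticeably more from sample to sample. The cluster sampling literature describes this with a design effect of roughly 1 + (m - 1) * rho for blocks of m rows and intra-block correlation rho. The sketch deliberately ignores the other sources of non-uniformity mentioned above, such as blocks holding different numbers of rows.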

I've felt for a long time that Postgres would make an excellent test
bed for some aspiring statistics research group.


-- 
greg
