Re: Overhauling GUCS
| From | Gregory Stark |
|---|---|
| Subject | Re: Overhauling GUCS |
| Date | |
| Msg-id | 87ve0ius9y.fsf@oxford.xeocode.com |
| In response to | Re: Overhauling GUCS ("Hakan Kocaman" <hkocam@googlemail.com>) |
| List | pgsql-hackers |
"Hakan Kocaman" <hkocam@googlemail.com> writes:

> On 6/9/08, Gregory Stark <stark@enterprisedb.com> wrote:
>>
>> n_distinct. For that Josh is right, we *would* need a sample size
>> proportional to the whole data set which would practically require us to
>> scan the whole table (and have a technique for summarizing the results in a
>> nearly constant sized data structure).
>
> is this (summarizing results in a constant sized data structure) something
> which could be achieved by Bloom-Filters ?

Uhm, it would be a bit of a strange application of them, but actually it seems to me that would be a possible approach. It would need a formula for estimating the number of distinct values given the number of bits set in the Bloom filter. That should be a tractable combinatorics problem (in fact it's pretty similar to the combinatorics I posted a while back about getting all the drives in a RAID array busy).

And if you have a dynamic structure where the filter size grows, then it would overestimate, because extra copied bits would be set.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
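For illustration, here is one way the "distinct values from bits set" estimate could be sketched. This is not from the message above and not anything in PostgreSQL; it assumes a fixed-size filter with m bits and k hash functions that behave roughly uniformly. Under that assumption a given bit stays clear after n distinct insertions with probability about exp(-k*n/m), so with X bits set the estimate is n ≈ -(m/k) * ln(1 - X/m). The class name `BloomFilter`, its methods, and the SHA-256 double-hashing scheme are all hypothetical choices made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter used only to estimate n_distinct (illustrative)."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        # One byte per bit keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(m_bits)

    def _positions(self, value):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        # This hashing scheme is an assumption of the sketch, not a requirement.
        digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so strides vary
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def bits_set(self):
        return sum(self.bits)

    def estimate_distinct(self):
        # Invert E[X] = m * (1 - exp(-k*n/m)) to solve for n.
        x = self.bits_set()
        if x >= self.m:
            return float("inf")  # saturated filter carries no information
        return -(self.m / self.k) * math.log(1.0 - x / self.m)


if __name__ == "__main__":
    bf = BloomFilter(m_bits=1 << 17, k_hashes=4)
    # 10,000 distinct values, each seen three times, as in a full table scan.
    for _ in range(3):
        for v in range(10_000):
            bf.add(v)
    print(bf.bits_set(), round(bf.estimate_distinct()))
```

Duplicates set no new bits, so the estimate depends only on the distinct values seen. The estimate degrades as the filter fills up, so m would have to be sized for the largest n_distinct one expects. This sketch does not model a filter that grows dynamically, which is the overestimation case mentioned above.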