Re: proposal : cross-column stats

From tv@fuzzy.cz
Subject Re: proposal : cross-column stats
Date
Msg-id 3e869cc4f3a74f9cddff605f57f32605.squirrel@sq.gransy.com
In response to Re: proposal : cross-column stats  (Nicolas Barbier <nicolas.barbier@gmail.com>)
Responses Re: proposal : cross-column stats  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
> 2010/12/24 Florian Pflug <fgp@phlo.org>:
>
>> On Dec23, 2010, at 20:39 , Tomas Vondra wrote:
>>
>>>   I guess we could use the highest possible value (equal to the number
>>>   of tuples) - according to wiki you need about 10 bits per element
>>>   with 1% error, i.e. about 10MB of memory for each million of
>>>   elements.
>>
>> Drat. I had expected these numbers to come out quite a bit lower than
>> that, at least for a higher error target. But even with 10% false
>> positive rate, it's still 4.5MB per 1e6 elements. Still too much to
>> assume the filter will always fit into memory, I fear :-(
>
> I have the impression that both of you are forgetting that there are 8
> bits in a byte. 10 bits per element = 1.25MB per million elements.

We are aware of that, but we only needed very rough estimates, and the
calculations are much easier with 10. Actually, according to Wikipedia it's
not 10 bits per element but about 9.6. But it really does not matter whether
it's 10MB or 20MB of data, it's still a lot of data ...
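
For anyone who wants to redo the arithmetic, here is a rough sketch using the
standard Bloom filter sizing formula, bits per element = -ln(p) / (ln 2)^2,
which assumes the optimal number of hash functions. The function names are
made up for this illustration and are not taken from any PostgreSQL code.

```python
import math


def bloom_bits_per_element(fp_rate):
    """Bits per element for a Bloom filter with false-positive rate fp_rate,
    assuming the optimal number of hash functions."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def bloom_size_bytes(n_elements, fp_rate):
    """Total filter size in bytes (8 bits per byte), rounded up."""
    return math.ceil(n_elements * bloom_bits_per_element(fp_rate) / 8)


if __name__ == "__main__":
    # Compare the two error targets mentioned in the thread.
    for p in (0.01, 0.10):
        bits = bloom_bits_per_element(p)
        size_mb = bloom_size_bytes(1_000_000, p) / 1e6
        print(f"p={p:.2f}: {bits:.2f} bits/element, "
              f"{size_mb:.2f} MB per 1e6 elements")
```

With these assumptions the 1% target comes out at roughly 9.6 bits per
element, i.e. a bit over 1MB per million elements once the bits are divided
by 8, and the 10% target at roughly half of that.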

Tomas


