Re: cyclical redundancy checksum algorithm(s)?

From Teodor Sigaev
Subject Re: cyclical redundancy checksum algorithm(s)?
Date
Msg-id 451B7F8E.30303@sigaev.ru
In reply to Re: cyclical redundancy checksum algorithm(s)?  ("Karen Hill" <karen_hill22@yahoo.com>)
List pgsql-general
>> You sure that's actually what he said?  A change in CRC proves the data
>> changed, but lack of a change does not prove it didn't.
>
> "To quickly determine if rows have changed, we rely on a cyclic
> redundancy checksum (CRC) algorithm.   If the CRC is identical for the
>
>> "summary" functions, such as an MD5 hash.  I wouldn't trust it at all
>> with a 32-bit CRC, and not much with a 64-bit CRC.  Too much risk of
>> collision.

A small example of CRC32 collisions:
0x38ee5531
         Hundemarke
         92294
0x59471e4f
         raciner
         tranchefiler
0x947bb6c0
         Betriebsteile
         4245
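
A check like the one above can be reproduced with a short script. This is only a minimal sketch: it assumes the CRC-32 variant in question is the common one implemented by zlib and that the words are hashed as UTF-8 bytes, neither of which is confirmed in this thread. The names crc32_of and find_collisions are made up for illustration.

```python
import zlib
from collections import defaultdict


def crc32_of(word: str, encoding: str = "utf-8") -> int:
    """CRC-32 of a word as an unsigned 32-bit integer (zlib variant, an assumption)."""
    return zlib.crc32(word.encode(encoding)) & 0xFFFFFFFF


def find_collisions(words, hash_fn=crc32_of):
    """Group distinct words by hash value and keep only groups with more than one word."""
    buckets = defaultdict(set)
    for word in words:
        buckets[hash_fn(word)].add(word)
    return {h: sorted(ws) for h, ws in buckets.items() if len(ws) > 1}


if __name__ == "__main__":
    # Word pairs taken from the list above; whether they collide depends on
    # the CRC variant and the byte encoding actually used.
    sample = ["Hundemarke", "92294", "raciner", "tranchefiler",
              "Betriebsteile", "4245"]
    for h, ws in find_collisions(sample).items():
        print(f"0x{h:08x}: {', '.join(ws)}")
```

Passing a different hash_fn makes it possible to compare several hash functions on the same word list.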


I did a lot of work when choosing a hash function for tsearch2. I also found
that popular hash algorithms produce more collisions for non-ASCII
languages... CRC32 is "smoother".
On a dictionary of 332296 unique words, CRC32 produces 11 collisions, Perl's hash
function 35, and pgsql's hash_any 12.
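
For comparison, the birthday approximation gives the number of colliding pairs one would expect from an ideal, uniformly distributed 32-bit hash: n(n-1)/2 divided by 2^32. A minimal sketch follows; expected_colliding_pairs is a hypothetical helper name, and it is an assumption that the counts above are colliding pairs rather than colliding words.

```python
def expected_colliding_pairs(n: int, bits: int = 32) -> float:
    """Expected number of colliding pairs among n distinct keys
    under an ideal uniform hash with 2**bits possible values."""
    return n * (n - 1) / 2 / 2**bits


if __name__ == "__main__":
    # Dictionary size quoted above.
    print(f"{expected_colliding_pairs(332296):.2f}")
```

For 332296 words this comes to roughly 12.9, so the 11 and 12 collisions reported for CRC32 and hash_any are close to what an ideal 32-bit hash would give, while 35 is well above it.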

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
