Re: Storing images in PostgreSQL databases (again)

From Robert L Mathews
Subject Re: Storing images in PostgreSQL databases (again)
Date
Msg-id 452ED18B.7040807@tigertech.com
In reply to Storing images in PostgreSQL databases (again)  (TIJod <tijod@yahoo.fr>)
Responses Re: Storing images in PostgreSQL databases (again)  (Alexander Staubo <alex@purefiction.net>)
List pgsql-general
Michelle Konzack <linux4michelle@freenet.de> wrote:

> I do this already but have problems since I have
> stored arround 130 million files on a server...
>
> ...
>
> MD5 hashes are 32 Bytes long, maybe they change
> it to 64 Bytes?
>
> I have already over 2000 collisions and checked
> it, that the files are NOT the same.

You mean you have 2000 collisions out of the checksums of 130 million
different files? That can't be right.

An MD5 hash is 128 bits (16 bytes, usually printed as 32 hexadecimal
characters), and using the values found in
<http://en.wikipedia.org/wiki/Birthday_attack>, you don't reach a 50%
chance of a single collision until you've checksummed about 2.2 x 10^19
different inputs. That's 22,000,000,000,000,000,000, which is vastly
larger than 130,000,000.

In other words, you should not expect even a single collision until you
have roughly 169,230,769,231 times as many files as you currently have,
which should not be an issue before the end of the useful life of the
solar system.
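
For anyone who wants to check the arithmetic, here is a minimal sketch
using the standard birthday-bound approximation. The function and
variable names are made up for illustration. Computed without rounding,
the 50% threshold comes out near 2.17 x 10^19, so the ratio is about
1.67 x 10^11, in line with the figure above, which uses the rounded
2.2 x 10^19.

```python
import math

HASH_BITS = 128          # size of an MD5 digest in bits
N_FILES = 130_000_000    # number of files mentioned in the thread


def inputs_for_collision_probability(p, bits):
    """Approximate number of random inputs needed to reach collision
    probability p, using n ~ sqrt(2 * 2^bits * ln(1 / (1 - p)))."""
    return math.sqrt(2.0 * (2.0 ** bits) * math.log(1.0 / (1.0 - p)))


def collision_probability(n, bits):
    """Approximate probability of at least one collision among n random
    inputs, using p ~ 1 - exp(-n * (n - 1) / (2 * 2^bits))."""
    exponent = -n * (n - 1) / (2.0 * 2.0 ** bits)
    # expm1 keeps precision when the exponent is extremely close to zero
    return -math.expm1(exponent)


threshold = inputs_for_collision_probability(0.5, HASH_BITS)
ratio = threshold / N_FILES
p_now = collision_probability(N_FILES, HASH_BITS)

print(f"inputs for a 50% collision chance: {threshold:.3e}")
print(f"multiple of the current file count: {ratio:.3e}")
print(f"collision chance with {N_FILES:,} files: {p_now:.3e}")
```

The last figure is the one that matters here: with 130 million distinct
files, the chance of even one accidental collision is far below anything
you would see in practice.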

If you have 2000 collisions after 130 million different files (or even
if you have two collisions), something is almost certainly wrong with
your code, unfortunately.
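
One way to rule the code in or out is to hash the raw file contents and
then compare any suspected pair byte for byte. The sketch below is only
an illustration: the helper names are hypothetical, and the possible
causes it guards against (reading in text mode, hashing something other
than the contents, storing a shortened digest) are guesses on my part,
not anything reported in this thread.

```python
import filecmp
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 digest of a file's contents as 32 hex characters.

    The file is opened in binary mode and read in chunks, so the digest
    covers the exact bytes on disk, whatever the file size."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_true_collision(path_a, path_b):
    """True only if the two files have the same MD5 digest but
    different contents."""
    same_hash = md5_of_file(path_a) == md5_of_file(path_b)
    # shallow=False forces a byte-by-byte comparison of the contents
    same_bytes = filecmp.cmp(path_a, path_b, shallow=False)
    return same_hash and not same_bytes
```

If `is_true_collision` never returns True for the pairs your current
code flags, the collisions are coming from the code rather than from MD5.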

--
Robert L Mathews

  "The trouble with doing something right the first time is
   that nobody appreciates how difficult it was."
