Re: What exactly is our CRC algorithm?

Поиск
Список
Период
Сортировка
От Gavin Flower
Тема Re: What exactly is our CRC algorithm?
Дата
Msg-id 5435B97F.3010700@archidevsys.co.nz
обсуждение исходный текст
Ответ на Re: What exactly is our CRC algorithm?  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: What exactly is our CRC algorithm?
Список pgsql-hackers
On 09/10/14 10:13, Andres Freund wrote:
> On 2014-10-08 22:13:46 +0300, Heikki Linnakangas wrote:
>> Hmm. So the simple, non-table driven, calculation gives the same result as
>> using the lookup table using the reflected lookup code. That's expected; the
>> lookup method is supposed to be the same, just faster. However, using the
>> "normal" lookup code, but with a "reflected" lookup table, produces the same
>> result as Postgres' algorithm. Indeed, that's what we do in PostgreSQL. But
>> AFAICS, that's an incorrect combination. You're supposed to the
>> non-reflected lookup table with the non-reflected lookup code; you can't mix
>> and match.
>>
>> As far as I can tell, PostgreSQL's so-called CRC algorithm doesn't
>> correspond to any bit-by-bit CRC variant and polynomial. My math skills are
>> not strong enough to determine what the consequences of that are. It might
>> still be a decent checksum. Or not. I couldn't tell if the good error
>> detection properties of the normal CRC-32 polynomial apply to our algorithm
>> or not.
> Additional interesting datapoints are that hstore and ltree contain the
> same tables - but properly use the reflected computation.
>
>> Thoughts?
> It clearly seems like a bad idea to continue with this - I don't think
> anybody here knows which guarantees this gives us.
>
> The question is how can we move away from this. There's unfortunately
> two places that embed PGC32 that are likely to prove problematic when
> fixing the algorithm: pg_trgm and tsgist both seem to include crc's in
> their logic in a persistent way.  I think we should provide
> INIT/COMP/FIN_PG32 using the current algorithm for these.
>
> If we're switching to a saner computation, we should imo also switch to
> a better polynom - CRC-32C has better error detection capabilities than
> CRC32 and is available in hardware. As we're paying the price pf
> breaking compat anyway...
>
> Arguably we could also say that given that there's been little evident
> problems with the borked computation we could also switch to a much
> faster hash instead of continuing to use crc...
>
> Greetings,
>
> Andres Freund
>
Could a 64 bit variant of some kind be useful as an option - in addition 
to a correct 32 bit?

As most people have 64 bit processors and storage is less constrained 
now-a-days, as well as we tend to store much larger chunks of data.


Cheers,
Gavin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: Promise index tuples for UPSERT
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}