On Sat, Sep 13, 2014 at 12:55:33PM -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-09-13 08:52:33 +0300, Ants Aasma wrote:
> >> On Sat, Sep 13, 2014 at 6:59 AM, Arthur Silva <arthurprs@gmail.com> wrote:
> >>> That's not entirely true. CRC-32C beats pretty much everything with the same
> >>> length quality-wise and has both hardware implementations and highly
> >>> optimized software versions.
>
> >> For better or for worse, CRC is biased toward detecting all single-bit
> >> errors; its detection capability for larger errors is slightly
> >> diminished. The quality of the other algorithms I mentioned is also
> >> very good, while producing uniformly varying output.
>
> > There's also much more literature about the various CRCs in comparison
> > to some of these hash algorithms.
>
> Indeed. CRCs have well-understood properties for error detection.
> Have any of these new algorithms been analyzed even a hundredth as
> thoroughly? No. I'm unimpressed by evidence-free claims that
> something else is "also very good".
>
> Now, CRCs are designed for detecting the sorts of short burst errors
> that are (or were, back in the day) common on phone lines. You could
> certainly make an argument that that's not the type of threat we face
> for PG data. However, I've not seen anyone actually make such an
> argument, let alone demonstrate that some other algorithm would be better.
> To start with, you'd need to explain precisely what other error pattern
> is more important to defend against, and why.
>
> regards, tom lane
>
Here is a blog post on the development of xxhash:
http://fastcompression.blogspot.com/2012/04/selecting-checksum-algorithm.html
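
As a point of reference for the single-bit property mentioned upthread,
here is a minimal sketch of a bitwise CRC-32C (Castagnoli polynomial,
reflected form 0x82F63B78) together with a brute-force check that no
single-bit flip in a small buffer goes undetected. The function names and
the 64-byte test buffer are my own assumptions for illustration; this is
not PostgreSQL's implementation, and it says nothing about how CRC-32C
compares with xxhash on larger error patterns.

```python
def crc32c(data: bytes) -> int:
    """Bit-at-a-time CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF  # standard initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF  # standard final XOR


def undetected_single_bit_flips(page: bytes) -> int:
    """Count single-bit corruptions of `page` that leave the CRC unchanged."""
    good = crc32c(page)
    missed = 0
    for bit in range(len(page) * 8):
        corrupted = bytearray(page)
        corrupted[bit // 8] ^= 1 << (bit % 8)  # flip exactly one bit
        if crc32c(bytes(corrupted)) == good:
            missed += 1
    return missed


if __name__ == "__main__":
    sample = bytes(range(64))  # hypothetical stand-in for page data
    print(hex(crc32c(b"123456789")))
    print(undetected_single_bit_flips(sample))
```

The same harness could be pointed at any other checksum function to
compare detection rates for whatever error pattern one thinks matters,
which is the kind of evidence Tom is asking for.
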
Regards,
Ken