Re: Enabling Checksums
От | Ants Aasma |
---|---|
Тема | Re: Enabling Checksums |
Дата | |
Msg-id | CA+CSw_vdR=h5Au1imBan1KREoyE6=jHva6fDwgG2rc3GW6Dfeg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Enabling Checksums (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: Enabling Checksums
(Ants Aasma <ants@cybertec.at>)
|
Список | pgsql-hackers |
On Thu, Apr 18, 2013 at 9:09 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2013-04-18 00:44:02 +0300, Ants Aasma wrote: >> I went ahead and coded up both the parallel FNV-1a and parallel FNV-1a >> + srl1-xor variants and ran performance tests and detection rate tests >> on both. >> >> Performance results: >> Mul-add checksums: 12.9 bytes/s >> FNV-1a checksums: 13.5 bytes/s >> FNV-1a + srl-1: 7.4 bytes/s >> >> Detection rates: >> False positive rates: >> Add-mul FNV-1a FNV-1a + srl-1 >> Single bit flip: 1:inf 1:129590 1:64795 >> Double bit flip: 1:148 1:511 1:53083 >> Triple bit flip: 1:673 1:5060 1:61511 >> Quad bit flip: 1:1872 1:19349 1:68320 >> Write 0x00 byte: 1:774538137 1:118776 1:68952 >> Write 0xFF byte: 1:165399500 1:137489 1:68958 >> Partial write: 1:59949 1:71939 1:89923 >> Write garbage: 1:64866 1:64980 1:67732 >> Write run of 00: 1:57077 1:61140 1:59723 >> Write run of FF: 1:63085 1:59609 1:62977 >> >> Test descriptions: >> N bit flip: picks N random non-overlapping bits and flips their value. >> Write X byte: overwrites a single byte with X. >> Partial write: picks a random cut point, overwrites everything from >> there to end with 0x00. >> Write garbage/run of X: picks two random cut points and fills >> everything in between with random values/X bytes. > > I don't think this table is complete without competing numbers for > truncated crc-32. Any chance to get that? I didn't have time to run the full test set, the CRC32 is so slow that the test would take 7 hours so I ran it on 10% of the dataset. The number shouldn't be off by much as that still gives about 3.6M probes for each test. CRC32C slice-by-8: 0.57 bytes/cycle Single bit flip: 1:inf Double bit flip: 1:33105 Triple bit flip: 1:inf Quad bit flip: 1:31665 Write 0x00 byte: 1:181934 Write 0xFF byte: 1:230424 Partial write: 1:324 Write garbage: 1:75059 Write run of 0: 1:57951 Write run of FF: 1:65677 The behavior for bit flips is about what is expected. A bias towards detecting odd number of bit flips is probably behind the better than uniform detection rate of byte overwrites. The partial write is very odd and might be some kind of bug, although I'm not sure yet what. Will investigate. I also did avalanche diagrams for the two FNV algorithms discussed. Basically the methodology is that I generated pages with random data, took their checksum and then tried flipping each bit on the page, counting for each checksum bit how many times it was affected by the input bit change. Ideally each input bit affects each output bit with 50% probability. The attached images are created for 1M random pages (1 petabyte of data checksummed for anyone counting). Each 32x16 block corresponds to how each 32bit word affects the 16 bits of the checksum. Black is ideal 50% flip rate, blue is 5% bias (+-2.5%), green is 33%, yellow is 75% and red is 100% bias (output is never flipped or always flipped). High bias reduces error detection rate for bit errors in the given bits. This confirms the analytical result that high bits in plain FNV are not well dispersed. The dispersal pattern of FNV-1a ^ srl-3 however looks great. Only the last 128 bytes are not well mixed. I'd say that if we introduce one more round of mixing the result would be about as good as we can hope for. I'll generate an avalanche diagram for CRC32C too, but it will take a while even if I use a smaller dataset. Regards, Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
Вложения
В списке pgsql-hackers по дате отправления: