Substituting Checksum Algorithm (was: Enabling Checksums)
От | Jeff Davis |
---|---|
Тема | Substituting Checksum Algorithm (was: Enabling Checksums) |
Дата | |
Msg-id | 1366701448.12032.90.camel@jdavis обсуждение исходный текст |
Ответы |
Re: Substituting Checksum Algorithm (was: Enabling Checksums)
Re: Substituting Checksum Algorithm (was: Enabling Checksums) Re: Substituting Checksum Algorithm (was: Enabling Checksums) |
Список | pgsql-hackers |
Starting a new thread to more narrowly address the topic. Attached is my reorganization of Ants's patch here: http://www.postgresql.org/message-id/CA +CSw_vinyf-w45i=M1m__MpJZY=e8S4Nt_KNnpEbtWjTOaSUA@mail.gmail.com My changes: * wrest the core FNV algorithm from the specific concerns of a data page - PageCalcChecksum16 mixes the block number, reduces to 16 bits, and avoids the pd_checksum field - the checksum algorithm is just a pure block checksum with a 32-bit result * moved the FNV checksum into a separate file, checksum.c * added Ants's suggested compilation flags for better optimization * slight update to the paragraph in the README that discusses concerns specific to a data page I do have a couple questions/concerns about Ants's patch though: * The README mentions a slight bias; does that come from the mod (2^16-1)? That's how I updated the README, so I wanted to make sure. * How was the FNV_PRIME chosen? * I can't match the individual algorithm step as described in the README to the actual code. And the comments in the README don't make it clear enough which one is right (or maybe they both are, and I'm just tired). The README says: hash = (hash ^ value) * ((hash ^ value) >> 3) (which is obviously missing the FNV_PRIME factor) and the code says: +#define CHECKSUM_COMP(checksum, value) do {\ + uint32 __tmp = (checksum) ^ (value);\ + (checksum) = __tmp * FNV_PRIME ^ (__tmp >> 3);\ +} while (0) I'm somewhat on the fence about the "shift right". It was discussed in this part of the thread: http://www.postgresql.org/message-id/99343716-5F5A-45C8-B2F6-74B9BA357396@phlo.org I think we should be able to state with a little more clarity in the README why there is a problem with plain FNV-1a, and why this modification is both effective and safe. I'd lean toward simplicity and closer adherence to the published version of the algorithm rather than detecting a few more obscure error patterns. It looks like the modification slows down the algorithm, too. Regards, Jeff Davis
Вложения
В списке pgsql-hackers по дате отправления: