Обсуждение: pgsql: Add Bloom filter implementation.
Add Bloom filter implementation. A Bloom filter is a space-efficient, probabilistic data structure that can be used to test set membership. Callers will sometimes incur false positives, but never false negatives. The rate of false positives is a function of the total number of elements and the amount of memory available for the Bloom filter. Two classic applications of Bloom filters are cache filtering, and data synchronization testing. Any user of Bloom filters must accept the possibility of false positives as a cost worth paying for the benefit in space efficiency. This commit adds a test harness extension module, test_bloomfilter. It can be used to get a sense of how the Bloom filter implementation performs under varying conditions. This is infrastructure for the upcoming "heapallindexed" amcheck patch, which verifies the consistency of a heap relation against one of its indexes. Author: Peter Geoghegan Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/51bc271790eb234a1ba4d14d3e6530f70de92ab5 Modified Files -------------- src/backend/lib/Makefile | 4 +- src/backend/lib/README | 2 + src/backend/lib/bloomfilter.c | 305 +++++++++++++++++++++ src/include/lib/bloomfilter.h | 27 ++ src/test/modules/Makefile | 1 + src/test/modules/test_bloomfilter/.gitignore | 4 + src/test/modules/test_bloomfilter/Makefile | 21 ++ src/test/modules/test_bloomfilter/README | 68 +++++ .../test_bloomfilter/expected/test_bloomfilter.out | 22 ++ .../test_bloomfilter/sql/test_bloomfilter.sql | 19 ++ .../test_bloomfilter/test_bloomfilter--1.0.sql | 11 + .../modules/test_bloomfilter/test_bloomfilter.c | 138 ++++++++++ .../test_bloomfilter/test_bloomfilter.control | 4 + src/tools/pgindent/typedefs.list | 1 + 14 files changed, 625 insertions(+), 2 deletions(-)
On 3/31/18 21:36, Andres Freund wrote: > Add Bloom filter implementation. There is a compiler warning in the tests, with gcc-7: test_bloomfilter.c: In function 'test_bloomfilter': test_bloomfilter.c:40:38: error: '__builtin_snprintf' output may be truncated before the last format character [-Werror=format-truncation=] snprintf(element, sizeof(element), "i" INT64_FORMAT, i); ^ This can be fixed by allocating one more byte for the possible sign, or doing the whole thing in unsigned. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Apr 3, 2018 at 8:39 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > There is a compiler warning in the tests, with gcc-7: > > test_bloomfilter.c: In function 'test_bloomfilter': > test_bloomfilter.c:40:38: error: '__builtin_snprintf' output may be > truncated before the last format character [-Werror=format-truncation=] > snprintf(element, sizeof(element), "i" INT64_FORMAT, i); > ^ > > This can be fixed by allocating one more byte for the possible sign, or > doing the whole thing in unsigned. I don't want to do the whole thing in unsigned, because this ties back fairly directly to an int8 argument from the test_bloomfilter() SQL function interface. I proposed the attached, which makes the buffer one byte larger, per your suggestion. Thanks -- Peter Geoghegan
Вложения
On 2018-04-03 10:38:57 -0700, Peter Geoghegan wrote: > I proposed the attached, which makes the buffer one byte larger, per > your suggestion. Pushed, thanks. - Andres