General purpose hashing func in pgbench
| From | Ildar Musin |
|---|---|
| Subject | General purpose hashing func in pgbench |
| Date | |
| Msg-id | 0e8bd39e-dfcd-2879-f88f-272799ad7ef2@postgrespro.ru |
| Responses | Re: General purpose hashing func in pgbench |
| List | pgsql-hackers |
Hi hackers,

Following up on the recent discussion of the zipfian distribution, I have been trying to reproduce some YCSB-like workloads. As this paper [1] describes, YCSB uses a zipfian distribution to generate keys in order to simulate intensive load on a small number of records, as happens in real-world applications (e.g. blogs). One problem is that the keys of the most popular records are clustered together. To scatter them across the keyspace, the authors use hashing, the FNV-1a hash function in particular [2].

I've read Fabien Coelho's thread on additional operators and functions. In general, it would be possible to implement some basic hashing algorithms right in a pgbench script using just bitwise and arithmetic operators. But perhaps we should provide users with a general-purpose hash function? The attached patch introduces a hash() function, which implements FNV-1a as an example of such a hashing algorithm. There are also a couple of images in the attachment that I got from visualizing the original zipfian distribution and the hashed one.

Usage example:

In psql:

    create table abc as select generate_series(0, 999) as a, 0 as b;

pgbench script:

    \set rnd random_zipfian(0, 1000000, 0.99)
    \set key abs(hash(:rnd)) % 1000

    begin;
    update abc set b = b + 1 where a = :key;
    end;

Any thoughts or suggestions?

[1] http://www.brianfrankcooper.net/home/publications/ycsb.pdf
[2] https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function

Thanks!

--
Ildar Musin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
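To make the idea behind `abs(hash(:rnd)) % 1000` concrete, here is a minimal Python sketch of 64-bit FNV-1a and of the key-scattering step. It is only an illustration, not the code from the patch: the 64-bit variant, the choice to hash the eight little-endian bytes of the integer, and the helper names `hash_int` and `scatter_key` are all assumptions made for this example.

```python
# Standard 64-bit FNV-1a parameters.
FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Return the unsigned 64-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte                         # FNV-1a: XOR first ...
        h = (h * FNV_PRIME_64) & MASK_64  # ... then multiply, modulo 2^64
    return h


def hash_int(value: int) -> int:
    """Hash a signed 64-bit integer (hypothetical helper).

    Assumption: the integer is fed to FNV-1a as 8 little-endian bytes.
    The patch may use a different byte order or width.
    """
    return fnv1a_64(value.to_bytes(8, "little", signed=True))


def scatter_key(rnd: int, nkeys: int = 1000) -> int:
    """Mimic abs(hash(:rnd)) % nkeys from the pgbench script (hypothetical helper)."""
    h = hash_int(rnd)
    if h >= 1 << 63:   # reinterpret the unsigned result as a signed 64-bit value
        h -= 1 << 64
    return abs(h) % nkeys


if __name__ == "__main__":
    # Neighbouring zipfian ranks are mapped to keys in [0, 1000).
    for rnd in range(5):
        print(rnd, scatter_key(rnd))
```

Hashing does not change how often each zipfian rank is drawn, only which key it is mapped to, so the hot ranks end up spread over the table instead of sitting next to each other. Since 1,000,001 possible ranks are folded onto 1000 keys, collisions are expected.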

