Re: General purpose hashing func in pgbench
От | Ildar Musin |
---|---|
Тема | Re: General purpose hashing func in pgbench |
Дата | |
Msg-id | 960a8096-dde5-5175-7d7c-ef151086cbee@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: General purpose hashing func in pgbench (Fabien COELHO <coelho@cri.ensmp.fr>) |
Ответы |
Re: General purpose hashing func in pgbench
|
Список | pgsql-hackers |
Hello Fabien, 17/01/2018 10:52, Fabien COELHO пишет: >> Here is a new version of patch. I've splitted it into two parts. The >> first one is almost the same as v4 from [1] with some refactoring. >> The second part introduces random_seed variable as you proposed. > > Patch 1 applies. Compilations fails, there are two "hash_seed" > declared in "pgbench.c". > > Patch 2 applies cleanly on top of the previous patch and compiles, > because the variable is removed... > > If an ":hash_seed" pgbench variable is used, ISTM that there is no > need for a global variable at all, so the two patches are going back > and forth, which is unhelpful. ISTM better to provide just one > combined patch for the feature. > > If the hash_seed variable really needs to be kept, it should be an > "int64" variable, like other pgbench values. > > The len < 1 || len > 2 is checked twice, once in the "switch", on in > an "if" just after the "switch". Once is enough. I totally messed up doing git rebase and didn't double check the code. *facepalm* There shouldn't be hash_seed variable and the second 'len < 1 || len > 2' check. Sorry for that, fixed in the attached patch. > Calling random just usually initializes about 31 bits, so random > should be called 2 or maybe 3 times? Or maybe use the internal getrand > which has 48 pseudorandom bits? Done. I used code from get_random_uint64() as an example. > > For me "random seed" is what is passed to srandom, so the variable > should rather be named hash_seed because there could also be a random > seed (actually, there is in another parallel patch:-). Moreover, this > seed may or may not be random if set, so calling it "random_seed" is > not desirable. > My intention was to introduce seed variable which potentially could be used in different contexts, not only for hash functions. I renamed it to 'hash_seed' for now. But what do you think about naming it simply 'seed' or choosing some other general name? >> I didn't do the executor simplification thing yet because I'm a >> little concerned about inventive users, who may want to change >> random_seed variable in runtime (which is possible since pgbench >> doesn't have read only variables aka constants AFAIK). > > If the user choses to overide hash_seed in their script, it is their > decision, the documentation has only to be clear about :hash_seed > being the default seed. I see no clear reason to work around this > possibility by evaluating the seed at parse time, especially as the > variable may not have its final value yet depending on the option > order. I'd suggest to just use make_variable("hash_seed") for the > default second argument and simplify the executor. That is a great idea, I didn't see that possibility. Done. > > The seed variable is not tested explicitely in the script, you could add > a "hash(5432) == hash(5432, :hash_seed)" for instance. > > It would be nice if an native English speaker could proofread the > documentation text. I'd suggest: "*an* optional seed parameter". "In > case *the* seed...". "<literal>:hash_seed</literal>". "shared for" -> > "shared by". "following listing" -> "following pgbench script". "few > accounts generates" -> "few accounts generate". > Done as well. > For the document example, I'd use larger values for the random & > modulo, eg 100000000 and 1000000. The drawback is that zipfian does a > costly computation of the generalized harmonic number when the > parameter is lower than 1.0. For cities, the parameter found by Zipf > was 1.07 (says Wikipedia). Maybe use this historical value. Or maybe > use an exponential distribution in the example. > Changed parameter to 1.07. Thanks! -- Ildar Musin Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Claudio FreireДата:
Сообщение: Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem