Re: rand48 replacement
| От | Fabien COELHO | 
|---|---|
| Тема | Re: rand48 replacement | 
| Дата | |
| Msg-id | alpine.DEB.2.22.394.2107022340250.2359988@pseudo обсуждение исходный текст | 
| Ответ на | Re: rand48 replacement (Dean Rasheed <dean.a.rasheed@gmail.com>) | 
| Ответы | Re: rand48 replacement | 
| Список | pgsql-hackers | 
Hello Dean, > It may be true that the bias is of the same magnitude as FP multiply, > but it is not of the same nature. With FP multiply, the > more-likely-to-be-chosen values are more-or-less evenly distributed > across the range, whereas modulo concentrates them all at one end, > making it more likely to bias test results. Yes, that is true. > It's worth paying attention to how other languages/libraries implement > this, and basically no one chooses the modulo method, which ought to > raise alarm bells. Of the biased methods, it has the worst kind of > bias and the worst performance. Hmmm. That is not exactly how I interpreted the figures in the blog. > If a biased method is OK, then the biased integer multiply method > seems to be the clear winner. This requires the high part of a > 64x64-bit product, which is trivial if 128-bit integers are available, > but would need a little more work otherwise. There's some code in > common/d2s that might be suitable. And yes, modulo is expensive. If we allow 128 bits integers operations, I would not choose this RNPG in the first place, I'd take PCG with a 128 bits state. That does not change the discussion about bias, though. > Most other implementations tend to use an unbiased method though, and I > think it's worth doing the same. It might be a bit slower, or even > faster depending on implementation and platform, but in the context of > the DB as a whole, I don't think a few extra cycles matters either way. Ok ok ok, I surrender! > The method recommended at the very end of that blog seems to be pretty > good all round, but any of the other commonly used unbiased methods > would probably be OK too. That does not address my other issues with the proposed methods, in particular the fact that the generated sequence is less deterministic, but I think I have a simple way around that. I'm hesitating to skip to the the bitmask method, and give up performance uniformity. I'll try to come up with something over the week-end, and also address Tom's comments in passing. -- Fabien.
В списке pgsql-hackers по дате отправления: