Re: pgbench - allow to specify scale as a size
From | Fabien COELHO |
---|---|
Subject | Re: pgbench - allow to specify scale as a size |
Date | |
Msg-id | alpine.DEB.2.20.1802190832140.10483@lancre |
In response to | Re: pgbench - allow to specify scale as a size (Alvaro Hernandez <aht@ongres.com>) |
List | pgsql-hackers |

Hello Alvaro & Tom,

>>> Why not then insert a "few" rows, measure size, truncate the table,
>>> compute the formula and then insert to the desired user requested
>>> size? (or insert what should be the minimum, scale 1, measure, and
>>> extrapolate what's missing). It doesn't sound too complicated to me,
>>> and targeting a size is something that I believe it's quite good for
>>> user.
>>
>> The formula I used approximates the whole database, not just one table.
>> There was one for the table, but this is only part of the issue. In
>> particular, ISTM that index sizes should be included when caching is
>> considered.
>>
>> Also, index sizes are probably in n ln(n), so some level of
>> approximation is inevitable.
>>
>> Moreover, the intrinsic granularity of TPC-B as multiple of 100,000
>> rows makes it not very precise wrt size anyway.
>
> Sure, makes sense, so my second suggestion seems more reasonable: insert
> with scale 1, measure there (ok, you might need to create indexes only to
> later drop them), and if computed scale > 1 then insert whatever is left
> to insert. Shouldn't be a big deal to me.

I could implement that, even if it would lead to some approximation
nevertheless: ISTM that the very large scale regression performed by
Kaarel is significantly more precise than testing with scale 1 (typically
a few MiB) and extrapolating that to hundreds of GiB.

Maybe it could be done with a kind of open-ended dichotomy, but creating
and recreating indexes looks like an ugly solution, and what should be
significant is the whole database size, including the tellers & branches
tables and all indexes, so I'm not convinced.

Now, as the tellers & branches tables have basically the same structure as
accounts, they could just be scaled by assuming that they would incur the
same storage per row.

Anyway, even if I do not like it, it could be better than nothing. The key
point for me is that if Tom is dead set against the feature, the patch is
dead anyway.

Tom, would Alvaro's approach be more admissible to you than a fixed formula
that would need updating, keeping in mind that such a feature implies some
level of approximation?

-- 
Fabien.
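
For readers following the thread, the "load scale 1, measure, extrapolate" idea discussed above can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the patch: the function names and the 15 MiB measurement are hypothetical, and the model is purely linear, so it ignores the n ln(n) index growth mentioned in the message.

```python
import math

# One pgbench scale unit corresponds to 100,000 accounts rows
# (the TPC-B granularity mentioned in the message).
ROWS_PER_SCALE = 100_000


def scale_for_target(target_bytes: int, bytes_at_scale_1: int) -> int:
    """Hypothetical helper: linearly extrapolate a scale from a scale-1 probe.

    Assumes every additional scale unit costs as much storage as the first,
    which is only an approximation once indexes are taken into account.
    """
    if target_bytes <= 0 or bytes_at_scale_1 <= 0:
        raise ValueError("sizes must be positive")
    # Round up so the database is at least as large as requested,
    # and never go below the scale-1 probe that is already loaded.
    return max(1, math.ceil(target_bytes / bytes_at_scale_1))


def remaining_rows(scale: int) -> int:
    """Hypothetical helper: accounts rows still to insert after the probe."""
    return (scale - 1) * ROWS_PER_SCALE


if __name__ == "__main__":
    MIB = 2 ** 20
    measured = 15 * MIB   # made-up measurement of the scale-1 database
    target = 100 * MIB    # size requested by the user
    scale = scale_for_target(target, measured)
    print(scale, remaining_rows(scale), scale * measured / MIB)
```

With these made-up numbers the requested 100 MiB rounds up to a whole scale unit, so the linear estimate of the resulting size overshoots the target. That is the granularity issue raised in the quoted text, separate from any error in the scale-1 measurement itself.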