Re: pgbench throttling latency limit

From: Heikki Linnakangas
Subject: Re: pgbench throttling latency limit
Date:
Msg-id: 54106E21.8020701@vmware.com
In reply to: Re: pgbench throttling latency limit  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses: Re: pgbench throttling latency limit  (Jan Wieck <jan@wi3ck.info>)
List: pgsql-hackers
On 09/10/2014 05:57 PM, Fabien COELHO wrote:
>
> Hello Heikki,
>
>> I looked closer at this, and per Jan's comments, realized that we don't
>> log the lag time in the per-transaction log file. I think that's a serious
>> omission; when --rate is used, the schedule lag time is important information
>> to make sense of the result. I think we have to apply the attached patch, and
>> backpatch to 9.4. (The documentation on the log file format needs updating)
>
> Indeed. I think that people do not like it to change. I remember that I
> suggested changing the timestamps to "xxxx.yyyyyy" instead of the unreadable
> "xxxx yyy", and was told not to, because some people have tools which
> process the output, so the format MUST NOT CHANGE. So my behavior now is to
> avoid touching anything in this area.
>
> I'm fine if you do it, though :-) However I have no time to take a close
> look at your patch to cross-check it before Friday.

This is different from changing "xxx yyy" to "xxx.yyy" in two ways. 
First, this adds new information to the log that just isn't there 
otherwise; it's not just changing the format for the sake of it. Second, 
in this case it's easy to write a parser for the log format so that it 
works with the old and new formats. Many such programs would probably 
ignore any unexpected extra fields, as a matter of lazy programming, 
while changing the separator from space to a dot would surely require 
changing every parsing program.

We could leave out the lag fields, though, when --rate is not used. 
Without --rate, the lag is always zero anyway. That would keep the file 
format unchanged, when you don't use the new --rate feature. I'm not 
sure if that would be better or worse for programs that might want to 
parse the information.
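
For illustration, here's a rough sketch of such a lag-tolerant parser. The 
column layout is an assumption for the example (whitespace-separated fields, 
with the schedule lag appended as an optional trailing field); it's not 
lifted from the patch:

#include <stdio.h>

int
main(void)
{
    char        line[1024];

    while (fgets(line, sizeof(line), stdin))
    {
        int         client_id;
        int         tx_no;
        long        latency_us;
        long        lag_us = 0;     /* stays 0 for old-format lines */
        int         nfields;

        /*
         * Suppressed (%*...) conversions don't count towards sscanf's
         * return value, so nfields is 3 for the old format and 4 when a
         * trailing lag field is present.
         */
        nfields = sscanf(line, "%d %d %ld %*d %*ld %*ld %ld",
                         &client_id, &tx_no, &latency_us, &lag_us);
        if (nfields < 3)
            continue;               /* not a per-transaction line */

        printf("client %d xact %d latency %ld us lag %ld us\n",
               client_id, tx_no, latency_us, lag_us);
    }
    return 0;
}

A parser written like that keeps working whether or not the lag column is 
present, which is the point: appending a field is much less disruptive than 
changing the separator.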

>> Also, this is bizarre:
>>
>>> int64 wait = (int64) (throttle_delay *
>>>    1.00055271703 * -log(getrand(thread, 1, 10000) / 10000.0));
>>
>> We're using getrand() to generate a uniformly distributed random value
>> between 1 and 10000, and then convert it to a double between 0.0 and 1.0.
>
> The reason for this conversion is to have randomness while still avoiding
> extreme multiplier values. The idea is to avoid a very large multiplier
> which would generate (even if not often) 0 tps when 100 tps is required.
> The 10000 granularity is basically arbitrary, but the multiplier stays
> bounded (max 9.2, so the minimum possible tps would be around 11 for a
> target of 100 tps, barring issues from the database in processing the
> transactions).
>
> So although this feature can be discussed and amended, it is fully
> deliberate. I think that it makes sense, so I would prefer to keep it as
> is. Maybe the comments could be updated to be clearer.

Ok, yeah, the comments indeed didn't mention anything about that. I 
don't think such clamping is necessary. With that 9.2x clamp on the 
multiplier, the probability that any given transaction hits it is about 
1/10000. And a delay 9.2 times the average is still quite reasonable, 
IMHO. The maximum delay on my laptop, when pg_erand48() returns DBL_MIN, 
seems to be about 700x the average, which is still reasonable if you 
run a decent number of transactions. And of course, the probability of 
hitting such an extreme value is minuscule, so if you're just doing a 
few quick test runs with a small total number of transactions, you won't 
hit that.
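
To put some numbers on this, here is a rough standalone sketch of the two 
variants (clamped as in the current code, and the plain exponential). 
uniform_sample() is only a stand-in for pgbench's getrand()/pg_erand48(), 
so this illustrates the math rather than the actual patch:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* placeholder for a uniform draw in (0, 1); pgbench uses pg_erand48() */
static double
uniform_sample(void)
{
    return (rand() + 1.0) / ((double) RAND_MAX + 2.0);
}

/*
 * Current code: a uniform integer in 1..10000 caps the multiplier at
 * -log(1/10000) ~= 9.21; the 1.00055271703 factor compensates for the
 * truncation so the mean stays at throttle_delay.
 */
static int64_t
throttle_wait_clamped(int64_t throttle_delay)
{
    int         r = 1 + (int) (uniform_sample() * 10000.0);    /* 1..10000 */

    return (int64_t) (throttle_delay *
                      1.00055271703 * -log(r / 10000.0));
}

/*
 * Unclamped variant: with pg_erand48() the multiplier could in principle
 * reach about -log(DBL_MIN) ~= 708, hence the "700x the average" figure,
 * but the mean is still throttle_delay.
 */
static int64_t
throttle_wait(int64_t throttle_delay)
{
    return (int64_t) (throttle_delay * -log(uniform_sample()));
}

int
main(void)
{
    int64_t     sum_clamped = 0;
    int64_t     sum_plain = 0;
    int         i;

    for (i = 0; i < 1000000; i++)
    {
        sum_clamped += throttle_wait_clamped(10000);    /* 100 tps target */
        sum_plain += throttle_wait(10000);
    }
    printf("clamped mean %.1f us, unclamped mean %.1f us\n",
           sum_clamped / 1000000.0, sum_plain / 1000000.0);
    return 0;
}

Both variants should average out to roughly throttle_delay; the clamp only 
changes how long the tail can get.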

- Heikki
