Re: pgbench - exclude pthread_create() from connection start timing

From Fabien COELHO
Subject Re: pgbench - exclude pthread_create() from connection start timing
Date
Msg-id alpine.DEB.2.02.1309260852540.29589@sto
In response to Re: pgbench - exclude pthread_create() from connection start timing  (Noah Misch <noah@leadboat.com>)
Responses Re: pgbench - exclude pthread_create() from connection start timing  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
>> pgbench changes, when adding the throttling stuff. Having the start time
>> taken when the thread really starts is just sanity, and I needed that
>> just to rule out that it was the source of the "strange" measures.
>
> I don't get it; why is taking the time just after pthread_create() more sane
> than taking it just before pthread_create()?

Thread creation time seems to be expensive as well, maybe up to 0.1 seconds
under some conditions (?). Under --rate, this creation delay means that
throttling is lagging behind schedule by about that time, so all the first
transactions are trying to catch up.
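
To put a rough number on that catch-up effect, here is a minimal sketch. It assumes the throttle schedule is anchored at a timestamp taken before the thread actually runs; schedule_deficit() is a hypothetical helper, not a pgbench function, and it gives only the average backlog.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of pgbench: average number of
 * transactions that are already overdue when a thread starts
 * startup_delay_sec seconds after the origin of its throttle schedule,
 * for a target rate of rate_tps transactions per second.
 */
double schedule_deficit(double rate_tps, double startup_delay_sec)
{
    assert(rate_tps >= 0.0);
    assert(startup_delay_sec >= 0.0);
    return rate_tps * startup_delay_sec;
}
```

For example, with a 0.1 second delay and a target of 1000 transactions per second, schedule_deficit(1000.0, 0.1) comes to about 100 transactions that the thread must work through before it is back on schedule.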

> typically far more expensive than pthread_create().  The patch for threaded
> pgbench made the decision to account for pthread_create() as though it were
> part of establishing the connection.  You're proposing to not account for it
> at all.  Both of those designs are reasonable to me, but I do not comprehend the
> benefit you anticipate from switching from one to the other.
>
>> -j 800 vs -j 100 : ISTM that if you create more threads, the time delay
>> incurred is cumulative, so the strangeness of the result should worsen.
>
> Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before
> calling pthread_create().  However, thread 0 is a special case; we set its
> start time first and actually start it last.  Your observation of cumulative
> delay fits those facts.

Yep, that must be thread 0 which has a very large delay. I think it is
simpler that each thread records its start time when it has started,
without exception.

>  Initializing the thread-0 start time later, just before calling its 
> threadRun(), should clear this anomaly without changing other aspects of 
> the measurement.

Always taking the thread start time when the thread is started does solve 
the issue as well, and it is homogeneous for all cases, so the solution I 
suggest seems reasonable and simple.
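
A minimal sketch of that idea, assuming POSIX threads and CLOCK_MONOTONIC. The names worker_t, worker_run() and run_workers() are hypothetical and this is not the pgbench code; as described above, thread 0 is run in the calling thread after the others have been launched, and it takes its timestamp the same way as the others.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical per-thread state; the real pgbench struct differs. */
typedef struct {
    pthread_t handle;
    struct timespec start_time;   /* written by the thread itself */
} worker_t;

static void *worker_run(void *arg)
{
    worker_t *w = arg;

    /* First action of every thread: record its own start time. */
    clock_gettime(CLOCK_MONOTONIC, &w->start_time);
    /* ... connection setup and transactions would follow here ... */
    return NULL;
}

static double seconds_between(const struct timespec *from,
                              const struct timespec *to)
{
    return (double) (to->tv_sec - from->tv_sec)
         + (double) (to->tv_nsec - from->tv_nsec) / 1e9;
}

/*
 * Launch nthreads workers and store, for each one, how long after the
 * launch began it recorded its start time. Returns 0 on success and
 * -1 on failure.
 */
int run_workers(int nthreads, double *start_offsets_sec)
{
    struct timespec launch;
    worker_t *workers;
    int i, created = 0, rc = 0;

    assert(nthreads > 0 && start_offsets_sec != NULL);
    workers = calloc((size_t) nthreads, sizeof(worker_t));
    if (workers == NULL)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &launch);

    /* Threads 1..n-1 get real threads. */
    for (i = 1; i < nthreads; i++) {
        if (pthread_create(&workers[i].handle, NULL,
                           worker_run, &workers[i]) != 0) {
            rc = -1;
            break;
        }
        created = i;
    }

    /* Thread 0 runs in the calling thread, last, with no special case. */
    if (rc == 0)
        worker_run(&workers[0]);

    for (i = 1; i <= created; i++)
        pthread_join(workers[i].handle, NULL);

    if (rc == 0)
        for (i = 0; i < nthreads; i++)
            start_offsets_sec[i] =
                seconds_between(&launch, &workers[i].start_time);

    free(workers);
    return rc;
}
```

Calling run_workers(4, offsets) with a double offsets[4] fills in one offset per thread. Because the timestamp is taken inside worker_run(), the time spent in pthread_create() for the other threads never ends up inside thread 0's measured run time.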

> While pondering this area of the code, it occurs to me -- shouldn't we 
> initialize the throttle rate trigger later, after establishing 
> connections and sending startup queries?  As it stands, we build up a 
> schedule deficit during those tasks.  Was that deliberate?

In principle, I agree with you.

The connection creation time is another thing, but it depends on the
options set. Under some options the connection is opened and closed for
every transaction, so there is no point in excluding it from the measurement
or from the scheduling, and I want to avoid having to distinguish those cases.
Moreover, ISTM that one of the threads reuses the existing connection while
the others recreate it. So I left it "as is".

-- 
Fabien.


