Re: PATCH: pgbench - merging transaction logs

From: didier
Subject: Re: PATCH: pgbench - merging transaction logs
Date:
Msg-id: CAJRYxuJNJgqK_eyYJwNfhzdnQA-cYmMsczSGwLw5BrzHAz2RXw@mail.gmail.com
In reply to: Re: PATCH: pgbench - merging transaction logs  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses: Re: PATCH: pgbench - merging transaction logs  (Fabien COELHO <coelho@cri.ensmp.fr>)
List: pgsql-hackers
Hi,

On Sat, Mar 21, 2015 at 8:42 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
> Hello Didier,
>
>>> If fprintf takes p = 0.025 (1/40) of the time, then with 2 threads the
>>> collision probability would be about 1/40 and the delayed thread would be
>>> waiting for half this time on average, so the performance impact due to
>>> fprintf locking would be negligible (a 1/80 delay occurred in 1/40 cases =>
>>> 1/3200 time added on the computed average, if I'm not mistaken).
Yes, but for a third thread (each on its own physical core) it will be 1/40 +
1/40, and so on, up to roughly 40/40 for 40 cores.
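For what it's worth, here is a minimal sketch of that back-of-the-envelope
scaling (illustrative only, not pgbench code; the 1/40 figure is just the
assumption from the quoted paragraph): if each of n threads spends a fraction
p of its time holding the log lock, the probability that the lock is already
busy when another thread tries to log is 1 - (1 - p)^(n - 1), roughly
(n - 1) * p for small p, so it climbs towards 1 as the core count grows.

/* Collision-probability sketch, illustrative only (not pgbench code).
 * Build: cc -O2 contention.c -lm
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 1.0 / 40.0;   /* fraction of time spent inside fprintf, as assumed above */
    int n;

    for (n = 2; n <= 64; n *= 2)
        printf("threads=%2d  P(lock busy) ~ %.3f\n",
               n, 1.0 - pow(1.0 - p, n - 1));
    return 0;
}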

>
>
>> If threads run more or less the same code with the same timing, after
>> a while they will lockstep on synchronization primitives and your
>> collision probability will be very close to 1.
>
>
> I'm not sure I understand. If transaction times were really constant, then
> after a while the mutexes would be synchronised so as to avoid contention,
> i.e. the collision probability would be 0?
But they aren't constant, only close. It may or may not show up in this
case, but I've noticed that the collision rate is often a lot higher
than the probability would suggest; I'm not sure why.

>
>> Moreover, they will write to the same cache lines for every fprintf,
>> and this is very, very bad even without atomic operations.
>
>
> We're talking of transactions that involve network messages and possibly
> disk IOs on the server, so cache issues within pgbench would not
> a priori be the main performance driver.
Sure, but:
- good measurement is hard, and adding locking around fprintf makes its
timing noisier.

- it's against 'good practices' for scalable code. Trivial code (see the
sketch below) can show that the elapsed time for as few as four cores
writing to the same cache line in a loop, without any locking or
synchronization, is greater than the elapsed time for running those four
loops sequentially on one core. If they write to different cache lines it
scales linearly.
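To illustrate that second point, here is a trivial, self-contained sketch of
the false-sharing effect (my own toy example, not from pgbench or the patch;
thread count, iteration count and stride are arbitrary). With STRIDE set to 1
the four counters land on the same 64-byte cache line; with STRIDE set to 8
each 8-byte counter sits on its own line, and the loop runs far faster even
though nothing is locked or atomic in either case.

/* False-sharing sketch, illustrative only (not pgbench code).
 * Alignment of the array start is ignored for brevity.
 * Build: cc -O2 -pthread false_sharing.c
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NTHREADS 4
#define ITERS    100000000UL
#define STRIDE   1              /* set to 8 to put each counter on its own cache line */

static uint64_t counters[NTHREADS * 8];

static void *worker(void *arg)
{
    volatile uint64_t *slot = arg;
    uint64_t i;

    for (i = 0; i < ITERS; i++)
        (*slot)++;              /* plain store, no lock, no atomic */
    return NULL;
}

int main(void)
{
    pthread_t threads[NTHREADS];
    struct timespec start, end;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < NTHREADS; i++)
        pthread_create(&threads[i], NULL, worker, &counters[i * STRIDE]);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(threads[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    printf("stride=%d  elapsed=%.2fs\n", STRIDE,
           (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9);
    return 0;
}

In my experience the STRIDE 1 run is several times slower than running the
four loops one after another on a single core, which is the point about
per-thread log files versus a single shared fprintf target.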

Regards
Didier


