Re: Do you know the reason for increased max latency due to xlog scaling?

From Heikki Linnakangas
Subject Re: Do you know the reason for increased max latency due to xlog scaling?
Date
Msg-id 53039480.7060401@vmware.com
In reply to Re: Do you know the reason for increased max latency due to xlog scaling?  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Do you know the reason for increased max latency due to xlog scaling?  (Andres Freund <andres@2ndquadrant.com>)
Re: Do you know the reason for increased max latency due to xlog scaling?  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 02/18/2014 06:27 PM, Jeff Janes wrote:
> On Tue, Feb 18, 2014 at 3:49 AM, MauMau <maumau307@gmail.com> wrote:
>
>> --- or in other words, greater variance in response times.  With my simple
>> understanding, that sounds like a problem for response-sensitive users.
>
> If you need the throughput provided by 9.4, then using 9.3 gets lower
> variance simply by refusing to do 80% of the assigned work.  If you don't
> need the throughput provided by 9.4, then you probably have some natural
> throttling in place.
>
> If you want a real-world like test, you might try to crank up the -c and -j
> to the limit in 9.3 in a vain effort to match 9.4's performance, and see
> what that does to max latency.  (After all, that is what a naive web app is
> likely to do--continue to make more and more connections as requests come
> in faster than they can finish.)

You're missing MauMau's point. In essence, he's comparing two systems 
with the same number of clients, issuing queries as fast as they can, 
and one can do 2000 TPS while the other one can do 10000 TPS. You would 
expect the lower-throughput system to have a *higher* average latency. 
Each query takes longer, that's why the throughput is lower. If you look 
at the avg_latency columns in the graphs 
(http://hlinnaka.iki.fi/xloginsert-scaling/padding/), that's exactly 
what you see.
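
The relationship between throughput and average latency here is just Little's law applied to a closed-loop benchmark: with every client keeping exactly one query in flight, clients = TPS * average latency. A rough sketch of that arithmetic follows; the client count of 32 is a made-up value for illustration and is not taken from the tests.

```python
def avg_latency_ms(clients: int, tps: float) -> float:
    """Average per-query latency in a saturated closed-loop benchmark.

    Each client has one query in flight at a time, so by Little's law
    clients = tps * latency, and therefore latency = clients / tps.
    """
    if clients <= 0 or tps <= 0:
        raise ValueError("clients and tps must be positive")
    return clients / tps * 1000.0


CLIENTS = 32  # hypothetical client count, not from the thread

slow = avg_latency_ms(CLIENTS, 2000)    # the system doing 2000 TPS
fast = avg_latency_ms(CLIENTS, 10000)   # the system doing 10000 TPS

print(f"2000 TPS:  {slow:.1f} ms average latency")
print(f"10000 TPS: {fast:.1f} ms average latency")
```

With the client count held equal, the average latency scales inversely with TPS, so the 2000 TPS system necessarily has five times the average latency of the 10000 TPS one. This says nothing about the maximum latency.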

But what MauMau is pointing out is that the *max* latency is much higher 
in the system that can do 10000 TPS. So some queries are taking much 
longer, even though on average the latency is lower. In an ideal, 
totally fair system, each query would take the same amount of time to 
execute, and after it's saturated, increasing the number of clients just 
makes that constant latency higher.

Yeah, I'm pretty sure that's because of the extra checkpoints. If you 
look at the individual test graphs, there are clear spikes in latency, 
but the latency is otherwise small. With a higher TPS, you reach 
checkpoint_segments quicker; I should've eliminated that effect in the 
tests I ran...
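
To put a rough number on "quicker": a segment-triggered checkpoint happens once checkpoint_segments WAL segments (16 MB each by default) have been written, so the interval shrinks in proportion to the WAL generation rate. A back-of-the-envelope sketch, where the checkpoint_segments value and the WAL bytes per transaction are assumed figures rather than the settings used in the tests, and checkpoint_timeout and full-page writes are ignored:

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (16 MB)


def seconds_between_checkpoints(checkpoint_segments: int,
                                tps: float,
                                wal_bytes_per_txn: float) -> float:
    """Estimate the time between segment-triggered checkpoints.

    Simplification: assumes a constant amount of WAL per transaction and
    ignores checkpoint_timeout and the extra WAL from full-page writes.
    """
    wal_bytes_per_second = tps * wal_bytes_per_txn
    return checkpoint_segments * WAL_SEGMENT_BYTES / wal_bytes_per_second


CHECKPOINT_SEGMENTS = 64   # assumed setting, not from the thread
WAL_BYTES_PER_TXN = 1024   # assumed average, not measured

slow_interval = seconds_between_checkpoints(
    CHECKPOINT_SEGMENTS, 2000, WAL_BYTES_PER_TXN)
fast_interval = seconds_between_checkpoints(
    CHECKPOINT_SEGMENTS, 10000, WAL_BYTES_PER_TXN)

print(f"2000 TPS:  checkpoint roughly every {slow_interval:.0f} s")
print(f"10000 TPS: checkpoint roughly every {fast_interval:.0f} s")
```

Whatever the real per-transaction WAL volume is, five times the TPS means checkpoints arrive about five times as often in a fixed-duration run, which fits the extra latency spikes.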

- Heikki


