Re: Syncrep and improving latency due to WAL throttling

From: Andres Freund
Subject: Re: Syncrep and improving latency due to WAL throttling
Date:
Msg-id: 20231108212125.h46sdygzg55rooir@awork3.anarazel.de
In reply to: Re: Syncrep and improving latency due to WAL throttling  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
Hi,

On 2023-11-08 19:29:38 +0100, Tomas Vondra wrote:
> >>> I haven't checked, but I'd assume that 100bytes back and forth should easily
> >>> fit a new message to update LSNs and the existing feedback response. Even just
> >>> the difference between sending 100 bytes and sending 10k (a bit more than a
> >>> single WAL page) is pretty significant on a 1gbit network.
> >>>
> >>
> >> I'm on decaf so I may be a bit slow, but it's not very clear to me what
> >> conclusion to draw from these numbers. What is the takeaway?
> >>
> >> My understanding is that in both cases the latency is initially fairly
> >> stable, independent of the request size. This applies to requests up to
> >> ~1000B. And then the latency starts increasing fairly quickly, even
> >> though it shouldn't hit the bandwidth (except maybe the 1MB requests).
> >
> > Except for the smallest end, these are bandwidth related, I think. Converting
> > 1gbit/s to bytes/us is 125 bytes / us - before tcp/ip overhead. Even leaving
> > the overhead aside, 10kB/100kB outstanding take ~80us/800us to send on
> > 1gbit. If you subtract the minimum latency of about 130us, that's nearly all of
> > the latency.
> >
>
> Maybe I don't understand what you mean by "bandwidth related", but surely
> the smaller requests are not limited by bandwidth. I mean, 100B and 1kB
> (and even 10kB) requests have almost the same transaction rate, yet
> there's an order of magnitude difference in bandwidth (sure, there's
> overhead, but this much magnitude?).

What I mean is that bandwidth is the biggest factor determining latency in the
numbers I showed (because the packets are decent-sized and it is a local
network). At line rate it takes ~80us to send 10kB over 1gbit ethernet. So a
roundtrip cannot be faster than 80us, even if everything else added zero
latency.

That's why my numbers show so much lower latency for the 10gbit network - it's
simply faster to put even small-ish amounts of data onto the wire.
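
The arithmetic above can be reproduced with a minimal sketch. It assumes an
ideal link running at line rate and ignores TCP/IP and ethernet framing
overhead; the function name wire_time_us is made up for this illustration.

```python
def wire_time_us(payload_bytes: int, link_gbit_per_s: float) -> float:
    """Time to put payload_bytes onto the wire at line rate, in microseconds.

    Assumption: no protocol overhead, so this is a lower bound.
    """
    # 1 gbit/s = 1e9 bits/s = 125e6 bytes/s = 125 bytes/us
    bytes_per_us = link_gbit_per_s * 1e9 / 8 / 1e6
    return payload_bytes / bytes_per_us


# Compare the message sizes discussed in the thread on both link speeds.
for size in (100, 10_000, 100_000):
    print(f"{size:>7} B: {wire_time_us(size, 1):7.1f} us @ 1gbit, "
          f"{wire_time_us(size, 10):6.1f} us @ 10gbit")
```

For 10kB this gives about 80us on 1gbit and about 8us on 10gbit, which is the
gap the paragraph above refers to.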

That does not mean that the link is fully utilized over time - because we wait
for the other side to receive the data, wake up a user space process, send
back 100 bytes, wait for the data to be transmitted, and then wake up a
process, there are periods where the link in one direction is largely idle.
But in the case of a 10kB packet on the 1gbit network, yes, we are bandwidth
limited for ~80us (or perhaps more interestingly, we are bandwidth limited for
0.8ms when sending 100kB).
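
A rough sketch of why the link is not fully utilized even though each
roundtrip is bandwidth limited for part of its duration. This is a deliberately
simple additive model, not a measurement: it assumes one request in flight at a
time, a fixed base latency (the ~130us minimum mentioned earlier in the thread)
covering wakeups and propagation, and no protocol overhead. The function name
and the model itself are assumptions made for illustration.

```python
def ping_pong_model(request_bytes: int, response_bytes: int,
                    base_latency_us: float, link_gbit_per_s: float = 1.0):
    """Return (busy_fraction, round_trip_us) for a strict request/response loop.

    busy_fraction is the share of each roundtrip during which the
    sender-to-receiver direction is actually transmitting the request.
    """
    bytes_per_us = link_gbit_per_s * 125.0  # 1 gbit/s = 125 bytes/us
    request_us = request_bytes / bytes_per_us
    response_us = response_bytes / bytes_per_us
    # Assumed model: fixed overhead plus the wire time in each direction.
    round_trip_us = base_latency_us + request_us + response_us
    return request_us / round_trip_us, round_trip_us


for size in (10_000, 100_000):
    busy, rtt = ping_pong_model(size, 100, base_latency_us=130.0)
    print(f"{size:>7} B request: roundtrip ~{rtt:.0f} us, "
          f"send direction busy {busy:.0%} of the time")
```

Under these assumptions the send direction is busy for well under half of a
10kB roundtrip, yet the ~80us of wire time is still a large share of the total
latency.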

Greetings,

Andres Freund


