Re: Flushing large data immediately in pqcomm

From Melih Mutlu
Subject Re: Flushing large data immediately in pqcomm
Date
Msg-id CAGPVpCTN-Z9F5Wsq+LEirAv5OYZWPj1j7mE_Yumqxebsdu8YKw@mail.gmail.com
In response to Re: Flushing large data immediately in pqcomm (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Flushing large data immediately in pqcomm (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On Wed, 31 Jan 2024 at 20:23, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 30, 2024 at 6:39 PM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> > I agree that it's hard to prove that such heuristics will always be
> > better in practice than the status quo. But I feel like we shouldn't
> > let perfect be the enemy of good here.
>
> Sure, I agree.
>
> > I think one approach that is a clear
> > improvement over the status quo is:
> > 1. If the buffer is empty AND the data we are trying to send is larger
> > than the buffer size, then don't use the buffer.
> > 2. If not, fill up the buffer first (just like we do now) then send
> > that. And if the left over data is then still larger than the buffer,
> > then now the buffer is empty so 1. applies.
>
> That seems like it might be a useful refinement of Melih Mutlu's
> original proposal, but consider a message stream that consists of
> messages exactly 8kB in size. If that message stream begins when the
> buffer is empty, all messages are sent directly. If it begins when
> there are any number of bytes in the buffer, we buffer every message
> forever. That's kind of an odd artifact, but maybe it's fine in
> practice. I say again that it's good to test out a bunch of scenarios
> and see what shakes out.

Isn't this already the case? Imagine sending messages of exactly 8kB each. The first pq_putmessage() call buffers 8kB. Every later call first sends the 8kB message buffered by the previous call and then buffers the new one. The only difference is that today each message sits in the buffer for a while instead of being sent directly. In theory, the proposed idea should change neither the number of flushes nor the amount of data sent in each one, but in this case it can remove the unnecessary copies into the buffer. I guess the behaviour is also the same with or without the patch when the buffer already contains some bytes.
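
To make the comparison concrete, here is a rough toy model in Python. It is not PostgreSQL code, only a sketch of a fixed-size send buffer under stated assumptions. The names simulate, Stats, BUF_SIZE, bypass and initial_fill are all made up for illustration. The bypass rule is assumed to fire when the remaining data is at least as large as the buffer (>=), which is what the "exactly 8kB" scenario quoted above implies. The model ignores message headers and partial socket writes.

```python
from dataclasses import dataclass, field

BUF_SIZE = 8192  # assumed 8kB send buffer, as in the example in this thread


@dataclass
class Stats:
    sends: list = field(default_factory=list)  # size of each write to the socket
    copied: int = 0    # bytes copied into the buffer for the simulated messages
    pending: int = 0   # bytes still sitting in the buffer at the end


def simulate(msg_sizes, initial_fill=0, bypass=False, buf_size=BUF_SIZE):
    """Toy model of a send buffer (hypothetical, not the real pqcomm code).

    bypass=False: always copy into the buffer; flush only when the buffer
                  is full and more data arrives.
    bypass=True:  additionally, if the buffer is empty and the remaining
                  data is >= buf_size (assumed threshold), send it directly.
    """
    stats = Stats()
    used = initial_fill  # bytes already buffered before the stream starts
    for size in msg_sizes:
        remaining = size
        while remaining > 0:
            if used >= buf_size:
                # Buffer is full: flush it before accepting more data.
                stats.sends.append(used)
                used = 0
            if bypass and used == 0 and remaining >= buf_size:
                # Proposed shortcut: skip the buffer, no copy needed.
                stats.sends.append(remaining)
                remaining = 0
                continue
            chunk = min(buf_size - used, remaining)
            used += chunk
            stats.copied += chunk
            remaining -= chunk
    stats.pending = used
    return stats


if __name__ == "__main__":
    stream = [BUF_SIZE] * 4  # four messages of exactly 8kB
    for fill in (0, 100):
        for bypass in (False, True):
            print(fill, bypass, simulate(stream, initial_fill=fill, bypass=bypass))
```

In this model, starting from an empty buffer, both variants hand the same 8kB chunks to the socket (the unpatched one still holds the last message until the next flush), but the bypass variant copies nothing into the buffer. Starting with 100 bytes already buffered, the two variants behave identically.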

On Wed, 31 Jan 2024 at 21:28, Robert Haas <robertmhaas@gmail.com> wrote:
> Personally, I don't think it's likely that anything will get committed
> here without someone doing more legwork than I've seen on the thread
> so far. I don't have any plan to pick up this patch anyway, but if I
> were thinking about it, I would abandon the idea unless I were
> prepared to go test a bunch of stuff myself. I agree with the core
> idea of this work, but not with the idea that the bar is as low as "if
> it can't lose relative to today, it's good enough."

You're right, and I'm open to doing more legwork. I'd also appreciate any suggestions on how to test this properly and which scenarios would be useful to test. That would be really helpful.

I understand that I should provide more and better analysis of this change to show that it (hopefully) hurts nowhere and improves at least some cases, even if not all of them. That may even help us find a better approach than what has been proposed so far. Just to clarify, I don't think anyone here is suggesting that the bar should be at "if it can't lose relative to today, it's good enough". IMHO "a change that improves some cases, but regresses nowhere" does not translate to that.

Thanks,
--
Melih Mutlu
Microsoft
