Re: Parallel Apply

From Andrei Lepikhov
Subject Re: Parallel Apply
Msg-id bc9b3e10-fa02-4b58-9cc8-7ed9d80de496@gmail.com
In response to RE: Parallel Apply  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Responses Re: Parallel Apply
RE: Parallel Apply
List pgsql-hackers
On 18/12/25 07:44, Hayato Kuroda (Fujitsu) wrote:
> Dear Andrei,
> 
>>> I have been spending time for benchmarking the patch set. Here is an updated
>>> report.
>>>
>> I apologise if my question is incorrect. But what about asynchronous
>> replication? Does this method help to reduce lag?
>>
>> My case is a replica located far from the main instance. There are an
>> inevitable lag exists. Do your benchmarks provide any insights into the
>> lag reduction?
> 
> Yes, ideally parallel apply can reduce the lag, but note that it affects after
> changes are reached to the subscriber. It may not be so effective if lag is
> caused by the network. If your transaction is large and you did not enable the
> streaming option, changing it to 'on' or 'parallel' can improve the lag.
> It allows to replicate changes before huge transactions are committed.

Sorry if I was unclear. I want to understand the scope of this
feature: what benefit does the code provide over the current master in
the case of asynchronous logical replication (LR)? Of course, enabling
streaming and parallel apply is a prerequisite: without these settings
your code has no effect, does it?

Let's put transaction sizes aside, since they are usually hard to
predict. We may think about a mix, but it would be enough to benchmark
two corner cases, very short (single-row) and long (let’s say 10% of a
table) transactions, to be sure we have no degradation.
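
To make the two corner cases concrete, here is a minimal sketch that
only builds the SQL text for each transaction shape. The table and
column names are an assumption (pgbench's default schema), and
corner_case_transactions is a hypothetical helper, not something from
the patch set or from the benchmark report.

```python
# Sketch only: pgbench_accounts / aid / abalance come from pgbench's default
# schema and are assumptions; nothing here is part of the patch set.

def corner_case_transactions(n_rows, percent=10,
                             table="pgbench_accounts",
                             key="aid", value="abalance"):
    """Build SQL for the two corner-case transactions.

    'short' updates exactly one row; 'long' updates the first `percent`
    percent of the key range 1..n_rows inside a single transaction.
    """
    if n_rows < 1 or not 0 < percent <= 100:
        raise ValueError("need n_rows >= 1 and 0 < percent <= 100")
    # Integer arithmetic keeps the row count exact (no float rounding).
    long_rows = max(1, n_rows * percent // 100)
    update = f"UPDATE {table} SET {value} = {value} + 1 WHERE {key}"
    return {
        "short": f"BEGIN; {update} = 1; COMMIT;",
        "long": f"BEGIN; {update} BETWEEN 1 AND {long_rows}; COMMIT;",
    }


if __name__ == "__main__":
    for name, sql in corner_case_transactions(n_rows=100_000).items():
        print(f"-- {name}\n{sql}")
```

The short case always hits the same key here; a real run would probably
randomise it so that concurrent clients do not serialise on one row.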

I just wonder whether the main use case for this approach is synchronous
commit with a good-enough network. Is that correct?

> 
>> Or the WALsender process that decodes WAL records from a
>> hundred actively committing backends, a bottleneck here?
> 
> Can you clarify your use case bit more? E.g., how many instances subscribe the
> change from the same publisher. The cheat sheet [1] may be helpful to distinguish
> the bottleneck.

I have two cases in mind (for simplicity, let’s imagine we have only one
publisher-subscriber pair):

1. We have a low-latency network. If we add more and more load to the
main instance, which process will be the first bottleneck: walsender or
subscriber?
2. We have a stable load, and walsender copes with the WAL decoding and
fills the output socket with transactions. If latency gets worse (a
geographically distributed configuration), can we profit from these new
changes in the parallel apply feature, provided the network bandwidth is
wide enough?

-- 
regards, Andrei Lepikhov,
pgEdge


