Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

From SATYANARAYANA NARLAPURAM
Subject Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes
Msg-id CAHg+QDc+8SDf-oJPsQtUT=HCgtNVoytzCyjD80mMiDw1=-6D+w@mail.gmail.com
In reply to Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers


On Wed, Dec 29, 2021 at 11:16 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

On Wed, Dec 29, 2021 at 14:04 SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> wrote:
Stephen, thank you!

On Wed, Dec 29, 2021 at 5:46 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* SATYANARAYANA NARLAPURAM (satyanarlapuram@gmail.com) wrote:
> On Sat, Dec 25, 2021 at 9:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > On Sun, Dec 26, 2021 at 10:36 AM SATYANARAYANA NARLAPURAM <
> > satyanarlapuram@gmail.com> wrote:
> >>> Actually, all WAL insertions are done inside a critical section
> >>> (with a few exceptions). If you look at the references to
> >>> XLogInsert(), it is always called inside a critical section, and that is my
> >>> main worry about hooking at the XLogInsert level.
> >>>
> >>
> >> Got it, understood the concern. But can we document the limitations of
> >> the hook and let the hook take care of it? I don't expect an error to be
> >> thrown here since we are not planning to allocate memory or make file
> >> system calls but instead look at the shared memory state and add delays
> >> when required.
> >>
> >>
> > Yet another problem is that if we are in XLogInsert(), we are
> > holding the buffer locks on all the pages we have modified, so if we add a
> > hook at that level which can make it wait, we would also block any
> > read operations that need to read from those buffers.  I haven't thought
> > about what a better way to do this could be, but this is certainly not good.
> >
>
> Yes, this is a problem. The other approach is adding a hook at
> XLogWrite/XLogFlush? All the other backends will be waiting behind the
> WALWriteLock. The process that is performing the write enters into a busy
> loop with small delays until the criteria are met. Inability to process the
> interrupts inside the critical section is a challenge in both approaches.
> Any other thoughts?

Why not have this work the exact same way sync replicas do, except that
it's based off of some byte/time lag for some set of async replicas?
That is, in RecordTransactionCommit(), perhaps right after the
SyncRepWaitForLSN() call, or maybe even add this to that function?  Sure
seems like there's a lot of similarity.
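
To make the suggestion above concrete, here is a minimal, self-contained sketch of the decision a committing backend would have to make: given the commit LSN and the flush positions reported by a set of async replicas, should the commit keep waiting? This is not PostgreSQL code. `WalPos`, `replica_lag_bytes`, `commit_must_wait`, and the `required` quorum parameter are hypothetical names invented for illustration, and the policy (wait until at least `required` replicas are within the byte limit) is one possible reading of "some set of async replicas", not something the thread settles.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position: a 64-bit byte offset. */
typedef uint64_t WalPos;

/* Bytes by which one replica trails commit_lsn; 0 if it has caught up. */
uint64_t replica_lag_bytes(WalPos commit_lsn, WalPos replica_flush_lsn)
{
    return commit_lsn > replica_flush_lsn ? commit_lsn - replica_flush_lsn : 0;
}

/*
 * Assumed policy: the commit may proceed once at least `required`
 * replicas are within max_lag_bytes of commit_lsn. Returns true while
 * the commit should keep waiting.
 */
bool commit_must_wait(WalPos commit_lsn, const WalPos *flush_lsns,
                      size_t nreplicas, uint64_t max_lag_bytes,
                      size_t required)
{
    size_t within_limit = 0;

    for (size_t i = 0; i < nreplicas; i++)
    {
        if (replica_lag_bytes(commit_lsn, flush_lsns[i]) <= max_lag_bytes)
            within_limit++;
    }
    return within_limit < required;
}
```

Because the check would sit at commit time, outside the WAL-insert critical section, the waiting backend could still process interrupts, which is the property the XLogInsert-level hook lacks.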

I was thinking of achieving log governance (throttling WAL MB/sec) and also providing RPO guarantees. In this model, it is hard to throttle the WAL generation of a long-running transaction (for example, COPY or SELECT INTO).
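
As an illustration of what "throttling WAL MB/sec" could mean arithmetically, the sketch below computes how long a writer would have to pause so that its average rate over a measurement window stays at or below a limit. The function name `wal_throttle_delay_us` and its parameters are hypothetical, and where such a check could safely be called is exactly the open question in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: microseconds to sleep so that bytes_written
 * over the window averages at most limit_bytes_per_sec.
 * A limit of 0 is treated as "throttling disabled" (an assumption).
 * The multiplication is safe for windows below roughly 18 TB of WAL.
 */
uint64_t wal_throttle_delay_us(uint64_t bytes_written,
                               uint64_t elapsed_us,
                               uint64_t limit_bytes_per_sec)
{
    uint64_t min_window_us;

    if (limit_bytes_per_sec == 0)
        return 0;

    /* Shortest window in which this much WAL is allowed. */
    min_window_us = bytes_written * 1000000 / limit_bytes_per_sec;

    return min_window_us > elapsed_us ? min_window_us - elapsed_us : 0;
}
```

For example, writing 16 MiB in half a second against a 16 MiB/s limit yields a 500000-microsecond pause, while a writer already under the limit gets a delay of 0.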

Long-running transactions have a lot of downsides and are best discouraged. I don’t know that we should be designing this for that case specifically, particularly given the complications it would introduce, as already discussed on this thread.

However, this meets my RPO needs. Are you in support of adding a hook or the actual change? IMHO, the hook allows more creative options. I can go ahead and make a patch accordingly.

I would think this would make more sense as part of core rather than a hook, as that then requires an extension and additional setup to get going, which raises the bar quite a bit when it comes to actually being used.

Sounds good, I will work on making the changes accordingly.

Thanks,

Stephen
