Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5
Дата
Msg-id CAD21AoBaiMiAMLF-daEyB43hLbWA6fMmWWToGDMyp9V3kp149w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Logical replication 'invalid memory alloc request size 1585837200' after upgrading to 17.5  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-bugs
On Wed, Jun 4, 2025 at 11:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 5, 2025 at 3:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jun 3, 2025 at 11:48 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, 4 Jun 2025 at 01:14, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for updating the patch. I have some comments and questions:
> >
> > In ReorderBufferAbort():
> >
> >         /*
> >          * We might have decoded changes for this transaction that could load
> >          * the cache as per the current transaction's view (consider DDL's
> >          * happened in this transaction). We don't want the decoding of future
> >          * transactions to use those cache entries so execute invalidations.
> >          */
> >         if (txn->ninvalidations > 0)
> >             ReorderBufferImmediateInvalidation(rb, txn->ninvalidations,
> >                                                txn->invalidations);
> >
> > I think that if the txn->invalidations_distributed is overflowed, we
> > would miss executing the txn->invalidations here. Probably the same is
> > true for ReorderBufferForget() and ReorderBufferInvalidate().
> >
>
> This is because of the following check "if
> (!rbtxn_inval_overflowed(txn))" in function
> ReorderBufferAddInvalidations(). What is the need of such a check in
> this function? We don't need to execute distributed invalidations in
> cases like ReorderBufferForget() when we haven't decoded any changes.



>
> > ---
> > I'd like to make it clear again which case we need to execute
> > txn->invalidations as well as txn->invalidations_distributed (like in
> > ReorderBufferProcessTXN()) and which case we need to execute only
> > txn->invalidations (like in ReorderBufferForget() and
> > ReorderBufferAbort()). I think it might be worth putting some comments
> > about overall strategy somewhere.
> >
> > ---
> > BTW for back branches, a simple fix without ABI breakage would be to
> > introduce the RBTXN_INVAL_OVERFLOWED flag to limit the size of
> > txn->invalidations. That is, we accumulate inval messages both coming
> > from the current transaction and distributed by other transactions but
> > once the size reaches the threshold we invalidate all caches. Is it
> > worth considering for back branches?
> >
>
> It should work and is worth considering. The main concern would be
> that it will hit sooner than we expect in the field, seeing the recent
> reports. So, such a change has the potential to degrade the
> performance. I feel that the number of people impacted due to
> performance would be more than the number of people impacted due to
> such an ABI change (adding the new members at the end of
> ReorderBufferTXN).

That's a fair point. I initially assumed that DDLs were not executed
often in practice, but analyzing this bug has made me realize this
assumption was misguided.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



В списке pgsql-bugs по дате отправления: