Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Дата
Msg-id CAA4eK1KyrP6m53FQV5v0gysEnaTHPdc2xYT0KeNN+fhOcedRwA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
On Wed, Mar 4, 2020 at 9:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 3:16 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
> >
> >
> > The first thing I realized that WAL-logging of assignments in v12 does
> > both the "old" logging (using dedicated message) and "new" with
> > toplevel-XID embedded in the first message. Yes, the patch was wrong,
> > because it eliminated all calls to ProcArrayApplyXidAssignment() and so
> > it was trivial to crash the replica due to KnownAssignedXids overflow.
> > But I don't think re-introducing XLOG_XACT_ASSIGNMENT message is the
> > right fix.
> >
> > I actually proposed doing this (having both ways to log assignments) so
> > that there's no regression risk with (wal_level < logical). But IIRC
> > Andres objected to it, argumenting that we should not log the same piece
> > of information in two very different ways at the same time (IIRC it was
> > discussed on the FOSDEM dev meeting, so I don't have a link to share).
> > And I do agree with him ...
> >
> > The question is, why couldn't the replica use the same assignment info
> > we already write for logical decoding? The main challenge is that now
> > the assignment can be sent in many different xlog messages, from a bunch
> > of resource managers (essentially, any xlog message with a xid can have
> > embedded XID of the toplevel xact). So the handling would either need to
> > happen in every rmgr, or we need to move it before we call the rmgr.
> >
> > For exampple, we might do this e.g. in StartupXLOG() I think, per the
> > attached patch (FWIW this particular fix was written by Masahiko Sawada,
> > not me). This does the trick for me - I'm no longer able to reproduce
> > the KnownAssignedXids overflow.
> >
> > The one difference is that we used to call ProcArrayApplyXidAssignment
> > for larger groups of XIDs, as sent in the assignment message. Now we
> > call it for each individual assignment. I don't know if this is an
> > issue, but I suppose we might introduce some sort of local caching
> > (accumulate the assignments into a local array, call the function only
> > when we have enough of them).
>
> Thanks for the pointers,  I will think over these points.
>

I have looked at the solution proposed and I would like to share my
findings.  I think calling ProcArrayApplyXidAssignment for each
subtransaction is not a good idea for a couple of reasons:
(a) It will just beat the purpose of maintaining KnowAssignedXids
array which is to avoid looking at pg_subtrans in
TransactionIdIsInProgress() on standby.  Basically, if we remove it
for each subXid, it will consider the KnowAssignedXids to be
overflowed and check pg_subtrans frequently.
(b)  Calling ProcArrayApplyXidAssignment() for each subtransaction can
be costly from the perspective of concurrency because it acquires
ProcArrayLock in Exclusive mode, so concurrently running transactions
might start blocking at this lock.  Also, I see that
SubTransSetParent() makes the page dirty, so it might lead to more
writes if we spread out setting that by calling it separately for each
sub-transaction.

Apart from this, I don't see how the proposed fix is correct because
as far as I can see it tries to remove the Xid before we even record
it via RecordKnownAssignedTransactionIds().  It seems after patch
RecordKnownAssignedTransactionIds() will be called after
ProcArrayApplyXidAssignment(), how could that be correct.

Thoughts?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Berserk Autovacuum (let's save next Mandrill)
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions