Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Дата
Msg-id CAA4eK1JpRrES8EAqJtBQfvo-RapA2pSnLJ-SeS+3GS39zNG0xA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Amit Kapila <amit.kapila16@gmail.com>)
Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
On Sun, May 17, 2020 at 12:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, May 15, 2020 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Review comments:
> > ------------------------------
> > 1.
> > @@ -1762,10 +1952,16 @@ ReorderBufferCommit(ReorderBuffer *rb,
> > TransactionId xid,
> >   }
> >
> >   case REORDER_BUFFER_CHANGE_MESSAGE:
> > - rb->message(rb, txn, change->lsn, true,
> > - change->data.msg.prefix,
> > - change->data.msg.message_size,
> > - change->data.msg.message);
> > + if (streaming)
> > + rb->stream_message(rb, txn, change->lsn, true,
> > +    change->data.msg.prefix,
> > +    change->data.msg.message_size,
> > +    change->data.msg.message);
> > + else
> > + rb->message(rb, txn, change->lsn, true,
> > +    change->data.msg.prefix,
> > +    change->data.msg.message_size,
> > +    change->data.msg.message);
> >
> > Don't we need to set any_data_sent flag while streaming messages as we
> > do for other types of changes?
>
> I think any_data_sent, was added to avoid sending abort to the
> subscriber if we haven't sent any data,  but this is not complete as
> the output plugin can also take the decision not to send.  So I think
> this should not be done as part of this patch and can be done
> separately.  I think there is already a thread for handling the
> same[1]
>

Hmm, but prior to this patch, we never use to send (empty) aborts but
now that will be possible. It is probably okay to deal that with
another patch mentioned by you but I felt at least any_data_sent will
work for some cases.  OTOH, it appears to be half-baked solution, so
we should probably refrain from adding it.  BTW, how do the pgoutput
plugin deal with it? I see that apply_handle_stream_abort will
unconditionally try to unlink the file and it will probably fail.
Have you tested this scenario after your latest changes?

>
> > 4.
> > In ReorderBufferProcessTXN(), the patch is calling stream_stop in both
> > the try and catch block.  If there is an error after calling it in a
> > try block, we might call it again via catch.  I think that will lead
> > to sending a stop message twice.  Won't that be a problem?  See the
> > usage of iterstate in the catch block, we have made it safe from a
> > similar problem.
>
> IMHO, we don't need that, because we only call stream_stop in the
> catch block if the error type is ERRCODE_TRANSACTION_ROLLBACK.  So if
> in TRY block we have already stopped the stream then we should not get
> that error.  I have added the comments for the same.
>

I am still slightly nervous about it as I don't see any solid
guarantee for the same.  You are right as the code stands today but
due to any code that gets added in the future, it might not remain
true. I feel it is better to have an Assert here to ensure that
stream_stop won't be called the second time.  I don't see any good way
of doing it other than by maintaining flag or some state but I think
it will be good to ensure this.

>
> > 6.
> > PG_CATCH();
> >   {
> > + MemoryContext ecxt = MemoryContextSwitchTo(ccxt);
> > + ErrorData  *errdata = CopyErrorData();
> >
> > I don't understand the usage of memory context in this part of the
> > code.  Basically, you are switching to CurrentMemoryContext here, do
> > some error handling and then again reset back to some random context
> > before rethrowing the error.  If there is some purpose for it, then it
> > might be better if you can write a few comments to explain the same.
>
> Basically, the ccxt is the CurrentMemoryContext when we started the
> streaming and ecxt it the context when we catch the error.  So
> ideally, before this change, it will rethrow in the context when we
> catch the error i.e. ecxt.  So what we are trying to do is put it back
> to normal context (ccxt) and copy the error data in the normal
> context.  And, if we are not handling it gracefully then put it back
> to the context it was in, and rethrow.
>

Okay, but when errorcode is *not* ERRCODE_TRANSACTION_ROLLBACK, don't
we need to clean up the reorderbuffer by calling
ReorderBufferCleanupTXN?  If so, then you can try to combine it with
the not-streaming else loop.

>
> > 8.
> > @@ -2295,6 +2677,13 @@ ReorderBufferXidSetCatalogChanges(ReorderBuffer
> > *rb, TransactionId xid,
> >   txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> >
> >   txn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> > +
> > + /*
> > + * TOCHECK: Mark toplevel transaction as having catalog changes too
> > + * if one of its children has.
> > + */
> > + if (txn->toptxn != NULL)
> > + txn->toptxn->txn_flags |= RBTXN_HAS_CATALOG_CHANGES;
> >  }
> >
> > Why are we marking top transaction here?
>
> We need to mark top transaction to decide whether to build tuplecid
> hash or not.  In non-streaming mode, we are only sending during the
> commit time, and during commit time we know whether the top
> transaction has any catalog changes or not based on the invalidation
> message so we are marking the top transaction there in DecodeCommit.
> Since here we are not waiting till commit so we need to mark the top
> transaction as soon as we mark any of its child transactions.
>

But how does it help?  We use this flag (via
ReorderBufferXidHasCatalogChanges) in SnapBuildCommitTxn which is
anyway done in DecodeCommit and that too after setting this flag for
the top transaction if required.  So, how will it help in setting it
while processing for subxid.  Also, even if we have to do it won't it
add the xid needlessly in builder->committed.xip array?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: [PATCH] Add support to psql for edit-and-execute-command
Следующее
От: Richard Guo
Дата:
Сообщение: Re: Optimizer docs typos