Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date:
Msg-id: CAA4eK1LGmZsevrqJra0V4O8oBU_eKyzm2VMpSAYQaDgC6n4fkA@mail.gmail.com
In reply to: Re: Perform streaming logical transactions by background workers and parallel apply  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: Perform streaming logical transactions by background workers and parallel apply  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Fri, Jan 6, 2023 at 11:24 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jan 6, 2023 at 9:37 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Thursday, January 5, 2023 7:54 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > Thanks, I have started another thread[1]
> >
> > Attach the parallel apply patch set here again. I didn't change the patch set,
> > attach it here just to let the CFbot keep testing it.
>
> I have completed the review and some basic testing and it mostly looks
> fine to me.  Here is my last set of comments/suggestions.
>
> 1.
>     /*
>      * Don't start a new parallel worker if user has set skiplsn as it's
>      * possible that they want to skip the streaming transaction. For
>      * streaming transactions, we need to serialize the transaction to a file
>      * so that we can get the last LSN of the transaction to judge whether to
>      * skip before starting to apply the change.
>      */
>     if (!XLogRecPtrIsInvalid(MySubscription->skiplsn))
>         return false;
>
>
> I think it is fine to block parallelism in this case, but we could also
> make it less restrictive: only if the first lsn of the transaction is
> <= skiplsn is it possible that the final_lsn matches skiplsn; otherwise
> a match is impossible, and in that case we could allow parallelism.
>
> I understand that we currently do not have the first_lsn of the
> transaction in the stream_start message, but adding it should be easy.
> Although I am not sure it is worth the cost, it would be good to at
> least note the possibility in a comment.
>

Yeah, I also don't think sending an extra eight bytes with the
stream_start message is worth it. But it is fine to mention this in the
comments.

> 2.
>
> +     * XXX Additionally, we also stop the worker if the leader apply worker
> +     * serialize part of the transaction data due to a send timeout. This is
> +     * because the message could be partially written to the queue and there
> +     * is no way to clean the queue other than resending the message until it
> +     * succeeds. Instead of trying to send the data which anyway would have
> +     * been serialized and then letting the parallel apply worker deal with
> +     * the spurious message, we stop the worker.
> +     */
> +    if (winfo->serialize_changes ||
> +        list_length(ParallelApplyWorkerPool) >
> +        (max_parallel_apply_workers_per_subscription / 2))
>
> IMHO this reason (XXX Additionally, we also stop the worker if the
> leader apply worker serialized part of the transaction data due to a
> send timeout) for stopping the worker looks a bit hackish to me. It
> may be a rare case, so I am not concerned about performance, but the
> reasoning behind stopping is not good. Ideally, we should be able to
> clean up the message queue and reuse the worker.
>

TBH, I don't know of a better way to deal with this given the current
infrastructure. I think we can take this up as a separate enhancement
in the future.

> 3.
> +        else if (shmq_res == SHM_MQ_WOULD_BLOCK)
> +        {
> +            /* Replay the changes from the file, if any. */
> +            if (pa_has_spooled_message_pending())
> +            {
> +                pa_spooled_messages();
> +            }
>
> I think we do not need this pa_has_spooled_message_pending() function.
> It just calls pa_get_fileset_state(), which acquires the mutex and
> fetches the file state; if the state is not FS_EMPTY we then call
> pa_spooled_messages(), which calls pa_get_fileset_state() again and
> acquires the mutex a second time. So when the state is
> FS_SERIALIZE_IN_PROGRESS we frequently call pa_get_fileset_state()
> twice in a row, and I think we can easily achieve the same behavior
> with a single call.
>

This is just to keep the code easy to follow. Since this is a rare
case, I preferred code clarity.

-- 
With Regards,
Amit Kapila.


