Re: Speed up Clog Access by increasing CLOG buffers

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Speed up Clog Access by increasing CLOG buffers
Дата
Msg-id CAA4eK1JwMRcpDMYE72dAByVTM0Z95zaU-1-UiqW5X+-wwopd7A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Speed up Clog Access by increasing CLOG buffers  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Speed up Clog Access by increasing CLOG buffers  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Thu, Feb 11, 2016 at 8:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On the substantive part of the patch, this doesn't look safe:
>
> +    /*
> +     * Add ourselves to the list of processes needing a group XID status
> +     * update.
> +     */
> +    proc->clogGroupMember = true;
> +    proc->clogGroupMemberXid = xid;
> +    proc->clogGroupMemberXidStatus = status;
> +    proc->clogGroupMemberPage = pageno;
> +    proc->clogGroupMemberLsn = lsn;
> +    while (true)
> +    {
> +        nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);
> +
> +        /*
> +         * Add the proc to list if the clog page where we need to update the
> +         * current transaction status is same as group leader's clog page.
> +         */
> +        if (nextidx != INVALID_PGPROCNO &&
> +            ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
> proc->clogGroupMemberPage)
> +            return false;
>
> DANGER HERE!
>
> +        pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
> +
> +        if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
> +                                           &nextidx,
> +                                           (uint32) proc->pgprocno))
> +            break;
> +    }
>
> There is a potential ABA problem here.  Suppose that this code
> executes in one process as far as the line that says DANGER HERE.
> Then, the group leader wakes up, performs all of the CLOG
> modifications, performs another write transaction, and again becomes
> the group leader, but for a different member page.  Then, the original
> process that went to sleep at DANGER HERE wakes up.  At this point,
> the pg_atomic_compare_exchange_u32 will succeed and we'll have
> processes with different pages in the list, contrary to the intention
> of the code.
>

Very Good Catch.  I think if we want to address this we can detect
the non-group leader transactions that tries to update the different
CLOG page (different from group-leader) after acquiring
CLogControlLock and then mark these transactions such that
after waking they need to perform CLOG update via normal path.
Now this can decrease the latency of such transactions, but I
think there will be only very few transactions if at-all there which
can face this condition, because most of the concurrent transactions
should be on same page, otherwise the idea of multiple-slots we
have tried upthread would have shown benefits.
Another idea could be that we update the comments indicating the
possibility of multiple Clog-page updates in same group on the basis
that such cases will be less and even if it happens, it won't effect the
transaction status update.

Do you have anything else in mind?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: custom function for converting human readable sizes to bytes