Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Дата
Msg-id CAA4eK1KcqrRyVEvn6PBsR=MeY1C8NUO0o3jmgOWhZEUOvRkaig@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
Список pgsql-hackers
On Tue, Jan 16, 2018 at 6:24 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Jan 12, 2018 at 10:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> More comments:
>
> Attached patch has all open issues worked through, including those
> that I respond to or comment on below, as well as the other feedback
> from your previous e-mails. Note also that I fixed the issue that Amit
> raised,
>

I could still reproduce it. I think the way you have fixed it has a
race condition.  In _bt_parallel_scan_and_sort(), the value of
brokenhotchain is set after you signal the leader that the worker is
done (by incrementing workersFinished). Now, the leader is free to
decide based on the current shared state which can give the wrong
value.  Similarly, I think the value of havedead and reltuples can
also be wrong.

You neither seem to have fixed nor responded to the second problem
mentioned in my email upthread [1].  To reiterate, the problem is that
we can't assume that the workers we have launched will always start
and finish. It is possible that postmaster fails to start the worker
due to fork failure. In such conditions, tuplesort_leader_wait will
hang indefinitely because it will wait for the workersFinished count
to become equal to launched workers (+1, if leader participates) which
will never happen.  Am I missing something due to which this won't be
a problem?

Now, I think one argument is that such a problem can happen in a
parallel query, so it is not the responsibility of this patch to solve
it.  However, we already have a patch (there are some review comments
that needs to be addressed in the proposed patch) to solve it and this
patch is adding a new path in the code which has similar symptoms
which can't be fixed with the already proposed patch.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BizMyxzFD6k81Deyar35YJ5qdpbRTUp9cQvo%2BniQom7Q%40mail.gmail.com


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: Setting BLCKSZ 4kB
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Proposal: Local indexes for partitioned table