Re: INSERT INTO SELECT, Why Parallelism is not selected?

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: INSERT INTO SELECT, Why Parallelism is not selected?
Дата
Msg-id CAA4eK1L3Dca1zmPptcYzqZUa8qfARxYkpZrRfas19vpyEaHBFQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: INSERT INTO SELECT, Why Parallelism is not selected?  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: INSERT INTO SELECT, Why Parallelism is not selected?  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Wed, Jul 29, 2020 at 7:18 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> I still don't agree with this as proposed.
>
> + * For now, we don't allow parallel inserts of any form not even where the
> + * leader can perform the insert.  This restriction can be uplifted once
> + * we allow the planner to generate parallel plans for inserts.  We can
>
> If I'm understanding this correctly, this logic is completely
> backwards. We don't prohibit inserts here because we know the planner
> can't generate them. We prohibit inserts here because, if the planner
> somehow did generate them, it wouldn't be safe. You're saying that
> it's not allowed because we don't try to do it yet, but actually it's
> not allowed because we want to make sure that we don't accidentally
> try to do it. That's very different.
>

Right, so how about something like: "To allow parallel inserts, we
need to ensure that they are safe to be performed in workers.  We have
the infrastructure to allow parallel inserts in general except for the
case where inserts generate a new commandid (eg. inserts into a table
having a foreign key column)."  We can extend this for tuple locking
if required as per the below discussion.  Kindly suggest if you prefer
a different wording here.

>
> + * We should be able to parallelize
> + * the later case if we can ensure that no two parallel processes can ever
> + * operate on the same page.
>
> I don't know whether this is talking about two processes operating on
> the same page at the same time, or ever within a single query
> execution. If it's the former, perhaps we need to explain why that's a
> concern for parallel query but not otherwise;
>

I am talking about the former case and I know that as per current
design it is not possible that two worker processes try to operate on
the same page but I was trying to be pessimistic so that we can ensure
that via some form of Assert.  I don't know whether it is important to
mention this case or not but for the sake of extra safety, I have
mentioned it.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kasahara Tatsuhito
Дата:
Сообщение: Re: Creating a function for exposing memory usage of backend process
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Doc patch: mention indexes in pg_inherits docs