RE: Parallel INSERT SELECT take 2

From houzj.fnst@fujitsu.com
Subject RE: Parallel INSERT SELECT take 2
Date
Msg-id OS0PR01MB57163256280A17FC0FEDEACA94429@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: Parallel INSERT SELECT take 2  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Parallel INSERT SELECT take 2
Re: Parallel INSERT SELECT take 2
List pgsql-hackers
> > Based on above, we plan to move forward with the approach 2) (declarative
> idea).
> 
> IIUC, the declarative behaviour idea attributes parallel safe/unsafe/restricted
> tags to each table with default being the unsafe. Does it mean for a parallel
> unsafe table, no parallel selects, inserts (may be updates) will be picked up? Or
> is it only the parallel inserts? If both parallel inserts, selects will be picked, then
> the existing tables need to be adjusted to set the parallel safety tags while
> migrating?

Thanks for looking into this.

The parallel attribute on a table describes its parallel safety when a user performs data-modification operations on it.
So it only limits the use of parallel plans for INSERT/UPDATE/DELETE. A plain SELECT on the table is not affected.

> Another point, what does it mean a table being parallel restricted?
> What should happen if it is present in a query of other parallel safe tables?

If a table is parallel restricted, it means the table contains some parallel-restricted objects (such as
parallel-restricted functions in index expressions).

In the planner, it means a parallel insert plan will not be chosen, but the query can still use a parallel select (with a serial insert).
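
As a rough illustration of the rule above (a toy model in Python, not code or syntax from the patch), the mapping from a table's tag to the allowed plan shape could look like the sketch below. The names `ParallelDml`, `PlanShape`, and both functions are hypothetical. The behaviour for an unsafe target table (no parallelism for the whole statement) is an assumption based on the usual meaning of parallel unsafe.

```python
from enum import Enum
from typing import NamedTuple


class ParallelDml(Enum):
    """Hypothetical per-table tag; names are illustrative, not patch syntax."""
    SAFE = "safe"
    RESTRICTED = "restricted"
    UNSAFE = "unsafe"  # the proposed default


class PlanShape(NamedTuple):
    parallel_select: bool  # may the SELECT part run in parallel workers?
    parallel_insert: bool  # may the INSERT part run in parallel workers?


def plan_for_insert_select(target_tag: ParallelDml) -> PlanShape:
    """Most parallel plan shape allowed for INSERT ... SELECT into a table."""
    if target_tag is ParallelDml.SAFE:
        return PlanShape(parallel_select=True, parallel_insert=True)
    if target_tag is ParallelDml.RESTRICTED:
        # Stated in this mail: parallel select with a serial insert.
        return PlanShape(parallel_select=True, parallel_insert=False)
    # Assumption: an unsafe target disables parallelism for the statement.
    return PlanShape(parallel_select=False, parallel_insert=False)


def plan_for_plain_select(table_tag: ParallelDml) -> PlanShape:
    """The tag describes DML safety only, so a plain SELECT ignores it."""
    return PlanShape(parallel_select=True, parallel_insert=False)
```

For example, `plan_for_insert_select(ParallelDml.RESTRICTED)` allows the parallel select but not the parallel insert.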

> I may be wrong here: IIUC, the main problem we are trying to solve with the
> declarative approach is to let the user decide parallel safety for partition tables
> as it may be costlier for postgres to determine it. And for the normal tables we
> can perform parallel safety checks without incurring much cost. So, I think we
> should restrict the declarative approach to only partitioned tables?

Yes, we are trying to avoid the overhead of checking parallel safety.
The cost of checking the parallel safety of all the partitions is the biggest one.
Another is the safety check of index expressions.
Currently, for INSERT, the planner does not open the target table's indexinfo and does not
parse the index expressions. To do a parallel safety check on them, we would need to parse the
expressions in the planner, which can bring some overhead (the executor will open the index and do the parse
again).
So we plan to skip all of the extra checks and let the user take responsibility for the safety.

Of course, maybe we can try to pass the indexinfo to the executor, but that needs some further refactoring and I will take a
look into it.
 

> While reading the design, I came across this "erroring out during execution of a
> query when a parallel unsafe function is detected". If this is correct, isn't it
> warranting users to run pg_get_parallel_safety to know the parallel unsafe
> objects, set parallel safety to all of them if possible, otherwise disable
> parallelism to run the query? Isn't this burdensome? 

How about this:
If parallel-unsafe objects are detected in the executor, alter the table to parallel unsafe internally,
so the user does not need to alter it manually.
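
To make the idea concrete, here is a minimal toy model in Python (not PostgreSQL code). The dictionary `tags` stands in for the catalog, and `run_insert` and `ParallelUnsafeDetected` are hypothetical names. It assumes the statement that hits the unsafe object still errors out, as in the current design, and only later statements benefit from the downgraded tag.

```python
class ParallelUnsafeDetected(Exception):
    """Raised when the toy executor meets a parallel-unsafe object."""


def run_insert(tags: dict, table: str, hits_unsafe_object: bool) -> str:
    """Toy executor: return "parallel" or "serial" for one INSERT."""
    tag = tags.get(table, "unsafe")  # unsafe is the proposed default
    if tag == "unsafe":
        # Serial execution may call parallel-unsafe objects freely.
        return "serial"
    if hits_unsafe_object:
        # Downgrade the tag internally so the user need not ALTER manually.
        tags[table] = "unsafe"
        # Assumption: the current statement still fails.
        raise ParallelUnsafeDetected(table)
    return "parallel"


tags = {"orders": "safe"}  # user declared the table safe, wrongly
try:
    run_insert(tags, "orders", hits_unsafe_object=True)
except ParallelUnsafeDetected:
    pass  # first attempt errors out and flips the tag
second_attempt = run_insert(tags, "orders", hits_unsafe_object=True)
```

After the first failure the tag is "unsafe", so the second attempt runs serially without user action.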

> Instead, how about
> postgres retries the query upon detecting the error that came from a parallel
> unsafe function during execution, disable parallelism and run the query? I think
> this kind of retry query feature can be built outside of the core postgres, but
> IMO it will be good to have inside (of course configurable). IIRC, the Teradata
> database has a Query Retry feature.
> 

Thanks for the suggestion.
The query retry feature sounds like a good idea to me.
OTOH, it sounds more like an independent feature that parallel select could also benefit from.
I think maybe we can try to achieve it after we commit the parallel insert?

Best regards,
houzj
