Обсуждение: Allow parallel plan for referential integrity checks?

Поиск
Список
Период
Сортировка

Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:
Hello,

I noticed that referential integrity checks aren't currently 
parallelized. Is it on purpose?

 From the documentation [1], the planner will not generate a parallel 
plan for a given query if any of the following are true:

1) The system is running in single-user mode.
2) max_parallel_workers_per_gather = 0.
3) The query writes any data or locks any database rows.
4) The query might be suspended during execution (cursor or  PL/pgSQL loop).
5) The query uses any function marked PARALLEL UNSAFE.
6) The query is running inside of another query that is already parallel.

 From my understanding of the code, it seems to me that the fourth 
condition is not always accurately detected. In particular, the query 
triggered by a foreign key addition (a Left Excluding JOIN between the 
two tables) isn't parallelized, because the flag CURSOR_OPT_PARALLEL_OK 
hasn't been set at some point. I couldn't find any reason why not, but 
maybe I missed something.

Attached is a (naive) patch that aims to fix the case of a FK addition, 
but the handling of the flag CURSOR_OPT_PARALLEL_OK, generally speaking, 
looks rather hackish.

Best regards,
Frédéric

[1] 
https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html
Вложения

Re: Allow parallel plan for referential integrity checks?

От
Andreas Karlsson
Дата:
On 2/7/22 11:26, Frédéric Yhuel wrote:
> Attached is a (naive) patch that aims to fix the case of a FK addition, 
> but the handling of the flag CURSOR_OPT_PARALLEL_OK, generally speaking, 
> looks rather hackish.

Thanks, for the patch. You can add it to the current open commitfest 
(https://commitfest.postgresql.org/37/).

Best regards,
Andreas



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 2/11/22 00:16, Andreas Karlsson wrote:
> On 2/7/22 11:26, Frédéric Yhuel wrote:
>> Attached is a (naive) patch that aims to fix the case of a FK 
>> addition, but the handling of the flag CURSOR_OPT_PARALLEL_OK, 
>> generally speaking, looks rather hackish.
> 
> Thanks, for the patch. You can add it to the current open commitfest 
> (https://commitfest.postgresql.org/37/).
> 

OK, I just did. Please let me know if I did something wrong.

Best regards,
Frédéric



Re: Allow parallel plan for referential integrity checks?

От
Robert Haas
Дата:
On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
> I noticed that referential integrity checks aren't currently
> parallelized. Is it on purpose?

It's not 100% clear to me that it is safe. But on the other hand, it's
also not 100% clear to me that it is unsafe.

Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
confident that nothing bad would happen, and this wasn't one of those
places. It's something of a nested-query environment -- your criterion
#6. How do we know that the surrounding query isn't already parallel?
Perhaps because it must be DML, but I think it must be more important
to support parallel DML than to support this.

I'm not sure what the right thing to do here is, but I think it would
be good if your patch included a test case.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:
Hello, sorry for the late reply.

On 2/14/22 15:33, Robert Haas wrote:
> On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
>> I noticed that referential integrity checks aren't currently
>> parallelized. Is it on purpose?
> 
> It's not 100% clear to me that it is safe. But on the other hand, it's
> also not 100% clear to me that it is unsafe.
> 
> Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
> confident that nothing bad would happen, and this wasn't one of those
> places. It's something of a nested-query environment -- your criterion
> #6. How do we know that the surrounding query isn't already parallel?
> Perhaps because it must be DML,

Did you mean DDL?

> but I think it must be more important
> to support parallel DML than to support this.


> I'm not sure what the right thing to do here is, but I think it would
> be good if your patch included a test case.
> 

Would that be a regression test? (in src/test/regress ?)

If yes, what should I check? Is it good enough to load auto_explain and 
check that the query triggered by a foreign key addition is parallelized?

Best regards,
Frédéric



Re: Allow parallel plan for referential integrity checks?

От
"Imseih (AWS), Sami"
Дата:
I looked at your patch and it's a good idea to make foreign key validation
use parallel query on large relations.

It would be valuable to add logging to ensure that the ActiveSnapshot and TransactionSnapshot 
is the same for the leader and the workers. This logging could be tested in the TAP test.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to 
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also require
a doc change to call out.

/*
     * Temporarily increase work_mem so that the check query can be executed
     * more efficiently.  It seems okay to do this because the query is simple
     * enough to not use a multiple of work_mem, and one typically would not
     * have many large foreign-key validations happening concurrently.  So
     * this seems to meet the criteria for being considered a "maintenance"
     * operation, and accordingly we use maintenance_work_mem.  However, we






Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 3/19/22 01:57, Imseih (AWS), Sami wrote:
> I looked at your patch and it's a good idea to make foreign key validation
> use parallel query on large relations.
> 
> It would be valuable to add logging to ensure that the ActiveSnapshot and TransactionSnapshot
> is the same for the leader and the workers. This logging could be tested in the TAP test.
> 
> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> max_parallel_maintenance_workers.
> 
> Currently the work_mem is set to maintenance_work_mem. This will also require
> a doc change to call out.
> 
> /*
>       * Temporarily increase work_mem so that the check query can be executed
>       * more efficiently.  It seems okay to do this because the query is simple
>       * enough to not use a multiple of work_mem, and one typically would not
>       * have many large foreign-key validations happening concurrently.  So
>       * this seems to meet the criteria for being considered a "maintenance"
>       * operation, and accordingly we use maintenance_work_mem.  However, we
> 

Hello Sami,

Thank you for your review!

I will try to do as you say, but it will take time, since my daily job 
as database consultant takes most of my time and energy.

Best regards,
Frédéric



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 4/14/22 14:25, Frédéric Yhuel wrote:
> 
> 
> On 3/19/22 01:57, Imseih (AWS), Sami wrote:
>> I looked at your patch and it's a good idea to make foreign key 
>> validation
>> use parallel query on large relations.
>>
>> It would be valuable to add logging to ensure that the ActiveSnapshot 
>> and TransactionSnapshot
>> is the same for the leader and the workers. This logging could be 
>> tested in the TAP test.
>>
>> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
>> max_parallel_maintenance_workers.
>>
>> Currently the work_mem is set to maintenance_work_mem. This will also 
>> require
>> a doc change to call out.
>>
>> /*
>>       * Temporarily increase work_mem so that the check query can be 
>> executed
>>       * more efficiently.  It seems okay to do this because the query 
>> is simple
>>       * enough to not use a multiple of work_mem, and one typically 
>> would not
>>       * have many large foreign-key validations happening 
>> concurrently.  So
>>       * this seems to meet the criteria for being considered a 
>> "maintenance"
>>       * operation, and accordingly we use maintenance_work_mem.  
>> However, we
>>
> 
> Hello Sami,
> 
> Thank you for your review!
> 
> I will try to do as you say, but it will take time, since my daily job 
> as database consultant takes most of my time and energy.
> 

Hi,

As suggested by Jacob, here is a quick message to say that I didn't find 
enough time to work further on this patch, but I didn't completely 
forget it either. I moved it to the next commitfest. Hopefully I will 
find enough time and motivation in the coming months :-)

Best regards,
Frédéric



Re: Allow parallel plan for referential integrity checks?

От
Ian Lawrence Barwick
Дата:
2022年7月26日(火) 20:58 Frédéric Yhuel <frederic.yhuel@dalibo.com>:
>
>
>
> On 4/14/22 14:25, Frédéric Yhuel wrote:
> >
> >
> > On 3/19/22 01:57, Imseih (AWS), Sami wrote:
> >> I looked at your patch and it's a good idea to make foreign key
> >> validation
> >> use parallel query on large relations.
> >>
> >> It would be valuable to add logging to ensure that the ActiveSnapshot
> >> and TransactionSnapshot
> >> is the same for the leader and the workers. This logging could be
> >> tested in the TAP test.
> >>
> >> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> >> max_parallel_maintenance_workers.
> >>
> >> Currently the work_mem is set to maintenance_work_mem. This will also
> >> require
> >> a doc change to call out.
> >>
> >> /*
> >>       * Temporarily increase work_mem so that the check query can be
> >> executed
> >>       * more efficiently.  It seems okay to do this because the query
> >> is simple
> >>       * enough to not use a multiple of work_mem, and one typically
> >> would not
> >>       * have many large foreign-key validations happening
> >> concurrently.  So
> >>       * this seems to meet the criteria for being considered a
> >> "maintenance"
> >>       * operation, and accordingly we use maintenance_work_mem.
> >> However, we
> >>
> >
> > Hello Sami,
> >
> > Thank you for your review!
> >
> > I will try to do as you say, but it will take time, since my daily job
> > as database consultant takes most of my time and energy.
> >
>
> Hi,
>
> As suggested by Jacob, here is a quick message to say that I didn't find
> enough time to work further on this patch, but I didn't completely
> forget it either. I moved it to the next commitfest. Hopefully I will
> find enough time and motivation in the coming months :-)

Hi Frédéric

This patch has been carried forward for a couple more commitfests since
your message; do you think you'll be able to work on it further during this
release cycle?

Thanks

Ian Barwick



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 12/11/22 06:29, Ian Lawrence Barwick wrote:
> 2022年7月26日(火) 20:58 Frédéric Yhuel <frederic.yhuel@dalibo.com>:
>>
>>
>>
>> On 4/14/22 14:25, Frédéric Yhuel wrote:
>>>
>>>
>>> On 3/19/22 01:57, Imseih (AWS), Sami wrote:
>>>> I looked at your patch and it's a good idea to make foreign key
>>>> validation
>>>> use parallel query on large relations.
>>>>
>>>> It would be valuable to add logging to ensure that the ActiveSnapshot
>>>> and TransactionSnapshot
>>>> is the same for the leader and the workers. This logging could be
>>>> tested in the TAP test.
>>>>
>>>> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
>>>> max_parallel_maintenance_workers.
>>>>
>>>> Currently the work_mem is set to maintenance_work_mem. This will also
>>>> require
>>>> a doc change to call out.
>>>>
>>>> /*
>>>>        * Temporarily increase work_mem so that the check query can be
>>>> executed
>>>>        * more efficiently.  It seems okay to do this because the query
>>>> is simple
>>>>        * enough to not use a multiple of work_mem, and one typically
>>>> would not
>>>>        * have many large foreign-key validations happening
>>>> concurrently.  So
>>>>        * this seems to meet the criteria for being considered a
>>>> "maintenance"
>>>>        * operation, and accordingly we use maintenance_work_mem.
>>>> However, we
>>>>
>>>
>>> Hello Sami,
>>>
>>> Thank you for your review!
>>>
>>> I will try to do as you say, but it will take time, since my daily job
>>> as database consultant takes most of my time and energy.
>>>
>>
>> Hi,
>>
>> As suggested by Jacob, here is a quick message to say that I didn't find
>> enough time to work further on this patch, but I didn't completely
>> forget it either. I moved it to the next commitfest. Hopefully I will
>> find enough time and motivation in the coming months :-)
> 
> Hi Frédéric
> 
> This patch has been carried forward for a couple more commitfests since
> your message; do you think you'll be able to work on it further during this
> release cycle?
> 

Hi Ian,

I've planned to work on it full time on week 10 (6-10 March), if you 
agree to bear with me. The idea would be to bootstrap my brain on it, 
and then continue to work on it from time to time.

Best regards,
Frédéric



Re: Allow parallel plan for referential integrity checks?

От
vignesh C
Дата:
On Mon, 12 Dec 2022 at 22:06, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
>
>
>
> On 12/11/22 06:29, Ian Lawrence Barwick wrote:
> > 2022年7月26日(火) 20:58 Frédéric Yhuel <frederic.yhuel@dalibo.com>:
> >>
> >>
> >>
> >> On 4/14/22 14:25, Frédéric Yhuel wrote:
> >>>
> >>>
> >>> On 3/19/22 01:57, Imseih (AWS), Sami wrote:
> >>>> I looked at your patch and it's a good idea to make foreign key
> >>>> validation
> >>>> use parallel query on large relations.
> >>>>
> >>>> It would be valuable to add logging to ensure that the ActiveSnapshot
> >>>> and TransactionSnapshot
> >>>> is the same for the leader and the workers. This logging could be
> >>>> tested in the TAP test.
> >>>>
> >>>> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> >>>> max_parallel_maintenance_workers.
> >>>>
> >>>> Currently the work_mem is set to maintenance_work_mem. This will also
> >>>> require
> >>>> a doc change to call out.
> >>>>
> >>>> /*
> >>>>        * Temporarily increase work_mem so that the check query can be
> >>>> executed
> >>>>        * more efficiently.  It seems okay to do this because the query
> >>>> is simple
> >>>>        * enough to not use a multiple of work_mem, and one typically
> >>>> would not
> >>>>        * have many large foreign-key validations happening
> >>>> concurrently.  So
> >>>>        * this seems to meet the criteria for being considered a
> >>>> "maintenance"
> >>>>        * operation, and accordingly we use maintenance_work_mem.
> >>>> However, we
> >>>>
> >>>
> >>> Hello Sami,
> >>>
> >>> Thank you for your review!
> >>>
> >>> I will try to do as you say, but it will take time, since my daily job
> >>> as database consultant takes most of my time and energy.
> >>>
> >>
> >> Hi,
> >>
> >> As suggested by Jacob, here is a quick message to say that I didn't find
> >> enough time to work further on this patch, but I didn't completely
> >> forget it either. I moved it to the next commitfest. Hopefully I will
> >> find enough time and motivation in the coming months :-)
> >
> > Hi Frédéric
> >
> > This patch has been carried forward for a couple more commitfests since
> > your message; do you think you'll be able to work on it further during this
> > release cycle?
> >
>
> Hi Ian,
>
> I've planned to work on it full time on week 10 (6-10 March), if you
> agree to bear with me. The idea would be to bootstrap my brain on it,
> and then continue to work on it from time to time.

I have moved this to Mar commitfest as the patch update is expected to
happen during March commitfest.

Regards,
Vignesh



Re: Allow parallel plan for referential integrity checks?

От
"Gregory Stark (as CFM)"
Дата:
On Mon, 12 Dec 2022 at 11:37, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
>
> I've planned to work on it full time on week 10 (6-10 March), if you
> agree to bear with me. The idea would be to bootstrap my brain on it,
> and then continue to work on it from time to time.

Any updates on this patch? Should we move it to next release at this
point? Even if you get time to work on it this commitfest do you think
it's likely to be committable in the next few weeks?

--
Gregory Stark
As Commitfest Manager



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 3/20/23 15:58, Gregory Stark (as CFM) wrote:
> On Mon, 12 Dec 2022 at 11:37, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
>>
>> I've planned to work on it full time on week 10 (6-10 March), if you
>> agree to bear with me. The idea would be to bootstrap my brain on it,
>> and then continue to work on it from time to time.
> 
> Any updates on this patch? 

I had the opportunity to discuss it with Melanie Plageman when she came 
to Paris last month and teach us (Dalibo's folk) about the patch review 
process.

She advised me to provide a good test case first, and to explain better 
how it would be useful, which I'm going to do.

> Should we move it to next release at this
> point? Even if you get time to work on it this commitfest do you think
> it's likely to be committable in the next few weeks?
> 

It is very unlikely. Maybe it's better to remove it from CF and put it 
back later if the test case I will provide does a better job at 
convincing the Postgres folks that RI checks should be parallelized.



Re: Allow parallel plan for referential integrity checks?

От
Daniel Gustafsson
Дата:
> On 20 Mar 2023, at 16:48, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
> On 3/20/23 15:58, Gregory Stark (as CFM) wrote:

>> Should we move it to next release at this
>> point? Even if you get time to work on it this commitfest do you think
>> it's likely to be committable in the next few weeks?
>
> It is very unlikely. Maybe it's better to remove it from CF and put it back later if the test case I will provide
doesa better job at convincing the Postgres folks that RI checks should be parallelized. 

As there is no new patch submitted I will go ahead and do that, please feel
free to resubmit when there is renewed interest in working on this.

--
Daniel Gustafsson




Re: Allow parallel plan for referential integrity checks?

От
Juan José Santamaría Flecha
Дата:

On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson <daniel@yesql.se> wrote:

As there is no new patch submitted I will go ahead and do that, please feel
free to resubmit when there is renewed interest in working on this.


Recently I restored a database from a directory format backup and having this feature would have been quite useful. So, I would like to resume the work on this patch, unless OP or someone else is already on it.

Regards,

Juan José Santamaría Flecha

Re: Allow parallel plan for referential integrity checks?

От
Juan José Santamaría Flecha
Дата:

On Thu, Aug 10, 2023 at 5:06 PM Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> wrote:
On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson <daniel@yesql.se> wrote:
As there is no new patch submitted I will go ahead and do that, please feel
free to resubmit when there is renewed interest in working on this.

Recently I restored a database from a directory format backup and having this feature would have been quite useful. So, I would like to resume the work on this patch, unless OP or someone else is already on it.

Hearing no one advising me otherwise I'll pick up the work on the patch. First, I'll try to address all previous comments.

On Mon, Feb 14, 2022 at 3:34 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
> I noticed that referential integrity checks aren't currently
> parallelized. Is it on purpose?

It's not 100% clear to me that it is safe. But on the other hand, it's
also not 100% clear to me that it is unsafe.

Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
confident that nothing bad would happen, and this wasn't one of those
places. It's something of a nested-query environment -- your criterion
#6. How do we know that the surrounding query isn't already parallel?
Perhaps because it must be DML, but I think it must be more important
to support parallel DML than to support this.

Currently I don't see a scenario where the validation might run inside another parallel query, but I could be missing something. Suggestions on some edge cases to test are more than welcome.

We already have a case where parallel operations can occur when adding a table constraint, and that's the index creation for a primary key. Take for example:

postgres=# SET max_parallel_maintenance_workers TO 4;
postgres=# SET parallel_setup_cost TO 0;
postgres=# SET parallel_tuple_cost TO 0;
postgres=# SET parallel_leader_participation TO 0;
postgres=# SET min_parallel_table_scan_size TO 0;
postgres=# SET client_min_messages TO debug1;
postgres=# CREATE TABLE parallel_pk_table (a int) WITH (autovacuum_enabled = off);
postgres=# ALTER TABLE parallel_pk_table ADD PRIMARY KEY (a);
DEBUG:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "parallel_pk_table_pkey" for table "parallel_pk_table"
DEBUG:  building index "parallel_pk_table_pkey" on table "parallel_pk_table" with request for 1 parallel workers
...
DEBUG:  index "parallel_pk_table_pkey" can safely use deduplication
DEBUG:  verifying table "parallel_pk_table"

This should be better documented. Maybe "parallel index build" [1] could have its own section, so it can be cited from other sections?
 
I'm not sure what the right thing to do here is, but I think it would
be good if your patch included a test case.

I have included a basic test for parallel ADD PRIMARY KEY and ADD FOREIGN KEY.

On Sat, Mar 19, 2022 at 1:57 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:
It would be valuable to add logging to ensure that the ActiveSnapshot and TransactionSnapshot
is the same for the leader and the workers. This logging could be tested in the TAP test.

That machinery is used in any parallel query, I'm not sure this patch should be responsible for carrying that requirement.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also require
a doc change to call out.

Done.

PFA the patch.


Regards,

Juan José Santamaría Flecha
Вложения

Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
> Recently I restored a database from a directory format backup and having 
> this feature would have been quite useful

Hi,

Thanks for resuming work on this patch. I forgot to mention this in my 
original email, but the motivation was also to speed up the restore 
process. Parallelizing the FK checks could make a huge difference in 
certain cases. We should probably provide such a test case (with perf 
numbers), and maybe this is it what Robert asked for.



Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 8/17/23 09:32, Frédéric Yhuel wrote:
> 
> 
> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>> Recently I restored a database from a directory format backup and 
>> having this feature would have been quite useful
> 
> Hi,
> 
> Thanks for resuming work on this patch. I forgot to mention this in my 
> original email, but the motivation was also to speed up the restore 
> process. Parallelizing the FK checks could make a huge difference in 
> certain cases. We should probably provide such a test case (with perf 
> numbers), and maybe this is it what Robert asked for.

I have attached two scripts which demonstrate the following problems:

1a. if the tables aren't analyzed nor vacuumed before the post-data 
step, then they are index-only scanned, with a lot of heap fetches 
(depending on their size, the planner sometimes chooses a seq scan instead).

1b. if the tables have been analyzed but not vacuumed before the 
post-data-step, then they are scanned sequentially. Usually better, but 
still not so good without a parallel plan.

2. if the visibility maps have been created, then the tables are 
index-only scanned without heap fetches, but this can still be slower 
than a parallel seq scan.

So it would be nice if pg_restore could vacuum analyze the tables before 
the post-data step. I believe it would be faster in most cases.

And it would be nice to allow a parallel plan for RI checks.

Best regards,
Frédéric
Вложения

Re: Allow parallel plan for referential integrity checks?

От
Frédéric Yhuel
Дата:

On 8/17/23 14:00, Frédéric Yhuel wrote:
> 
> 
> On 8/17/23 09:32, Frédéric Yhuel wrote:
>>
>>
>> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>>> Recently I restored a database from a directory format backup and 
>>> having this feature would have been quite useful
>>
>> Hi,
>>
>> Thanks for resuming work on this patch. I forgot to mention this in my 
>> original email, but the motivation was also to speed up the restore 
>> process. Parallelizing the FK checks could make a huge difference in 
>> certain cases. We should probably provide such a test case (with perf 
>> numbers), and maybe this is it what Robert asked for.
> 
> I have attached two scripts which demonstrate the following problems:
> 

Let me add the plans for more clarity:

> 1a. if the tables aren't analyzed nor vacuumed before the post-data 
> step, then they are index-only scanned, with a lot of heap fetches 
> (depending on their size, the planner sometimes chooses a seq scan 
> instead).
> 

https://explain.dalibo.com/plan/7491ga22c5293683#raw

> 1b. if the tables have been analyzed but not vacuumed before the 
> post-data-step, then they are scanned sequentially. Usually better, but 
> still not so good without a parallel plan.
> 

https://explain.dalibo.com/plan/e92c5161g880bdhf#raw

> 2. if the visibility maps have been created, then the tables are 
> index-only scanned without heap fetches, but this can still be slower 
> than a parallel seq scan.
> 

https://explain.dalibo.com/plan/de22bdb4ggc3dffg#raw

And at last, with the patch applied : 
https://explain.dalibo.com/plan/54612a09ffh2565a#raw



Re: Allow parallel plan for referential integrity checks?

От
Juan José Santamaría Flecha
Дата:

On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
On 8/17/23 14:00, Frédéric Yhuel wrote:
> On 8/17/23 09:32, Frédéric Yhuel wrote:
>> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>>> Recently I restored a database from a directory format backup and
>>> having this feature would have been quite useful
>>
>> Thanks for resuming work on this patch. I forgot to mention this in my
>> original email, but the motivation was also to speed up the restore
>> process. Parallelizing the FK checks could make a huge difference in
>> certain cases. We should probably provide such a test case (with perf
>> numbers), and maybe this is it what Robert asked for.
>
> I have attached two scripts which demonstrate the following problems:

Thanks for the scripts, but I think Robert's concerns come from the safety, and not the performance, of the parallel operation.

Proving its vulnerability could be easy with a counter example, but assuring its safety is trickier. What test would suffice to do that?

Regards,

Juan José Santamaría Flecha

Re: Allow parallel plan for referential integrity checks?

От
vignesh C
Дата:
On Fri, 18 Aug 2023 at 16:29, Juan José Santamaría Flecha
<juanjo.santamaria@gmail.com> wrote:
>
>
> On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
>>
>> On 8/17/23 14:00, Frédéric Yhuel wrote:
>> > On 8/17/23 09:32, Frédéric Yhuel wrote:
>> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>> >>> Recently I restored a database from a directory format backup and
>> >>> having this feature would have been quite useful
>> >>
>> >> Thanks for resuming work on this patch. I forgot to mention this in my
>> >> original email, but the motivation was also to speed up the restore
>> >> process. Parallelizing the FK checks could make a huge difference in
>> >> certain cases. We should probably provide such a test case (with perf
>> >> numbers), and maybe this is it what Robert asked for.
>> >
>> > I have attached two scripts which demonstrate the following problems:
>
>
> Thanks for the scripts, but I think Robert's concerns come from the safety, and not the performance, of the parallel
operation.
>
> Proving its vulnerability could be easy with a counter example, but assuring its safety is trickier. What test would
sufficeto do that? 

I'm seeing that there has been no activity in this thread for more
than 5 months, I'm planning to close this in the current commitfest
unless someone is planning to take it forward. It can be opened again
when there is more interest.

Regards,
Vignesh



Re: Allow parallel plan for referential integrity checks?

От
vignesh C
Дата:
On Sun, 21 Jan 2024 at 07:56, vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 18 Aug 2023 at 16:29, Juan José Santamaría Flecha
> <juanjo.santamaria@gmail.com> wrote:
> >
> >
> > On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:
> >>
> >> On 8/17/23 14:00, Frédéric Yhuel wrote:
> >> > On 8/17/23 09:32, Frédéric Yhuel wrote:
> >> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
> >> >>> Recently I restored a database from a directory format backup and
> >> >>> having this feature would have been quite useful
> >> >>
> >> >> Thanks for resuming work on this patch. I forgot to mention this in my
> >> >> original email, but the motivation was also to speed up the restore
> >> >> process. Parallelizing the FK checks could make a huge difference in
> >> >> certain cases. We should probably provide such a test case (with perf
> >> >> numbers), and maybe this is it what Robert asked for.
> >> >
> >> > I have attached two scripts which demonstrate the following problems:
> >
> >
> > Thanks for the scripts, but I think Robert's concerns come from the safety, and not the performance, of the
paralleloperation. 
> >
> > Proving its vulnerability could be easy with a counter example, but assuring its safety is trickier. What test
wouldsuffice to do that? 
>
> I'm seeing that there has been no activity in this thread for more
> than 5 months, I'm planning to close this in the current commitfest
> unless someone is planning to take it forward. It can be opened again
> when there is more interest.

Since the author or no one else showed interest in taking it forward
and the patch had no activity for more than 5 months, I have changed
the status to RWF. Feel free to add a new CF entry when someone is
planning to resume work more actively.

Regards,
Vignesh