Обсуждение: Prepared Statement support for Parallel query

Поиск
Список
Период
Сортировка

Prepared Statement support for Parallel query

От
Amit Kapila
Дата:
Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d has added
the support for passing bind parameters to parallel workers, however
prepared statement which uses bind parameters wasn't enabled
for parallel query as the 'Execute' message in FE-BE protocol
can pass the row_count which can make parallel plans unusable.
(parallel plans are only possible when query can run to completion)

Later Commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129 has
ensure that if the row_count is non-zero then we won't enter
parallel mode which means that even if parallel plan is selected
by optimizer, it will run such a plan locally.

With above support, it was just a matter of enabling parallel
mode for prepared statements which is done in attached patch
(prepared_stmt_parallel_query_v1.patch).

I have tested that parallel plans are getting generated both
via Prepare/Execute statements and libpq prepared
statement execution.  Attached is a libpq program
(prepare_parallel_query.c) which I have used for testing prepared
statement support.  I have done the verification manually
(using auto_explain) to ensure that parallel plans gets generated
and executed via this libpq program.  This program expects some
data to be generated before-hand and the information of same is
added in file-header.

Thoughts?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Вложения

Re: Prepared Statement support for Parallel query

От
Robert Haas
Дата:
On Wed, Feb 17, 2016 at 6:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d has added
> the support for passing bind parameters to parallel workers, however
> prepared statement which uses bind parameters wasn't enabled
> for parallel query as the 'Execute' message in FE-BE protocol
> can pass the row_count which can make parallel plans unusable.
> (parallel plans are only possible when query can run to completion)
>
> Later Commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129 has
> ensure that if the row_count is non-zero then we won't enter
> parallel mode which means that even if parallel plan is selected
> by optimizer, it will run such a plan locally.
>
> With above support, it was just a matter of enabling parallel
> mode for prepared statements which is done in attached patch
> (prepared_stmt_parallel_query_v1.patch).
>
> I have tested that parallel plans are getting generated both
> via Prepare/Execute statements and libpq prepared
> statement execution.  Attached is a libpq program
> (prepare_parallel_query.c) which I have used for testing prepared
> statement support.  I have done the verification manually
> (using auto_explain) to ensure that parallel plans gets generated
> and executed via this libpq program.  This program expects some
> data to be generated before-hand and the information of same is
> added in file-header.

Hmm.   I agree we should change exec_parse_message like this, but
changing PrepareQuery seems wrong.  I mean, there's a very good chance
that a parse message will be followed by an Execute message with a
zero row count, so we'll get parallel execution.  But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Prepared Statement support for Parallel query

От
Robert Haas
Дата:
On Wed, Feb 17, 2016 at 6:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d has added
> the support for passing bind parameters to parallel workers, however
> prepared statement which uses bind parameters wasn't enabled
> for parallel query as the 'Execute' message in FE-BE protocol
> can pass the row_count which can make parallel plans unusable.
> (parallel plans are only possible when query can run to completion)
>
> Later Commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129 has
> ensure that if the row_count is non-zero then we won't enter
> parallel mode which means that even if parallel plan is selected
> by optimizer, it will run such a plan locally.
>
> With above support, it was just a matter of enabling parallel
> mode for prepared statements which is done in attached patch
> (prepared_stmt_parallel_query_v1.patch).
>
> I have tested that parallel plans are getting generated both
> via Prepare/Execute statements and libpq prepared
> statement execution.  Attached is a libpq program
> (prepare_parallel_query.c) which I have used for testing prepared
> statement support.  I have done the verification manually
> (using auto_explain) to ensure that parallel plans gets generated
> and executed via this libpq program.  This program expects some
> data to be generated before-hand and the information of same is
> added in file-header.

Hmm.   I agree we should change exec_parse_message like this, but
changing PrepareQuery seems wrong.  I mean, there's a very good chance
that a parse message will be followed by an Execute message with a
zero row count, so we'll get parallel execution.  But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Prepared Statement support for Parallel query

От
Amit Kapila
Дата:
On Wed, Feb 24, 2016 at 7:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Feb 17, 2016 at 6:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d has added
> the support for passing bind parameters to parallel workers, however
> prepared statement which uses bind parameters wasn't enabled
> for parallel query as the 'Execute' message in FE-BE protocol
> can pass the row_count which can make parallel plans unusable.
> (parallel plans are only possible when query can run to completion)
>
> Later Commit bfc78d7196eb28cd4e3d6c24f7e607bacecf1129 has
> ensure that if the row_count is non-zero then we won't enter
> parallel mode which means that even if parallel plan is selected
> by optimizer, it will run such a plan locally.
>
> With above support, it was just a matter of enabling parallel
> mode for prepared statements which is done in attached patch
> (prepared_stmt_parallel_query_v1.patch).
>
> I have tested that parallel plans are getting generated both
> via Prepare/Execute statements and libpq prepared
> statement execution.  Attached is a libpq program
> (prepare_parallel_query.c) which I have used for testing prepared
> statement support.  I have done the verification manually
> (using auto_explain) to ensure that parallel plans gets generated
> and executed via this libpq program.  This program expects some
> data to be generated before-hand and the information of same is
> added in file-header.

Hmm.   I agree we should change exec_parse_message like this, but
changing PrepareQuery seems wrong.  I mean, there's a very good chance
that a parse message will be followed by an Execute message with a
zero row count, so we'll get parallel execution.

Yes and I think libpq doesn't even provide a interface to specify row
count for prepared statements.
 
  But if the user says
they want to PREPARE the query, they are probably not going to fetch
all rows.


After PREPARE, user will execute the statement using EXECUTE and
I don't see how user can decide number of rows to fetch which can
influence the execution.  Can you please elaborate your point more
and what is your expectation for the same?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Prepared Statement support for Parallel query

От
Robert Haas
Дата:
On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>   But if the user says
>> they want to PREPARE the query, they are probably not going to fetch
>> all rows.
>
> After PREPARE, user will execute the statement using EXECUTE and
> I don't see how user can decide number of rows to fetch which can
> influence the execution.  Can you please elaborate your point more
> and what is your expectation for the same?

Argh.  I'm getting confused between prepared statements and cursors.
So if the user does PREPARE followed by EXECUTE, then that is OK.  The
problem is only if they use DECLARE .. CURSOR FOR, which your patch
doesn't affect.

So, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Prepared Statement support for Parallel query

От
Robert Haas
Дата:
On Thu, Feb 25, 2016 at 1:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>   But if the user says
>>> they want to PREPARE the query, they are probably not going to fetch
>>> all rows.
>>
>> After PREPARE, user will execute the statement using EXECUTE and
>> I don't see how user can decide number of rows to fetch which can
>> influence the execution.  Can you please elaborate your point more
>> and what is your expectation for the same?
>
> Argh.  I'm getting confused between prepared statements and cursors.
> So if the user does PREPARE followed by EXECUTE, then that is OK.  The
> problem is only if they use DECLARE .. CURSOR FOR, which your patch
> doesn't affect.
>
> So, committed.

And, I'm going to revert this part.  If you'd run the regression tests
under force_parallel_mode=regress, max_parallel_degree>0, you would
have noticed that this part breaks it, because of CREATE TABLE ... AS
EXECUTE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Prepared Statement support for Parallel query

От
Amit Kapila
Дата:
On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Feb 25, 2016 at 1:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Feb 25, 2016 at 8:53 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>>   But if the user says
> >>> they want to PREPARE the query, they are probably not going to fetch
> >>> all rows.
> >>
> >> After PREPARE, user will execute the statement using EXECUTE and
> >> I don't see how user can decide number of rows to fetch which can
> >> influence the execution.  Can you please elaborate your point more
> >> and what is your expectation for the same?
> >
> > Argh.  I'm getting confused between prepared statements and cursors.
> > So if the user does PREPARE followed by EXECUTE, then that is OK.  The
> > problem is only if they use DECLARE .. CURSOR FOR, which your patch
> > doesn't affect.
> >
> > So, committed.
>
> And, I'm going to revert this part.  If you'd run the regression tests
> under force_parallel_mode=regress, max_parallel_degree>0, you would
> have noticed that this part breaks it, because of CREATE TABLE ... AS
> EXECUTE.
>

I will look into it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Prepared Statement support for Parallel query

От
Amit Kapila
Дата:
On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>
> And, I'm going to revert this part.  If you'd run the regression tests
> under force_parallel_mode=regress, max_parallel_degree>0, you would
> have noticed that this part breaks it, because of CREATE TABLE ... AS
> EXECUTE.
>

I have looked into this issue and found that the reason for the failure is that in force_parallel_mode=regress, we enable parallel mode restrictions even if the parallel plan is not choosen as part of below code in standard_planner()

if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)

{

glob->parallelModeNeeded = false;

glob->wholePlanParallelSafe = false; /* either false or don't care */

}

else

{

glob->parallelModeNeeded = true;

glob->wholePlanParallelSafe =

!has_parallel_hazard((Node *) parse, false);

}


The failure cases fall into that category, basically wholePlanParallelSafe will be false, but parallelModeNeeded will be true which will enable parallel mode restrictions even though the plan won't contain Gather node.  I think if we want to operate in above way for testing purpose, then we need to force during execution that statements for non read-only operations should not enter into parallel mode similar to what we are doing for non-zero tuple count case.  Attached patch fixes the problem.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Вложения

Re: Prepared Statement support for Parallel query

От
Robert Haas
Дата:
On Mon, Mar 14, 2016 at 9:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> And, I'm going to revert this part.  If you'd run the regression tests
>> under force_parallel_mode=regress, max_parallel_degree>0, you would
>> have noticed that this part breaks it, because of CREATE TABLE ... AS
>> EXECUTE.
>>
>
> I have looked into this issue and found that the reason for the failure is
> that in force_parallel_mode=regress, we enable parallel mode restrictions
> even if the parallel plan is not choosen as part of below code in
> standard_planner()
>
> if (force_parallel_mode == FORCE_PARALLEL_OFF || !glob->parallelModeOK)
>
> {
>
> glob->parallelModeNeeded = false;
>
> glob->wholePlanParallelSafe = false; /* either false or don't care */
>
> }
>
> else
>
> {
>
> glob->parallelModeNeeded = true;
>
> glob->wholePlanParallelSafe =
>
> !has_parallel_hazard((Node *) parse, false);
>
> }
>
>
> The failure cases fall into that category, basically wholePlanParallelSafe
> will be false, but parallelModeNeeded will be true which will enable
> parallel mode restrictions even though the plan won't contain Gather node.
> I think if we want to operate in above way for testing purpose, then we need
> to force during execution that statements for non read-only operations
> should not enter into parallel mode similar to what we are doing for
> non-zero tuple count case.  Attached patch fixes the problem.

This seems like a really ugly fix.  It might be possible to come up
with a fix along these lines, but I don't have much confidence in the
specific new test you've injected into executePlan().  Broadly, any
change of this type implicitly changes the contract between
executePlan() and the planner infrastructure - the planner can now
legally generate parallel plans in some cases where that would
previously have been unacceptable.  But I'm not in a hurry to rethink
where we've drawn the line there for 9.6.  Let's punt this issue for
now and come back to it in a future release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Prepared Statement support for Parallel query

От
Amit Kapila
Дата:
On Tue, Mar 15, 2016 at 12:21 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Mar 14, 2016 at 9:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Feb 26, 2016 at 4:37 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> >
> > The failure cases fall into that category, basically wholePlanParallelSafe
> > will be false, but parallelModeNeeded will be true which will enable
> > parallel mode restrictions even though the plan won't contain Gather node.
> > I think if we want to operate in above way for testing purpose, then we need
> > to force during execution that statements for non read-only operations
> > should not enter into parallel mode similar to what we are doing for
> > non-zero tuple count case.  Attached patch fixes the problem.
>
> This seems like a really ugly fix.  It might be possible to come up
> with a fix along these lines, but I don't have much confidence in the
> specific new test you've injected into executePlan().  Broadly, any
> change of this type implicitly changes the contract between
> executePlan() and the planner infrastructure - the planner can now
> legally generate parallel plans in some cases where that would
> previously have been unacceptable.  But I'm not in a hurry to rethink
> where we've drawn the line there for 9.6.  Let's punt this issue for
> now and come back to it in a future release.
>

No issues.  I felt that it might be good to support parallel query via Prepare statement as there is no fundamental issue in the same, but as you say, we can do that in future release as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com