Обсуждение: [HACKERS] A GUC to prevent leader processes from running subplans?

Поиск
Список
Период
Сортировка

[HACKERS] A GUC to prevent leader processes from running subplans?

От
Thomas Munro
Дата:
Hi hackers,

While testing parallelism work I've wanted to be able to prevent
gather nodes from running the plan in the leader process, and I've
heard others say the same.  One way would be to add a GUC
"multiplex_gather", like in the attached patch.  If you set it to off,
Gather and Gather Merge won't run the subplan unless they have to
because no workers could be launched.  I thought about adding a new
value for force_parallel_mode instead, but someone mentioned they
might want to do this on a production system too and
force_parallel_mode is not really for end users.  Better ideas?

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Robert Haas
Дата:
On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> While testing parallelism work I've wanted to be able to prevent
> gather nodes from running the plan in the leader process, and I've
> heard others say the same.  One way would be to add a GUC
> "multiplex_gather", like in the attached patch.  If you set it to off,
> Gather and Gather Merge won't run the subplan unless they have to
> because no workers could be launched.  I thought about adding a new
> value for force_parallel_mode instead, but someone mentioned they
> might want to do this on a production system too and
> force_parallel_mode is not really for end users.  Better ideas?

I don't think overloading force_parallel_mode is a good idea, but
having some other GUC for this seems OK to me.  Not sure I like
multiplex_gather, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Thomas Munro
Дата:
On Sat, Oct 21, 2017 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> While testing parallelism work I've wanted to be able to prevent
>> gather nodes from running the plan in the leader process, and I've
>> heard others say the same.  One way would be to add a GUC
>> "multiplex_gather", like in the attached patch.  If you set it to off,
>> Gather and Gather Merge won't run the subplan unless they have to
>> because no workers could be launched.  I thought about adding a new
>> value for force_parallel_mode instead, but someone mentioned they
>> might want to do this on a production system too and
>> force_parallel_mode is not really for end users.  Better ideas?
>
> I don't think overloading force_parallel_mode is a good idea, but
> having some other GUC for this seems OK to me.  Not sure I like
> multiplex_gather, though.

How about parallel_leader_participation = on|off?  The attached
version has it that way, and adds regression tests to exercise on, off
and off-but-couldn't-start-any-workers for both kinds of gather node.

I'm not sure why node->need_to_rescan is initialised by both
ExecGatherInit() and ExecGather().  Only the latter's value matters,
right?

I've added this to the January Commitfest.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Amit Kapila
Дата:
On Sun, Nov 12, 2017 at 9:18 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Sat, Oct 21, 2017 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>
>> I don't think overloading force_parallel_mode is a good idea, but
>> having some other GUC for this seems OK to me.  Not sure I like
>> multiplex_gather, though.
>
> How about parallel_leader_participation = on|off?  The attached
> version has it that way, and adds regression tests to exercise on, off
> and off-but-couldn't-start-any-workers for both kinds of gather node.
>
> I'm not sure why node->need_to_rescan is initialised by both
> ExecGatherInit() and ExecGather().  Only the latter's value matters,
> right?
>

I don't see anything like need_to_rescan in the GatherState node.  Do
you intend to say need_to_scan_locally?  If yes, then I think whatever
you said is right.


> I've added this to the January Commitfest.
>

+1 to this idea.  Do you think such an option at table level can be
meaningful?  We have a parallel_workers as a storage option for
tables, so users might want leader to participate in parallelism only
for some of the tables.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Thomas Munro
Дата:
On Sun, Nov 12, 2017 at 8:51 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Nov 12, 2017 at 9:18 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> How about parallel_leader_participation = on|off?  The attached
>> version has it that way, and adds regression tests to exercise on, off
>> and off-but-couldn't-start-any-workers for both kinds of gather node.
>>
>> I'm not sure why node->need_to_rescan is initialised by both
>> ExecGatherInit() and ExecGather().  Only the latter's value matters,
>> right?
>>
>
> I don't see anything like need_to_rescan in the GatherState node.  Do
> you intend to say need_to_scan_locally?  If yes, then I think whatever
> you said is right.

Right, that's what I meant to write.  Thanks.

>> I've added this to the January Commitfest.
>>
>
> +1 to this idea.  Do you think such an option at table level can be
> meaningful?  We have a parallel_workers as a storage option for
> tables, so users might want leader to participate in parallelism only
> for some of the tables.

I'm not sure.  I think the reason for turning it off (other than
developer testing) would be that the leader is getting tied up doing
work that takes a long time (sorting, hashing, aggregating) and that's
causing the workers to be blocked because their output queue is full.
I think that type of behaviour comes from certain plan types, and it
probably wouldn't make sense to associate this behaviour with the
tables you're scanning.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Robert Haas
Дата:
On Sat, Nov 11, 2017 at 10:48 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> How about parallel_leader_participation = on|off?  The attached
> version has it that way, and adds regression tests to exercise on, off
> and off-but-couldn't-start-any-workers for both kinds of gather node.

This looks mostly fine to me, but I think the documentation is strange:

+        Allows the leader process to execute the query plan under
+        <literal>Gather</literal> and <literal>Gather Merge</literal> nodes
+        instead of waiting for worker processes.  The default is
+        <literal>on</literal>.  Setting this value to <literal>on</literal>
+        can cause the leader process to begin producing tuples sooner instead
+        of waiting for worker processes to start up, but might in some cases
+        also cause workers to become blocked waiting for the leader to clear
+        tuple queues.

This documentation would seem exactly right to me if the default value
were off, but as it is it seems kinda backwards, because it's
explaining why you might want to set the value to what is anyway the
default.  Also, it's always possible for the workers to become blocked
waiting for the leader, regardless of how this GUC is set.  It becomes
more likely when this turned off, but that's not quite the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Thomas Munro
Дата:
On Tue, Nov 14, 2017 at 10:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Nov 11, 2017 at 10:48 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> How about parallel_leader_participation = on|off?  The attached
>> version has it that way, and adds regression tests to exercise on, off
>> and off-but-couldn't-start-any-workers for both kinds of gather node.
>
> This looks mostly fine to me, but I think the documentation is strange:
>
> +        Allows the leader process to execute the query plan under
> +        <literal>Gather</literal> and <literal>Gather Merge</literal> nodes
> +        instead of waiting for worker processes.  The default is
> +        <literal>on</literal>.  Setting this value to <literal>on</literal>
> +        can cause the leader process to begin producing tuples sooner instead
> +        of waiting for worker processes to start up, but might in some cases
> +        also cause workers to become blocked waiting for the leader to clear
> +        tuple queues.
>
> This documentation would seem exactly right to me if the default value
> were off, but as it is it seems kinda backwards, because it's
> explaining why you might want to set the value to what is anyway the
> default.  Also, it's always possible for the workers to become blocked
> waiting for the leader, regardless of how this GUC is set.  It becomes
> more likely when this turned off, but that's not quite the same thing.

Thanks.  You're right.  Rebased and updated to describe what "off" does.

-- 
Thomas Munro
http://www.enterprisedb.com

Вложения

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Robert Haas
Дата:
On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Thanks.  You're right.  Rebased and updated to describe what "off" does.

Committed.  I noticed that you didn't add the new GUC to
postgresql.conf.sample, so I did that.  But then I thought it didn't
really belong in the section you put it, so I moved it to
RESOURCES_ASYNCHRONOUS.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] A GUC to prevent leader processes from runningsubplans?

От
Jesper Pedersen
Дата:
On 11/15/2017 08:28 AM, Robert Haas wrote:
> On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Thanks.  You're right.  Rebased and updated to describe what "off" does.
> 
> Committed.  I noticed that you didn't add the new GUC to
> postgresql.conf.sample, so I did that.  But then I thought it didn't
> really belong in the section you put it, so I moved it to
> RESOURCES_ASYNCHRONOUS.
> 

Small typo.

Best regards,
  Jesper


Вложения

Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Robert Haas
Дата:
On Wed, Nov 15, 2017 at 8:35 AM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:
> Small typo.

Thanks.  That just proves no task is so simple that I can't foul it up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] A GUC to prevent leader processes from running subplans?

От
Amit Kapila
Дата:
On Wed, Nov 15, 2017 at 6:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Thanks.  You're right.  Rebased and updated to describe what "off" does.
>
> Committed.
>

Thanks, reflected the same in CF entry
(https://commitfest.postgresql.org/16/1375/).

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com