Обсуждение: Pg10 : Client Configuration for Parallelism ?

Поиск
Список
Период
Сортировка

Pg10 : Client Configuration for Parallelism ?

От
Дата:
Hi,

I am working on PostgreSQL 10.5 and I have a discrepancy between clients regarding parallelism feature.

For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I can see PostgreSQL use parallelism when the query
islaunched from psql or PgAdmin4. However the same query launched with DBeaver (ie connected through JDBC) does not use
parallelism.
 

SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my session.

Is there a client configuration that prevents from using parallelism ?

Thanks.

Laurent


_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


Re: Pg10 : Client Configuration for Parallelism ?

От
Andreas Kretschmer
Дата:

Am 17.04.19 um 08:30 schrieb laurent.dechambe@orange.com:
> SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my session.
>
> Is there a client configuration that prevents from using parallelism ?
unlikely.

if i were you, i would compare all settings, using the different client 
software. (show all, and compare)



Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




Re: Pg10 : Client Configuration for Parallelism ?

От
Thomas Kellerer
Дата:
laurent.dechambe@orange.com schrieb am 17.04.2019 um 08:30:
> I am working on PostgreSQL 10.5 and I have a discrepancy between clients regarding parallelism feature.
> 
> For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I
> can see PostgreSQL use parallelism when the query is launched from
> psql or PgAdmin4. However the same query launched with DBeaver (ie
> connected through JDBC) does not use parallelism.
> 
> SELECT current_setting('max_parallel_workers_per_gather') gives 10
> from my session.
> 
> Is there a client configuration that prevents from using parallelism?

Maybe DBeaver wraps the statement for some reason? (I have seen SQL clients do that)
A CTE would prevent parallelism. 




Sv: Pg10 : Client Configuration for Parallelism ?

От
Andreas Joseph Krogh
Дата:
På onsdag 17. april 2019 kl. 08:30:28, skrev <laurent.dechambe@orange.com>:
Hi,

I am working on PostgreSQL 10.5 and I have a discrepancy between clients regarding parallelism feature.

For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I can see PostgreSQL use parallelism when the query is launched from psql or PgAdmin4. However the same query launched with DBeaver (ie connected through JDBC) does not use parallelism.

SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my session.

Is there a client configuration that prevents from using parallelism ?

Thanks.

Laurent
 
Set in postgresql.conf:
 
log_statement = 'all'
 
reload settings and check the logs for what statemets are acutally issued.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Вложения

RE: Pg10 : Client Configuration for Parallelism ?

От
Дата:
Thanks for the tip. I have compared all settings and they are identical.

Very strange.

-----Message d'origine-----
De : Andreas Kretschmer [mailto:andreas@a-kretschmer.de] 
Envoyé : mercredi 17 avril 2019 10:07
À : pgsql-performance@lists.postgresql.org
Objet : Re: Pg10 : Client Configuration for Parallelism ?



Am 17.04.19 um 08:30 schrieb laurent.dechambe@orange.com:
> SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my session.
>
> Is there a client configuration that prevents from using parallelism ?
unlikely.

if i were you, i would compare all settings, using the different client 
software. (show all, and compare)



Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


RE: Pg10 : Client Configuration for Parallelism ?

От
Дата:

As answered to Andreas Kretschmer all settings are identical.

 

I have made some other tests, even testing a basic jdbc program (open connection, execute statement, display result, close connection)

 

Here are the logs (with log_error_verbosity = verbose) :

 

<DBEAVER>

2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table

2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;00000;LOG:  00000: duration: 25950.908 ms

2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:2031

 

<BASIC JDBC>

2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table

2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;00000;LOG:  00000: duration: 11459.943 ms

2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:2031

 

<PGADMIN4>

2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: statement: SELECT COUNT(1) FROM big_table;

2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  exec_simple_query, postgres.c:940

2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;00000;LOG:  00000: duration: 11334.677 ms

2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;00000;LOG:  00000: statement: SELECT oid, format_type(oid, NULL) AS typname FROM pg_type WHERE oid IN (20) ORDER BY oid;

2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;00000;LOCATION:  exec_simple_query, postgres.c:940

2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;00000;LOG:  00000: duration: 0.900 ms

2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;00000;LOCATION:  exec_simple_query, postgres.c:1170

 

I don’t see any difference a part from the query duration.  Note that while monitoring the server I saw that there was parallelism with JDBC program and PGAdmin4, but not with Dbeaver. And the JDBC driver is the same in both “Basic JDBC” and DBeaver.

 

Regards.

 

Laurent.

 

 

 

De : Andreas Joseph Krogh [mailto:andreas@visena.com]
Envoyé : mercredi 17 avril 2019 11:08
À : pgsql-performance@lists.postgresql.org
Objet : Sv: Pg10 : Client Configuration for Parallelism ?

 

På onsdag 17. april 2019 kl. 08:30:28, skrev <laurent.dechambe@orange.com>:

Hi,

I am working on PostgreSQL 10.5 and I have a discrepancy between clients regarding parallelism feature.

For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I can see PostgreSQL use parallelism when the query is launched from psql or PgAdmin4. However the same query launched with DBeaver (ie connected through JDBC) does not use parallelism.

SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my session.

Is there a client configuration that prevents from using parallelism ?

Thanks.

Laurent

 

Set in postgresql.conf:

 

log_statement = 'all'

 

reload settings and check the logs for what statemets are acutally issued.

 

--

Andreas Joseph Krogh

CTO / Partner - Visena AS

Mobile: +47 909 56 963

 

_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.
Вложения

Re: Pg10 : Client Configuration for Parallelism ?

От
Andreas Kretschmer
Дата:

Am 17.04.19 um 11:51 schrieb laurent.dechambe@orange.com:
>
> Here are the logs (with log_error_verbosity = verbose) :
>
> <DBEAVER>
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> duration: 25950.908 ms
>
> <BASIC JDBC>
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11459.943 ms
>
>
> <PGADMIN4>
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> statement: SELECT COUNT(1) FROM big_table;
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  
> exec_simple_query, postgres.c:940
>
> 2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11334.677 ms
>
>

That's compareable. The first one took more time, cold cache. The 2nd 
and 3rd are faster, warm cache.

But: we can't see if the execution is paralell or not. If you want to 
know that, install and use auto_explain.


Regards, Andreas



-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




RE: Pg10 : Client Configuration for Parallelism ?

От
Дата:
I can see whether there is parallelism with pg_top or barely top on the server. 

<DBEAVER>
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 38584 postgres  20   0 8863828 8.153g 8.151g R 100.0  3.2   1:23.01 postgres
    10 root      20   0       0      0      0 S   0.3  0.0  88:07.26 rcu_sched

<BASIC JDBC>
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 46687 postgres  20   0 8864620 0.978g 0.977g S  38.5  0.4   0:01.16 postgres
 46689 postgres  20   0 8864348 996.4m 995.1m R  38.5  0.4   0:01.16 postgres
 46690 postgres  20   0 8864348 987.2m 985.8m S  38.5  0.4   0:01.16 postgres
 46691 postgres  20   0 8864348 998436 997084 R  38.5  0.4   0:01.16 postgres
 46692 postgres  20   0 8864348 982612 981260 S  38.5  0.4   0:01.16 postgres
 46693 postgres  20   0 8864348 979.9m 978.6m R  38.5  0.4   0:01.16 postgres
 46694 postgres  20   0 8864348 987.9m 986.6m S  38.5  0.4   0:01.16 postgres
 46696 postgres  20   0 8864348 996864 995512 S  38.5  0.4   0:01.16 postgres
 46688 postgres  20   0 8864348 982.3m 981.0m R  38.2  0.4   0:01.15 postgres
 46695 postgres  20   0 8864348 986.9m 985.6m S  38.2  0.4   0:01.15 postgres
 21323 postgres  20   0 8862788 8.096g 8.095g S   0.7  3.2   2:24.75 postgres
 46682 postgres  20   0  157996   2596   1548 R   0.7  0.0   0:00.05 top

This is not a matter of cache. If I execute the queries in a different order the result will be the same : DBeaver
queryis longer.
 

There is something in documentation that says that there won't be parallelism if " The client sends an Execute message
witha non-zero fetch count."
 
I am not sure what this sentence means. 

-----Message d'origine-----
De : Andreas Kretschmer [mailto:andreas@a-kretschmer.de] 
Envoyé : mercredi 17 avril 2019 12:39
À : pgsql-performance@lists.postgresql.org
Objet : Re: Pg10 : Client Configuration for Parallelism ?



Am 17.04.19 um 11:51 schrieb laurent.dechambe@orange.com:
>
> Here are the logs (with log_error_verbosity = verbose) :
>
> <DBEAVER>
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> duration: 25950.908 ms
>
> <BASIC JDBC>
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11459.943 ms
>
>
> <PGADMIN4>
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> statement: SELECT COUNT(1) FROM big_table;
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  
> exec_simple_query, postgres.c:940
>
> 2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11334.677 ms
>
>

That's compareable. The first one took more time, cold cache. The 2nd 
and 3rd are faster, warm cache.

But: we can't see if the execution is paralell or not. If you want to 
know that, install and use auto_explain.


Regards, Andreas



-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


Re: Pg10 : Client Configuration for Parallelism ?

От
Gunther Schadow
Дата:
On 4/17/2019 4:33, Thomas Kellerer wrote:
A CTE would prevent parallelism. 

You mean like always? His

SELECT count(1) FROM BigTable 

would be parallel if run alone but as

WITH Data AS (SELECT count(1) FROM BigTable) SELECT * FROM Data 

nothing would be parallel any more? How about:

SELECT * FROM (SELECT count(1) FROM BigTable) x

Parallel or not?

-Gunther

Re: Pg10 : Client Configuration for Parallelism ?

От
Gunther Schadow
Дата:
By the way

On 4/17/2019 7:26, laurent.dechambe@orange.com wrote:
> I can see whether there is parallelism with pg_top or barely top on the server.
>
> <DBEAVER>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>   38584 postgres  20   0 8863828 8.153g 8.151g R 100.0  3.2   1:23.01 postgres
>      10 root      20   0       0      0      0 S   0.3  0.0  88:07.26 rcu_sched
>
> <BASIC JDBC>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>   46687 postgres  20   0 8864620 0.978g 0.977g S  38.5  0.4   0:01.16 postgres
>   46689 postgres  20   0 8864348 996.4m 995.1m R  38.5  0.4   0:01.16 postgres
>   46690 postgres  20   0 8864348 987.2m 985.8m S  38.5  0.4   0:01.16 postgres
>   46691 postgres  20   0 8864348 998436 997084 R  38.5  0.4   0:01.16 postgres
>   ...
>   46682 postgres  20   0  157996   2596   1548 R   0.7  0.0   0:00.05 top

If you just use top with the -c option, you will see each postgres 
process identify itself as to its role, e.g.

postgres: parallel worker for PID 46687

or

postgres: SELECT ...

or

postgres: wal writer

extremely useful this.

-Gunther




RE: Pg10 : Client Configuration for Parallelism ?

От
Дата:
Auto explain shows that in both cases there are workers planned, but with DBeaver they are not launched.

Here's what I get with auto_explain : 

<DBEAVER> 
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOG:  00000: duration: 0.095 ms
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOCATION:  exec_parse_message, postgres.c:1433
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOG:  00000: duration: 0.191 ms
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOCATION:  exec_bind_message, postgres.c:1813
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;00000;LOG:  00000: duration: 35842.146 ms
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:2031
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;00000;LOG:  00000: duration: 35842.110 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Finalize Aggregate  (cost=3081157.61..3081157.62 rows=1 width=8) (actual time=35842.072..35842.072 rows=1
loops=1)
          Output: count(1)
          ->  Gather  (cost=3081156.68..3081157.59 rows=9 width=8) (actual time=35842.062..35842.062 rows=1 loops=1)
                Output: (PARTIAL count(1))
                Workers Planned: 9
                Workers Launched: 0
                ->  Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=35842.060..35842.060
rows=1loops=1)
 
                      Output: PARTIAL count(1)
                      ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07
rows=20403444width=0) (actual time=0.036..24038.340 rows=183778867 loops=1)
 
                            Heap Fetches: 57043846
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359

<BASIC JDBC>
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;00000;LOCATION:  exec_parse_message, postgres.c:1433
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;00000;LOG:  00000: duration: 2.077 ms
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;00000;LOCATION:  exec_bind_message, postgres.c:1813
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959
2019-04-17 14:47:50 CEST;55235;;;00000;LOG:  00000: duration: 11317.118 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11317.095..11317.095 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.135..10036.104 rows=18161056 loops=1)
 
                Heap Fetches: 5569541
2019-04-17 14:47:50 CEST;55235;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55236;;;00000;LOG:  00000: duration: 11316.071 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11316.043..11316.043 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.171..10000.782 rows=18377525 loops=1)
 
                Heap Fetches: 5735254
2019-04-17 14:47:50 CEST;55236;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55237;;;00000;LOG:  00000: duration: 11315.871 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11315.851..11315.852 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.140..10042.102 rows=18082389 loops=1)
 
                Heap Fetches: 5579176
2019-04-17 14:47:50 CEST;55237;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55232;;;00000;LOG:  00000: duration: 11317.573 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11317.553..11317.553 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.115..10047.908 rows=18732838 loops=1)
 
                Heap Fetches: 5849965
2019-04-17 14:47:50 CEST;55232;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55234;;;00000;LOG:  00000: duration: 11317.221 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11317.202..11317.202 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.116..10027.937 rows=18517339 loops=1)
 
                Heap Fetches: 5849910
2019-04-17 14:47:50 CEST;55234;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55238;;;00000;LOG:  00000: duration: 11316.571 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11316.553..11316.554 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.111..10047.353 rows=18722306 loops=1)
 
                Heap Fetches: 5829235
2019-04-17 14:47:50 CEST;55238;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55230;;;00000;LOG:  00000: duration: 11320.223 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11320.198..11320.198 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.132..10040.186 rows=18164384 loops=1)
 
                Heap Fetches: 5585309
2019-04-17 14:47:50 CEST;55230;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55231;;;00000;LOG:  00000: duration: 11319.001 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11318.981..11318.981 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.157..10035.136 rows=18189018 loops=1)
 
                Heap Fetches: 5638358
2019-04-17 14:47:50 CEST;55231;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:50 CEST;55233;;;00000;LOG:  00000: duration: 11317.772 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11317.750..11317.751 rows=1
loops=1)
          Output: PARTIAL count(1)
          ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07 rows=20403444
width=0)(actual time=0.113..10036.766 rows=18198240 loops=1)
 
                Heap Fetches: 5627716
2019-04-17 14:47:50 CEST;55233;;;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359
2019-04-17 14:47:51 CEST;55222;thedbuser;thedb;00000;LOG:  00000: duration: 11735.201 ms
2019-04-17 14:47:51 CEST;55222;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:2031
2019-04-17 14:47:51 CEST;55222;thedbuser;thedb;00000;LOG:  00000: duration: 11735.174 ms  plan:
        Query Text: SELECT COUNT(1) FROM big_table
        Finalize Aggregate  (cost=3081157.61..3081157.62 rows=1 width=8) (actual time=11326.891..11326.891 rows=1
loops=1)
          Output: count(1)
          ->  Gather  (cost=3081156.68..3081157.59 rows=9 width=8) (actual time=11325.571..11735.108 rows=10 loops=1)
                Output: (PARTIAL count(1))
                Workers Planned: 9
                Workers Launched: 9
                ->  Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual time=11318.223..11318.223
rows=1loops=10)
 
                      Output: PARTIAL count(1)
                      Worker 0: actual time=11316.553..11316.554 rows=1 loops=1
                      Worker 1: actual time=11315.851..11315.852 rows=1 loops=1
                      Worker 2: actual time=11316.043..11316.043 rows=1 loops=1
                      Worker 3: actual time=11317.095..11317.095 rows=1 loops=1
                      Worker 4: actual time=11317.202..11317.202 rows=1 loops=1
                      Worker 5: actual time=11317.750..11317.751 rows=1 loops=1
                      Worker 6: actual time=11317.553..11317.553 rows=1 loops=1
                      Worker 7: actual time=11318.981..11318.981 rows=1 loops=1
                      Worker 8: actual time=11320.198..11320.198 rows=1 loops=1
                      ->  Parallel Index Only Scan using idx_big_table__inact on big_table  (cost=0.57..3029148.07
rows=20403444width=0) (actual time=0.131..10036.680 rows=18377887 loops=10)
 
                            Heap Fetches: 5779382
                            Worker 0: actual time=0.111..10047.353 rows=18722306 loops=1
                            Worker 1: actual time=0.140..10042.102 rows=18082389 loops=1
                            Worker 2: actual time=0.171..10000.782 rows=18377525 loops=1
                            Worker 3: actual time=0.135..10036.104 rows=18161056 loops=1
                            Worker 4: actual time=0.116..10027.937 rows=18517339 loops=1
                            Worker 5: actual time=0.113..10036.766 rows=18198240 loops=1
                            Worker 6: actual time=0.115..10047.908 rows=18732838 loops=1
                            Worker 7: actual time=0.157..10035.136 rows=18189018 loops=1
                            Worker 8: actual time=0.132..10040.186 rows=18164384 loops=1
2019-04-17 14:47:51 CEST;55222;thedbuser;thedb;00000;LOCATION:  explain_ExecutorEnd, auto_explain.c:359



-----Message d'origine-----
De : DECHAMBE Laurent DTSI/DSI 
Envoyé : mercredi 17 avril 2019 13:26
À : 'Andreas Kretschmer'; pgsql-performance@lists.postgresql.org
Objet : RE: Pg10 : Client Configuration for Parallelism ?

I can see whether there is parallelism with pg_top or barely top on the server. 

<DBEAVER>
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 38584 postgres  20   0 8863828 8.153g 8.151g R 100.0  3.2   1:23.01 postgres
    10 root      20   0       0      0      0 S   0.3  0.0  88:07.26 rcu_sched

<BASIC JDBC>
   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 46687 postgres  20   0 8864620 0.978g 0.977g S  38.5  0.4   0:01.16 postgres
 46689 postgres  20   0 8864348 996.4m 995.1m R  38.5  0.4   0:01.16 postgres
 46690 postgres  20   0 8864348 987.2m 985.8m S  38.5  0.4   0:01.16 postgres
 46691 postgres  20   0 8864348 998436 997084 R  38.5  0.4   0:01.16 postgres
 46692 postgres  20   0 8864348 982612 981260 S  38.5  0.4   0:01.16 postgres
 46693 postgres  20   0 8864348 979.9m 978.6m R  38.5  0.4   0:01.16 postgres
 46694 postgres  20   0 8864348 987.9m 986.6m S  38.5  0.4   0:01.16 postgres
 46696 postgres  20   0 8864348 996864 995512 S  38.5  0.4   0:01.16 postgres
 46688 postgres  20   0 8864348 982.3m 981.0m R  38.2  0.4   0:01.15 postgres
 46695 postgres  20   0 8864348 986.9m 985.6m S  38.2  0.4   0:01.15 postgres
 21323 postgres  20   0 8862788 8.096g 8.095g S   0.7  3.2   2:24.75 postgres
 46682 postgres  20   0  157996   2596   1548 R   0.7  0.0   0:00.05 top

This is not a matter of cache. If I execute the queries in a different order the result will be the same : DBeaver
queryis longer.
 

There is something in documentation that says that there won't be parallelism if " The client sends an Execute message
witha non-zero fetch count."
 
I am not sure what this sentence means. 

-----Message d'origine-----
De : Andreas Kretschmer [mailto:andreas@a-kretschmer.de] 
Envoyé : mercredi 17 avril 2019 12:39
À : pgsql-performance@lists.postgresql.org
Objet : Re: Pg10 : Client Configuration for Parallelism ?



Am 17.04.19 um 11:51 schrieb laurent.dechambe@orange.com:
>
> Here are the logs (with log_error_verbosity = verbose) :
>
> <DBEAVER>
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;00000;LOG:  00000: 
> duration: 25950.908 ms
>
> <BASIC JDBC>
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> execute <unnamed>: SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11459.943 ms
>
>
> <PGADMIN4>
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> statement: SELECT COUNT(1) FROM big_table;
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  
> exec_simple_query, postgres.c:940
>
> 2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;00000;LOG:  00000: 
> duration: 11334.677 ms
>
>

That's compareable. The first one took more time, cold cache. The 2nd 
and 3rd are faster, warm cache.

But: we can't see if the execution is paralell or not. If you want to 
know that, install and use auto_explain.


Regards, Andreas



-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


Re: Pg10 : Client Configuration for Parallelism ?

От
Justin Pryzby
Дата:
On Wed, Apr 17, 2019 at 09:51:02AM +0000, laurent.dechambe@orange.com wrote:
> <DBEAVER>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

"execute" means it's using the extended protocol.
https://www.postgresql.org/docs/11/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

> <BASIC JDBC>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

Same.

> <PGADMIN4>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: statement: SELECT COUNT(1) FROM big_table;
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  exec_simple_query, postgres.c:940

This is a "simple query", not using the "extended protocol".

On Wed, Apr 17, 2019 at 11:26:07AM +0000, laurent.dechambe@orange.com wrote:
> There is something in documentation that says that there won't be parallelism if " The client sends an Execute
messagewith a non-zero fetch count."
 
> I am not sure what this sentence means. 

This is likely the cause of the difference.

Could you run wireshark to watch the protocol traffic ?

I think it'll show that dbeaver is retrieving a portion of the result set.

Justin



RE: Pg10 : Client Configuration for Parallelism ?

От
Дата:
Hello Justin and thank you for your clues.

Finally I found that putting blank to the option that limits the number of rows to retrieve (which is normal for this
kindof tool) allows PostgreSQL to parallelize the query. 

On jdbc it seems this is equivalent to write :
statement. setMaxRows(0);  // parallelism authorized, which is the default.

Thus on my jdbc basic program if I add :
statement. setMaxRows(100);  // No parallelism allowed (at least in Pg10)

Thanks to all who were kind enough to help.

Laurent

-----Message d'origine-----
De : Justin Pryzby [mailto:pryzby@telsasoft.com]
Envoyé : mercredi 17 avril 2019 15:57
À : DECHAMBE Laurent DTSI/DSI
Cc : Andreas Joseph Krogh; pgsql-performance@lists.postgresql.org
Objet : Re: Pg10 : Client Configuration for Parallelism ?

On Wed, Apr 17, 2019 at 09:51:02AM +0000, laurent.dechambe@orange.com wrote:
> <DBEAVER>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

"execute" means it's using the extended protocol.
https://www.postgresql.org/docs/11/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

> <BASIC JDBC>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOG:  00000: execute <unnamed>: SELECT COUNT(1) FROM big_table
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;00000;LOCATION:  exec_execute_message, postgres.c:1959

Same.

> <PGADMIN4>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOG:  00000: statement: SELECT COUNT(1) FROM big_table;
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;00000;LOCATION:  exec_simple_query, postgres.c:940

This is a "simple query", not using the "extended protocol".

On Wed, Apr 17, 2019 at 11:26:07AM +0000, laurent.dechambe@orange.com wrote:
> There is something in documentation that says that there won't be parallelism if " The client sends an Execute
messagewith a non-zero fetch count." 
> I am not sure what this sentence means.

This is likely the cause of the difference.

Could you run wireshark to watch the protocol traffic ?

I think it'll show that dbeaver is retrieving a portion of the result set.

Justin


_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.




Re: Pg10 : Client Configuration for Parallelism ?

От
Laurenz Albe
Дата:
laurent.dechambe@orange.com wrote:
> There is something in documentation that says that there won't be parallelism
> if " The client sends an Execute message with a non-zero fetch count."
> I am not sure what this sentence means.

The JDBC driver sends an "Execute" message to the server.
https://www.postgresql.org/docs/current/protocol-message-formats.html says:

Execute (F)

    Byte1('E')
        Identifies the message as an Execute command.
    Int32
        Length of message contents in bytes, including self.
    String
        The name of the portal to execute (an empty string selects the unnamed portal).
    Int32
        Maximum number of rows to return, if portal contains a query that returns rows
        (ignored otherwise). Zero denotes “no limit”.

If you use setMaxRows non-zero, that number is sent as the "maximum number of rows".

Parallelism currently cannot be used if there is a limit on the row count.
Imagine you want ten rows and already have nine, now if two workers are busy
calculating the next row, there is no good way to stop one of them when the other
returns a row.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Pg10 : Client Configuration for Parallelism ?

От
Thomas Kellerer
Дата:
laurent.dechambe@orange.com schrieb am 17.04.2019 um 16:33:
> Hello Justin and thank you for your clues.
>
> Finally I found that putting blank to the option that limits the
> number of rows to retrieve (which is normal for this kind of tool)
> allows PostgreSQL to parallelize the query.
>
> On jdbc it seems this is equivalent to write :
> statement. setMaxRows(0);  // parallelism authorized, which is the default.
>
> Thus on my jdbc basic program if I add :
> statement. setMaxRows(100);  // No parallelism allowed (at least in Pg10)
>
> Thanks to all who were kind enough to help.

This isn't limited to Statement.setMaxRows()

If you use "LIMIT x" in your SQL query, the same thing happens.

Thomas




Re: Pg10 : Client Configuration for Parallelism ?

От
Tom Lane
Дата:
Thomas Kellerer <spam_eater@gmx.net> writes:
> laurent.dechambe@orange.com schrieb am 17.04.2019 um 16:33:
>> On jdbc it seems this is equivalent to write :
>> statement. setMaxRows(0);  // parallelism authorized, which is the default.
>> 
>> Thus on my jdbc basic program if I add :
>> statement. setMaxRows(100);  // No parallelism allowed (at least in Pg10)

> This isn't limited to Statement.setMaxRows()
> If you use "LIMIT x" in your SQL query, the same thing happens.

No, not true: queries with LIMIT x are perfectly parallelizable.

The trouble with the protocol-level limit (setMaxRows) is that it
requires being able to suspend the query and resume fetching rows
later.  We don't allow that for parallel query because it would
involve tying up vastly more resources, ie a bunch of worker
processes, not just some extra memory in the client's own backend.

            regards, tom lane