Re: [HACKERS] Parallel Index Scans

Поиск
Список
Период
Сортировка
От tushar
Тема Re: [HACKERS] Parallel Index Scans
Дата
Msg-id 1c6ae839-6a25-53b4-447c-d41c94f53131@enterprisedb.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Parallel Index Scans  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On 02/01/2017 06:50 PM, Amit Kapila wrote:
Used large table parallel index scans (both forward and backward
scans).  These tests have been done by Tushar and you can find
detailed report up thread [2].  Apart from that, the patch has been
tested with TPC-H queries at various scale factors and it is being
used in multiple queries and we have verified the results of same as
well.  TPC-H tests have been done by Rafia.

Tushar has done some further extensive test of this patch.  Tushar,
can you please share your test results?
Yes, We have
0)Tested on a high end machine with this following configuration

[edb@ip-10-0-38-61 pg_log]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             4
NUMA node(s):          4
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E7-8880 v3 @ 2.30GHz

[edb@ip-10-0-38-61 pg_log]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        961G   60K  961G   1% /dev
tmpfs           961G  556K  961G   1% /dev/shm
/dev/xvda1      197G  156G   42G  80% /

[edb@ip-10-0-38-61 pg_log]$ free
             total       used       free     shared    buffers     cached
Mem:    2014742800  170971292 1843771508     142668     166128  162463396
-/+ buffers/cache:    8341768 2006401032
Swap:            0          0          0

1)Executed the testcases with multiple clients ( e.g run our testcase file against 4 different psql terminal of the same server simultaneously)  for concurrency,
   We made a effort to execute same set of tests (testcase.sql file) via different terminals against the same server.
2) We checked count(*) of the query  before and after disabling/enabling max_parallel_workers_per_gather to make sure end result(o/p) is consistent.
3) We are able to get parallel workers =14 (highest degree of parallelism ) in our case

pgbench with -scaling factor =10,000 ( taken 149 GB data in the database, 100 million rows is inserted) on amanzon instance (128 cores ,4 nodes)

We are able to see 14  workers launched  out of 14 workers planned  against  this below query

postgres=# \di+ pgbench_accounts_pkey
                                    List of relations
 Schema |         Name          | Type  | Owner |      Table | Size  | Description
--------+-----------------------+-------+-------+------------------+-------+-------------
 public | pgbench_accounts_pkey | index | edb   | pgbench_accounts | 21 GB |
(1 row)

index size is now 21 GB

postgres=# explain analyse verbose select * from pgbench_accounts where aid <50000000 and bid <=1 ;
                                                                                  QUERY PLAN                                                                                 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.57..1745380.10 rows=4691 width=97) (actual time=0.546..2316.118 rows=100000 loops=1)
   Output: aid, bid, abalance, filler
   Workers Planned: 14
   Workers Launched: 14
   ->  Parallel Index Scan using pgbench_accounts_pkey on public.pgbench_accounts  (cost=0.57..1745380.10 rows=335 width=97) (actual time=0.081..2253.234 rows=6667 loops=15)
         Output: aid, bid, abalance, filler
         Index Cond: (pgbench_accounts.aid < 50000000)
         Filter: (pgbench_accounts.bid <= 1)
         Rows Removed by Filter: 3326667
         Worker 0: actual time=0.069..2251.456 rows=7036 loops=1
         Worker 1: actual time=0.070..2256.772 rows=6588 loops=1
         Worker 2: actual time=0.071..2257.164 rows=6954 loops=1
         Worker 3: actual time=0.079..2255.166 rows=6222 loops=1
         Worker 4: actual time=0.063..2254.814 rows=6588 loops=1
         Worker 5: actual time=0.091..2253.872 rows=6588 loops=1
         Worker 6: actual time=0.093..2254.237 rows=6222 loops=1
         Worker 7: actual time=0.068..2254.749 rows=7320 loops=1
         Worker 8: actual time=0.060..2253.953 rows=6588 loops=1
         Worker 9: actual time=0.127..2253.546 rows=8052 loops=1
         Worker 10: actual time=0.091..2252.737 rows=7686 loops=1
         Worker 11: actual time=0.087..2252.056 rows=7320 loops=1
         Worker 12: actual time=0.091..2252.600 rows=7320 loops=1
         Worker 13: actual time=0.057..2252.341 rows=7686 loops=1
 Planning time: 0.165 ms
 Execution time: 2357.132 ms
(25 rows)

even for array keys,  index size is in MB . we are able to see 09 workers launched out of 09 workers planned

postgres=# set enable_bitmapscan =0;
SET
postgres=# set enable_seqscan =0;
SET
postgres=# \di+ ary_idx
                        List of relations
 Schema |  Name   | Type  | Owner |  Table  | Size  | Description
--------+---------+-------+-------+---------+-------+-------------
 public | ary_idx | index | edb   | ary_tab | 56 MB |
(1 row)

postgres=# explain analyze verbose select count(1) from ary_tab where ARRAY[7,8,9,10]=c2 and c1 = 'four';
                                                                         QUERY PLAN                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=47083.83..47083.84 rows=1 width=8) (actual time=141.766..141.767 rows=1 loops=1)
   Output: count(1)
   ->  Gather  (cost=47083.80..47083.81 rows=9 width=8) (actual time=141.547..141.753 rows=10 loops=1)
         Output: (PARTIAL count(1))
         Workers Planned: 9
         Workers Launched: 9
         ->  Partial Aggregate  (cost=47083.80..47083.81 rows=1 width=8) (actual time=136.679..136.679 rows=1 loops=10)
               Output: PARTIAL count(1)
               Worker 0: actual time=135.215..135.215 rows=1 loops=1
               Worker 1: actual time=136.158..136.158 rows=1 loops=1
               Worker 2: actual time=136.348..136.349 rows=1 loops=1
               Worker 3: actual time=136.564..136.565 rows=1 loops=1
               Worker 4: actual time=135.759..135.760 rows=1 loops=1
               Worker 5: actual time=136.405..136.405 rows=1 loops=1
               Worker 6: actual time=136.158..136.158 rows=1 loops=1
               Worker 7: actual time=136.319..136.319 rows=1 loops=1
               Worker 8: actual time=136.597..136.597 rows=1 loops=1
               ->  Parallel Index Scan using ary_idx on public.ary_tab  (cost=0.42..47083.79 rows=4 width=0) (actual time=122.557..136.673 rows=5 loops=10)
                     Index Cond: ('{7,8,9,10}'::integer[] = ary_tab.c2)
                     Filter: (ary_tab.c1 = 'four'::text)
                     Rows Removed by Filter: 100000
                     Worker 0: actual time=135.211..135.211 rows=0 loops=1
                     Worker 1: actual time=136.153..136.153 rows=0 loops=1
                     Worker 2: actual time=136.342..136.342 rows=0 loops=1
                     Worker 3: actual time=136.559..136.559 rows=0 loops=1
                     Worker 4: actual time=135.756..135.756 rows=0 loops=1
                     Worker 5: actual time=136.402..136.402 rows=0 loops=1
                     Worker 6: actual time=136.150..136.150 rows=0 loops=1
                     Worker 7: actual time=136.314..136.314 rows=0 loops=1
                     Worker 8: actual time=136.592..136.592 rows=0 loops=1
 Planning time: 0.813 ms
 Execution time: 145.881 ms
(32 rows)

4)LCOV/Sql report can found for the same @ https://www.postgresql.org/message-id/1d6353a0-63cb-65d9-a70c-0913899d5b06@enterprisedb.com
-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Следующее
От: Rafia Sabih
Дата:
Сообщение: Re: [HACKERS] WIP: [[Parallel] Shared] Hash