Re: Question of Parallel Hash Join on TPC-H Benchmark
From | Ba Jinsheng |
---|---|
Subject | Re: Question of Parallel Hash Join on TPC-H Benchmark |
Date | |
Msg-id | SEZPR06MB649483EF863B323B090986AD8A7A2@SEZPR06MB6494.apcprd06.prod.outlook.com |
In response to | Re: Question of Parallel Hash Join on TPC-H Benchmark (Andrei Lepikhov <lepihov@gmail.com>) |
Responses | Re: Question of Parallel Hash Join on TPC-H Benchmark |
List | pgsql-bugs |
> Could you provide SQL dump and settings to play with this case locally?
The dump file is too big, so I put it on Google Drive: https://drive.google.com/file/d/1e0s6ZLKLEPbZzS6BzftwpmVspWi7Okd1/view?usp=sharing
I also share my data directory here: https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=sharing
>Also, I usually force parallel workers with settings like below:
>max_parallel_workers_per_gather = 32
>min_parallel_table_scan_size = 0
>min_parallel_index_scan_size = 0
>max_worker_processes = 64
>parallel_setup_cost = 0.001
>parallel_tuple_cost = 0.0001
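For anyone reproducing this, here is a minimal sketch of applying the quoted settings within a single psql session instead of editing postgresql.conf. It rests on two assumptions that are not part of the original message: max_worker_processes can only be changed at server start, so it appears only as a comment, and the max_parallel_workers line is an extra setting added here because it also caps the number of workers.

```sql
-- Session-level versions of the quoted planner settings.
SET max_parallel_workers_per_gather = 32;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;
SET parallel_setup_cost = 0.001;
SET parallel_tuple_cost = 0.0001;

-- Assumption (not in the quoted list): raise the overall worker cap as well,
-- otherwise its default may limit how many workers are actually launched.
SET max_parallel_workers = 32;

-- max_worker_processes = 64 cannot be SET in a session; it has to go into
-- postgresql.conf and needs a server restart.

-- Confirm what the session actually uses before running EXPLAIN on the query.
SHOW max_parallel_workers_per_gather;
SHOW max_worker_processes;
```

With these in place, running EXPLAIN (ANALYZE) on the TPC-H query in the same session shows which Hash Join nodes the planner places under a Gather.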
I tried these configuration parameters and still got the same poor query plan: the Hash Join on the fifth line of the plan is not parallel, while the Hash Joins below it are.
This is an inefficient query plan.
I changed the code to generate an efficient query plan in which only the Hash Join on the fifth line is parallel, so I am wondering whether the planner could be changed to produce this efficient plan by default. I believe this would at least improve PostgreSQL's performance on the standard TPC-H benchmark.
If needed, I can provide my Docker environment for your analysis.
Best regards,
Jinsheng Ba