Re: Hybrid Hash/Nested Loop joins and caching results from subplans

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Дата
Msg-id 4bbece40-d648-b6f9-a323-fcb8f4b45e78@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (David Rowley <dgrowleyml@gmail.com>)
Re: Hybrid Hash/Nested Loop joins and caching results from subplans  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers

On 09.12.2020 23:53, David Rowley wrote:
> On Tue, 8 Dec 2020 at 20:15, David Rowley <dgrowleyml@gmail.com> wrote:
>> I've attached another patchset that addresses some comments left by
>> Zhihong Yu over on [1].  The version number got bumped to v12 instead
>> of v11 as I still have a copy of the other version of the patch which
>> I made some changes to and internally named v11.
> If anyone else wants to have a look at these, please do so soon. I'm
> planning on starting to take a serious look at getting 0001-0003 in
> early next week.
>
> David
>
I tested the patched version of Postgres on JOBS benchmark:

https://github.com/gregrahn/join-order-benchmark

For most queries performance is the same, some queries are executed 
faster but
one query is 150 times slower:


explain analyze SELECT MIN(chn.name) AS character,
        MIN(t.title) AS movie_with_american_producer
FROM char_name AS chn,
      cast_info AS ci,
      company_name AS cn,
      company_type AS ct,
      movie_companies AS mc,
      role_type AS rt,
      title AS t
WHERE ci.note LIKE '%(producer)%'
   AND cn.country_code = '[us]'
   AND t.production_year > 1990
   AND t.id = mc.movie_id
   AND t.id = ci.movie_id
   AND ci.movie_id = mc.movie_id
   AND chn.id = ci.person_role_id
   AND rt.id = ci.role_id
   AND cn.id = mc.company_id
   AND ct.id = mc.company_type_id;
explain analyze SELECT MIN(cn.name) AS from_company,
        MIN(lt.link) AS movie_link_type,
        MIN(t.title) AS non_polish_sequel_movie
FROM company_name AS cn,
      company_type AS ct,
      keyword AS k,
      link_type AS lt,
      movie_companies AS mc,
      movie_keyword AS mk,
      movie_link AS ml,
      title AS t
WHERE cn.country_code !='[pl]'
   AND (cn.name LIKE '%Film%'
        OR cn.name LIKE '%Warner%')
   AND ct.kind ='production companies'
   AND k.keyword ='sequel'
   AND lt.link LIKE '%follow%'
   AND mc.note IS NULL
   AND t.production_year BETWEEN 1950 AND 2000
   AND lt.id = ml.link_type_id
   AND ml.movie_id = t.id
   AND t.id = mk.movie_id
   AND mk.keyword_id = k.id
   AND t.id = mc.movie_id
   AND mc.company_type_id = ct.id
   AND mc.company_id = cn.id
   AND ml.movie_id = mk.movie_id
   AND ml.movie_id = mc.movie_id
   AND mk.movie_id = mc.movie_id;


QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------
  Finalize Aggregate  (cost=300131.43..300131.44 rows=1 width=64) 
(actual time=522985.919..522993.614 rows=1 loops=1)
    ->  Gather  (cost=300131.00..300131.41 rows=4 width=64) (actual 
time=522985.908..522993.606 rows=5 loops=1)
          Workers Planned: 4
          Workers Launched: 4
          ->  Partial Aggregate  (cost=299131.00..299131.01 rows=1 
width=64) (actual time=522726.599..522726.606 rows=1 loops=5)
                ->  Hash Join  (cost=38559.78..298508.36 rows=124527 
width=33) (actual time=301521.477..522726.592 rows=2 loops=5)
                      Hash Cond: (ci.role_id = rt.id)
                      ->  Hash Join  (cost=38558.51..298064.76 
rows=124527 width=37) (actual time=301521.418..522726.529 rows=2 loops=5)
                            Hash Cond: (mc.company_type_id = ct.id)
                            ->  Nested Loop (cost=38557.42..297390.45 
rows=124527 width=41) (actual time=301521.392..522726.498 rows=2 loops=5)
                                  ->  Nested Loop 
(cost=38556.98..287632.46 rows=255650 width=29) (actual 
time=235.183..4596.950 rows=156421 loops=5)
                                        Join Filter: (t.id = ci.movie_id)
                                        ->  Parallel Hash Join 
(cost=38556.53..84611.99 rows=162109 width=29) (actual 
time=234.991..718.934 rows=119250 loops
=5)
                                              Hash Cond: (t.id = 
mc.movie_id)
                                              ->  Parallel Seq Scan on 
title t  (cost=0.00..43899.19 rows=435558 width=21) (actual 
time=0.010..178.332 rows=34
9806 loops=5)
                                                    Filter: 
(production_year > 1990)
                                                    Rows Removed by 
Filter: 155856
                                              ->  Parallel Hash 
(cost=34762.05..34762.05 rows=303558 width=8) (actual 
time=234.282..234.285 rows=230760 loops
=5)
                                                    Buckets: 2097152 
(originally 1048576)  Batches: 1 (originally 1)  Memory Usage: 69792kB
                                                    ->  Parallel Hash 
Join  (cost=5346.12..34762.05 rows=303558 width=8) (actual 
time=11.846..160.085 rows=230
760 loops=5)
                                                          Hash Cond: 
(mc.company_id = cn.id)
                                                          -> Parallel 
Seq Scan on movie_companies mc  (cost=0.00..27206.55 rows=841655 
width=12) (actual time
=0.013..40.426 rows=521826 loops=5)
                                                          -> Parallel 
Hash  (cost=4722.92..4722.92 rows=49856 width=4) (actual 
time=11.658..11.659 rows=16969
  loops=5)
Buckets: 131072  Batches: 1  Memory Usage: 4448kB
->  Parallel Seq Scan on company_name cn  (cost=0.00..4722.92 rows=49856 
width=4) (actual time
=0.014..8.324 rows=16969 loops=5)
Filter: ((country_code)::text = '[us]'::text)
Rows Removed by Filter: 30031
                                        ->  Result Cache 
(cost=0.45..1.65 rows=2 width=12) (actual time=0.019..0.030 rows=1 
loops=596250)
                                              Cache Key: mc.movie_id
                                              Hits: 55970  Misses: 
62602  Evictions: 0  Overflows: 0  Memory Usage: 6824kB
                                              Worker 0:  Hits: 56042 
Misses: 63657  Evictions: 0  Overflows: 0  Memory Usage: 6924kB
                                              Worker 1:  Hits: 56067 
Misses: 63659  Evictions: 0  Overflows: 0  Memory Usage: 6906kB
                                              Worker 2:  Hits: 55947 
Misses: 62171  Evictions: 0  Overflows: 0  Memory Usage: 6767kB
                                              Worker 3:  Hits: 56150 
Misses: 63985  Evictions: 0  Overflows: 0  Memory Usage: 6945kB
                                              ->  Index Scan using 
cast_info_movie_id_idx on cast_info ci  (cost=0.44..1.64 rows=2 
width=12) (actual time=0.03
3..0.053 rows=1 loops=316074)
                                                    Index Cond: 
(movie_id = mc.movie_id)
                                                    Filter: 
((note)::text ~~ '%(producer)%'::text)
                                                    Rows Removed by
Filter: 25
                                  ->  Result Cache (cost=0.44..0.59 
rows=1 width=20) (actual time=3.311..3.311 rows=0 loops=782104)
                                        Cache Key: ci.person_role_id
                                        Hits: 5  Misses: 156294 
Evictions: 0  Overflows: 0  Memory Usage: 9769kB
                                        Worker 0:  Hits: 0  Misses: 
156768  Evictions: 0  Overflows: 0  Memory Usage: 9799kB
                                        Worker 1:  Hits: 1  Misses: 
156444  Evictions: 0  Overflows: 0  Memory Usage: 9778kB
                                        Worker 2:  Hits: 0  Misses: 
156222  Evictions: 0  Overflows: 0  Memory Usage: 9764kB
                                        Worker 3:  Hits: 0  Misses: 
156370  Evictions: 0  Overflows: 0  Memory Usage: 9774kB
                                        ->  Index Scan using 
char_name_pkey on char_name chn  (cost=0.43..0.58 rows=1 width=20) 
(actual time=0.001..0.001 rows
=0 loops=782098)
                                              Index Cond: (id = 
ci.person_role_id)
                            ->  Hash  (cost=1.04..1.04 rows=4 width=4) 
(actual time=0.014..0.014 rows=4 loops=5)
                                  Buckets: 1024  Batches: 1  Memory 
Usage: 9kB
                                  ->  Seq Scan on company_type ct 
(cost=0.00..1.04 rows=4 width=4) (actual time=0.012..0.012 rows=4 loops=5)
                      ->  Hash  (cost=1.12..1.12 rows=12 width=4) 
(actual time=0.027..0.028 rows=12 loops=5)
                            Buckets: 1024  Batches: 1  Memory Usage: 9kB
                            ->  Seq Scan on role_type rt 
(cost=0.00..1.12 rows=12 width=4) (actual time=0.022..0.023 rows=12 loops=5)
  Planning Time: 2.398 ms
  Execution Time: 523002.608 ms
(55 rows)

I attach file with times of query execution.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Fetter
Дата:
Сообщение: Re: [HACKERS] [PATCH] Generic type subscripting
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: Change default of checkpoint_completion_target