[HACKERS] Performance degradation in TPC-H Q18

From: Kuntal Ghosh
Subject: [HACKERS] Performance degradation in TPC-H Q18
Msg-id: CAGz5QC+=fNTYgzMLTBUNeKt6uaWZFXJbkB5+7oWm-n9DwVxcLA@mail.gmail.com
Replies: Re: [HACKERS] Performance degradation in TPC-H Q18  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hello everyone,

While conducting experiments on parallelism, Rafia came across a hang
in Q18 when the plan uses partial and finalize hash aggregates. This
could be seen at both scale factors, 20 and 300, when work_mem is set
high enough that the query uses a hash aggregate. It seems that
commit b81b5a96f424531b97cdd1dba97d9d1b9c9d372e does not solve the
issue completely.

I've tested at scale factor 10 with work_mem = 100MB, and the issue is
reproducible. Below is a simplified part of the query which uses the
hash aggregate.

tpch10GB=# explain select l_orderkey from lineitem group by l_orderkey;
                                          QUERY PLAN
------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=1612872.73..1616801.32 rows=392859 width=8)
   Group Key: l_orderkey
   ->  Gather  (cost=1528408.04..1610908.43 rows=785718 width=8)
         Workers Planned: 2
         ->  Partial HashAggregate  (cost=1527408.04..1531336.63 rows=392859 width=8)
               Group Key: l_orderkey
               ->  Parallel Seq Scan on lineitem  (cost=0.00..1464922.43 rows=24994243 width=8)
(7 rows)

I've also executed the same query grouping on a different column.

tpch10GB=# explain analyze select l_suppkey from lineitem group by l_suppkey;
                                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=1549511.80..1550493.37 rows=98157 width=8) (actual time=22363.901..22380.361 rows=100000 loops=1)
   Group Key: l_suppkey
   ->  Gather  (cost=1528408.04..1549021.01 rows=196314 width=8) (actual time=22107.387..22244.848 rows=300000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial HashAggregate  (cost=1527408.04..1528389.61 rows=98157 width=8) (actual time=22100.776..22135.063 rows=100000 loops=3)
               Group Key: l_suppkey
               ->  Parallel Seq Scan on lineitem  (cost=0.00..1464922.43 rows=24994243 width=8) (actual time=0.846..10664.258 rows=19995351 loops=3)
 Planning time: 0.171 ms
 Execution time: 22393.030 ms
(10 rows)

And it finished in time, even though both queries use exactly the same
plan shape. Hence, the problem lies with the query itself, i.e. with the
column being grouped, not with the plan. As Robert suggested offline,
I've produced the perf report for the query.

99.44%     0.01%  postgres  postgres           [.] LookupTupleHashEntry
74.31%    53.55%  postgres  postgres           [.] tuplehash_insert
17.05%    17.04%  postgres  libc-2.17.so       [.] __memcpy_ssse3_back
11.06%    11.05%  postgres  postgres           [.] tuplehash_prev
10.81%    10.80%  postgres  postgres           [.] tuplehash_next

It's clear from the perf report that something terrible is going on in
the simplehash insert. I've enabled logging of the hash table
statistics (SH_STAT()) during the hash-growing phase.

2017-02-21 16:42:39.634 IST [9372] LOG:  size: 1048576, members:
838860, filled: 0.799999,
total chain: 2008236860, max chain: 66743, avg chain: 2394.007176,
total_collisions: 278348, max_collisions: 8, avg_collisions: 0.331817
2017-02-21 16:42:39.634 IST [9372] STATEMENT:  explain analyze select
l_orderkey from lineitem group by l_orderkey;

2017-02-21 16:42:39.741 IST [9372] LOG:  size: 2097152, members:
838860, filled: 0.400000,
total chain: 788175, max chain: 124, avg chain: 0.939579,
total_collisions: 216052, max_collisions: 6, avg_collisions: 0.257554
2017-02-21 16:42:39.741 IST [9372] STATEMENT:  explain analyze select
l_orderkey from lineitem group by l_orderkey;
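
(For context, SH_STAT() is the statistics routine simplehash.h already
generates for each instantiation -- tuplehash_stat() for the grouping
hash table -- and it logs exactly the size/members/chain/collision
lines shown above. One way to produce these lines is a pair of calls
around the resize logic; the fragment below is only an illustration of
the placement, not an exact diff of my local change.)

    /* inside SH_GROW() in src/include/lib/simplehash.h */
    SH_STAT(tb);        /* state of the table that is about to be resized */

    /* ... existing grow logic: allocate the larger array, re-insert all entries ... */

    SH_STAT(tb);        /* state after the entries have been redistributed */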

The max chain length is really high, which hampers the query's
performance. To understand the scenario further, I've collected the
following statistics:

tpch10GB=# select count(distinct l_orderkey) from lineitem;
  count
----------
 15000000

tpch10GB=# select correlation from pg_stats where tablename='lineitem' and attname='l_orderkey';
 correlation
-------------
           1

tpch10GB=# select count(distinct l_suppkey) from lineitem;
 count
--------
 100000

tpch10GB=# select correlation from pg_stats where tablename='lineitem' and attname='l_suppkey';
 correlation
-------------
  0.00900259

It seems that l_orderkey produces a pretty bad case for hashing with
linear probing combined with a bad choice of hash function (linear
probing is sensitive to elements with nearby hash codes). In [1],
Andres proposed a hack to resize the hashtable when a chain is growing
quickly. Below is a comment from the patch explaining the objective:

+            /*
+             * To avoid negative consequences from overly imbalanced
+             * hashtables, grow the hashtable if collisions lead to large
+             * runs. The most likely cause of such imbalance is filling a
+             * (currently) small table, from a currently big one, in
+             * hash-table order.
+             *
+             * FIXME: compute boundaries in a more principled manner.
+             */
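
To make the idea concrete, here is a small standalone sketch of the
approach (this is not the patch itself, and every name below is made
up for illustration): during a linear-probing insert, track how far
the new entry is displaced from its optimal bucket, and treat an
overly long probe run as a signal to grow the table early, before the
fill factor alone would trigger a resize.

/*
 * Toy illustration of growing on long probe runs (not PostgreSQL code).
 */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE      1024        /* power of two, as in simplehash */
#define MAX_INSERT_RUN  64          /* arbitrary threshold for the demo */

/*
 * Insert a non-zero key.  Returns false when the probe run exceeds
 * MAX_INSERT_RUN; the caller is then expected to grow the table and
 * retry, which is roughly what the patch makes the insert path do.
 */
static bool
toy_insert(uint64_t *slots, uint64_t key, uint64_t hash)
{
    uint64_t    mask = TABLE_SIZE - 1;
    uint64_t    bucket = hash & mask;
    uint64_t    run = 0;

    while (slots[bucket] != 0)
    {
        if (slots[bucket] == key)
            return true;            /* already present */
        if (++run > MAX_INSERT_RUN)
            return false;           /* long run: grow the table, then retry */
        bucket = (bucket + 1) & mask;
    }
    slots[bucket] = key;
    return true;
}

int
main(void)
{
    uint64_t   *slots = calloc(TABLE_SIZE, sizeof(uint64_t));

    /*
     * A deliberately clustered "hash" (four consecutive keys share a
     * bucket), mimicking inserts whose hash codes land in nearby buckets.
     */
    for (uint64_t key = 1; key <= 800; key++)
    {
        if (!toy_insert(slots, key, key / 4))
        {
            printf("long run at key %llu: time to grow the table\n",
                   (unsigned long long) key);
            break;
        }
    }
    free(slots);
    return 0;
}

In Andres's patch the equivalent check sits in simplehash's insert
path and triggers an actual resize. Note that in this toy setup the
signal already fires below 10% fill, which is exactly the kind of
premature expansion discussed below.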

After applying the patch, TPC-H Q18 and the simplified query both
complete in time. Although the patch works in this scenario, we need
a smarter way to trigger the hash expansion. The problem with the
patch is that it triggers an expansion even when the fill percentage
is pretty low, causing unnecessary memory consumption.

Thoughts?

[1] https://www.postgresql.org/message-id/20161123083351.5vramz52nmdokhzz%40alap3.anarazel.de
0001-Resize-simplehash.h-table-in-case-of-long-runs.patch


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


