Macro customizable hashtable / bitmapscan & aggregation perf

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Macro customizable hashtable / bitmapscan & aggregation perf
Дата
Msg-id 20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de
обсуждение исходный текст
Ответы Re: Macro customizable hashtable / bitmapscan & aggregation perf  (Robert Haas <robertmhaas@gmail.com>)
Re: Macro customizable hashtable / bitmapscan & aggregation perf  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,

As previously mentioned, dynahash isn't particularly fast. In many cases
that doesn't particularly matter. But e.g. hash aggregation and bitmap
index scans are quite sensitive to hash performance.

The biggest problems with dynahash (also discussed at [1]) are

a) two level structure doubling the amount of indirect lookups
b) indirect function calls
c) using separate chaining based conflict resolution
d) being too general.

In the post referenced above I'd implemented an open-coded hashtable
addressing these issues for the tidbitmap.c case, with quite some
success.

It's not particularly desirable to have various slightly differing
hash-table implementations across the backend though. The only solution
I could think of, that preserves the efficiency, is to have a hash-table
implementation which is customizable to the appropriate types et, via
macros.

In the attached patch I've attached simplehash.h, which can be
customized by a bunch of macros, before being inlined.  There's also a
patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via
execGrouping.c.

To show the motivation:
Bitmapscan:
before:
tpch[22005][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND
(l_shipdate<= '1996-12-31'::date); 

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                    QUERY PLAN
                          │ 

├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=942330.27..942330.28 rows=1 width=8) (actual time=5283.026..5283.026 rows=1 loops=1)
                          │ 
│   ->  Bitmap Heap Scan on lineitem  (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=3041.072..4436.569
rows=9113219loops=1)       │ 
│         Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))
                          │ 
│         Heap Blocks: exact=585542
                          │ 
│         ->  Bitmap Index Scan on i_l_shipdate  (cost=0.00..191388.13 rows=9127557 width=0) (actual
time=2807.246..2807.246rows=9113219 loops=1) │ 
│               Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))
                          │ 
│ Planning time: 0.077 ms
                          │ 
│ Execution time: 5284.038 ms
                          │ 

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)

after:
tpch[21953][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND
(l_shipdate<= '1996-12-31'::date); 

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                                   QUERY PLAN
                        │ 

├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate  (cost=942330.27..942330.28 rows=1 width=8) (actual time=3431.602..3431.602 rows=1 loops=1)
                        │ 
│   ->  Bitmap Heap Scan on lineitem  (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=1158.783..2588.357
rows=9113219loops=1)     │ 
│         Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))
                        │ 
│         Heap Blocks: exact=585542
                        │ 
│         ->  Bitmap Index Scan on i_l_shipdate  (cost=0.00..191388.13 rows=9127557 width=0) (actual
time=958.341..958.341rows=9113219 loops=1) │ 
│               Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))
                        │ 
│ Planning time: 0.070 ms
                        │ 
│ Execution time: 3435.259 ms
                        │ 

└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)


Hash-Agg:

before:

tpch[22005][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT l_partkey, SUM(l_extendedprice) FROM lineitem GROUP BY 1 ORDER BY
2DESC LIMIT 10; 

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN
│

├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=1069653.05..1069653.08 rows=10 width=12) (actual rows=10 loops=1)
│
│   ->  Sort  (cost=1069653.05..1072083.33 rows=972112 width=12) (actual rows=10 loops=1)
│
│         Sort Key: (sum(l_extendedprice)) DESC
│
│         Sort Method: top-N heapsort  Memory: 25kB
│
│         ->  HashAggregate  (cost=1038924.94..1048646.06 rows=972112 width=12) (actual rows=1000000 loops=1)
│
│               Group Key: l_partkey
│
│               ->  Seq Scan on lineitem  (cost=0.00..888925.96 rows=29999796 width=12) (actual rows=29999795 loops=1)
│
│ Planning time: 0.210 ms
│
│ Execution time: 20887.906 ms
│

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)

after:

tpch[22342][1]=# EXPLAIN (ANALYZE, TIMING OFF) SELECT l_partkey, SUM(l_extendedprice) FROM lineitem GROUP BY 1 ORDER BY
2DESC LIMIT 10; 

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN
│

├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=1069653.05..1069653.08 rows=10 width=12) (actual rows=10 loops=1)
│
│   ->  Sort  (cost=1069653.05..1072083.33 rows=972112 width=12) (actual rows=10 loops=1)
│
│         Sort Key: (sum(l_extendedprice)) DESC
│
│         Sort Method: top-N heapsort  Memory: 25kB
│
│         ->  HashAggregate  (cost=1038924.94..1048646.06 rows=972112 width=12) (actual rows=1000000 loops=1)
│
│               Group Key: l_partkey
│
│               ->  Seq Scan on lineitem  (cost=0.00..888925.96 rows=29999796 width=12) (actual rows=29999795 loops=1)
│
│ Planning time: 0.104 ms
│
│ Execution time: 14617.481 ms
│

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)


The hash-table performance difference itself is bigger in both cases,
but other things, notably slot deforming and expression evaluation,
becomes the bottleneck.


Does anybody have a better idea how to approach the hash-table problem?
While it'd be nice not to resort to such macro-magic, I can't think of
an alternative short of switching to C++ and using templates.  Currently
this is used like:

#define SH_PREFIX pagetable
#define SH_KEYTYPE BlockNumber
#define SH_KEY blockno
#define SH_CONTAINS PagetableEntry
#define SH_HASH_KEY(tb, key) hash_blockno(key)
#define SH_EQUAL(tb, a, b) a == b
#define SH_SCOPE static
#define SH_DEFINE
#define SH_DECLARE
#include "lib/simplehash.h"

which then creates functions like pagetable_create/insert/lookup/delete/...


I strongly suspect there are some other cases that could benefit from
another hash-table implementation. Particularly catcache.c seems like a
candidate (instead of it's hand-rolled stuff, which frequently shows up
in profiles).


Note that these patches are *NOT* in a shape for in-detail review. I'd
first like to know what people think about the general approach, and
about better alternatives.

Also note that some queries in the regression test change their result,
because the ordering is unspecified...

Greetings,

Andres Freund

[1] http://archives.postgresql.org/message-id/20160714030616.dvlkboz6cw555ixb%40alap3.anarazel.de

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Why we lost Uber as a user
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Increasing timeout of poll_query_until for TAP tests