Extremely slow HashAggregate in simple UNION query

Поиск
Список
Период
Сортировка
От Felix Geisendörfer
Тема Extremely slow HashAggregate in simple UNION query
Дата
Msg-id 9ADCDD19-555B-4DB0-B00C-1852AC361557@felixge.de
обсуждение исходный текст
Ответы Re: Extremely slow HashAggregate in simple UNION query  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: Extremely slow HashAggregate in simple UNION query  (Andres Freund <andres@anarazel.de>)
Re: Extremely slow HashAggregate in simple UNION query  (Michael Lewis <mlewis@entrata.com>)
Re: Extremely slow HashAggregate in simple UNION query  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
Hi all,

today I debugged a query that was executing about 100x slower than expected, and was very surprised by what I found.

I'm posting to this list to see if this might be an issue that should be fixed in PostgreSQL itself.

Below is a simplified version of the query in question:

SET work_mem='64MB';
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;

HashAggregate  (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
  Group Key: a.a, b.b
  ->  Append  (cost=0.01..70020.01 rows=2000000 width=8) (actual time=0.022..0.030 rows=2 loops=1)
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.021..0.022 rows=1 loops=1)
              ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.014..0.014
rows=1loops=1) 
              ->  Function Scan on generate_series b  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003
rows=1loops=1) 
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)
              ->  Function Scan on generate_series a_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003
rows=1loops=1) 
              ->  Function Scan on generate_series b_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.002
rows=1loops=1) 
Planning Time: 0.101 ms
Execution Time: 45.986 ms

As you can see, it takes over 45ms (!) to execute what appear to be a very simple query. How is this possible?

Based on my debugging, I think the following is going on:

1. The query overestimates the final output rows by a factor of 2 million. [1]
2. The executor uses the bad estimate to allocate a huge hash table [2], and the increased work_mem [3] gives it enough
ropeto hang itself [4]. 
3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of the query execution instead of 45ms [5].

I'm certainly a novice when it comes to PostgreSQL internals, but I'm wondering if this could be fixed by taking a more
dynamicapproach for allocating HashAggregate hash tables? 

Anyway, for my query using UNION ALL was acceptable, so I'm not too troubled by this. I was just really caught by
surprisethat bad estimates can not only cause bad query plans, but also cause good query plans to execute extremely
slowly.I had never seen that before. 

Best Regards
Felix

[1] My actual query had bad estimates for other reasons (GIN Index), but that's another story. The query above was of
coursedeliberately designed to have bad estimates. 
[2] nodeAgg.c: build_hash_table()
[3] A lot of queries in my application benefit from increased work_mem.
[4] execGrouping.c: nbuckets = Min(nbuckets, (long) ((work_mem * 1024L) / entrysize));
[5] In my actual query it was even worse, only 6 out of 40ms of the total execution time were reported as being spent
inthe query nodes. 


В списке pgsql-performance по дате отправления:

Предыдущее
От: MichaelDBA
Дата:
Сообщение: Re: Erratically behaving query needs optimization
Следующее
От: Michael Lewis
Дата:
Сообщение: Re: Extremely slow HashAggregate in simple UNION query