Strange sort node/explain result

Поиск
Список
Период
Сортировка
От Gunnar \"Nick\" Bluth
Тема Strange sort node/explain result
Дата
Msg-id 587a301e-478d-1cbc-288f-6ea77086b097@pro-open.de
обсуждение исходный текст
Ответы Re: Strange sort node/explain result
Список pgsql-bugs
Hello all!

Last week at pgconf.eu, my good friend Corrado aka. Dino (in CC) showed 
me a plan he couldn't make sense of.
I stood there, scratching my head and couldn't make sense of it just the 
same.
What comforted me somehow is that nobody else we showed it to, including 
  Peter Geoghegan (but to be fair, it was on day 3 of the conference ;-) 
could explain what's going on there.
So we follow Peter's advise and post this here.


Background: Dino played a bit in a test DB, wondering if putting all the 
extract() values into a CTE when querying the large timeseries table 
would save time compared to generating them on the fly or storing them 
in the timeseries table (as it happens right now).


Mind you, this inquiry is not about if that's a good idea, the query's 
performance or even the planner's choice of a merge join!


What puzzles us is the part where the CTE "oneyear" somehow explodes 
into a sort node of almost 10 mio (but not the same amount as the index 
scan emits!) rows, taking ~ 0.4 seconds but only using 4x-5x kB of memory:

   ->  Sort  (cost=69.83..72.33 rows=1000 width=4) (actual 
time=0.418..448.397 rows=9855001 loops=1)
         Output: y.d
         Sort Key: y.d
         Sort Method: quicksort  Memory: 42kB
         ->  CTE Scan on oneyear y  (cost=0.00..20.00 rows=1000 width=4) 
(actual time=0.079..0.383 rows=366 loops=1)
               Output: y.d

Full plan is here: https://explain.depesz.com/s/B3mN

Testcase to reproduce is attached. It uses roughly the same amount of 
rows as the original data.

I tried this on 9.6, 10, 11, 12, 14 & 15 and it's always showing the 
same effect.

We also tried to "broaden" everything (adding more columns to the CTE, 
fetching more or less from the main table, etc.) in Berlin, but the 
outcome is rock solid, so to speak. Even the memory footprint of the 
sort node doesn't change at all.

Dropping the index though leads to a completely different plan, and 
_that_ sort node is sane (but not fed by a CTE scan, obviously), using 
the same 42kB of memory:

    ->  Sort  (cost=59.84..62.34 rows=1000 width=8) (actual 
time=57.275..57.355 rows=366 loops=1) 

          Output: y.d, ((y.d)::date) 

          Sort Key: ((y.d)::date) 

          Sort Method: quicksort  Memory: 42kB 

          Buffers: shared hit=4 

          ->  Function Scan on pg_catalog.generate_series y 
(cost=0.01..10.01 rows=1000 width=8) (actual time=57.190..57.233 
rows=366 loops=1)
                Output: y.d, (y.d)::date 

                Function Call: generate_series((now() - '365 
days'::interval), now(), '1 day'::interval) 




Maybe the sort node gets attributed wrongly and it's actually the index 
scan result that gets sorted? But that wouldn't be necessary, afterall...
Maybe the sort really takes much more memory, but the executor doesn't 
report that correctly?
Why does the CTE sort node sort exactly 26.999 rows less than the main 
table has rows, when that sports exactly 27.000 rows per day?
Maybe we may have found a long missing optimization opportunity? ;-)


We are seriously clueless; pls. enlighten us! ;-)

Best regards,
-- 
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar.bluth@pro-open.de
__________________________________________________________________________
"Ceterum censeo SystemD esse delendam" - Cato

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Lauri Laanmets
Дата:
Сообщение: Re: Memory leak on subquery as scalar operand
Следующее
От: David Rowley
Дата:
Сообщение: Re: Strange sort node/explain result