Re: [EXTERNAL]Re: Refresh Materialized View Issue

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: [EXTERNAL]Re: Refresh Materialized View Issue
Дата
Msg-id f151126e-ffb8-4481-9b1f-8b43950b6b2f@gmail.com
обсуждение исходный текст
Ответ на Re: [EXTERNAL]Re: Refresh Materialized View Issue  (Jeremiah Bauer <jbauer@agristats.com>)
Список pgsql-general
On 1/12/24 6:34 AM, Jeremiah Bauer wrote:
> 
>     That'd be a band-aid at best, because we know that the query used to
>     define the materialized view runs in a reasonable amount of time on it's
>     own, as does a CTAS. So either the REFRESH is doing something odd when
>     writing into the new relation (which looking at the code seems very
>     unlikely), or REFRESH is getting a different query plan for some reason.
>     Unfortunately, I don't know of any easy way to get the query plan for
>     the REFRESH (it might be possible via gdb, but I'm not sure). We do at
>     least know that the REFRESH is using parallel workers.
> 
>     Can you post the output of EXPLAIN ANALYZE for the SELECT? That might
>     provide some clues.
> 
> Sure, here is the explain analyze for the select:
> 
> HashAggregate  (cost=123986470.20..129486707.63 rows=164493082 width=24) 
> (actual time=697250.385..741548.965 rows=59015171 loops=1)
>    Group Key: id1, id2, id3
>    Planned Partitions: 64  Batches: 65  Memory Usage: 328209kB  Disk 
> Usage: 6750176kB
>    ->  Gather  (cost=64653301.50..107228737.47 rows=328986164 width=24) 
> (actual time=357598.331..594226.355 rows=161151623 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  HashAggregate  (cost=64652301.50..74329121.07 
> rows=164493082 width=24) (actual time=357572.082..578038.457 
> rows=53717208 loops=3)
>                Group Key: id1, id2, id3
>                Planned Partitions: 64  Batches: 65  Memory Usage: 
> 328209kB  Disk Usage: 25774088kB
>                Worker 0:  Batches: 65  Memory Usage: 328209kB  Disk 
> Usage: 25375784kB
>                Worker 1:  Batches: 65  Memory Usage: 328209kB  Disk 
> Usage: 25382936kB
>                ->  Parallel Seq Scan on large_table 
>   (cost=0.00..29740358.40 rows=685387840 width=24) (actual 
> time=12.954..99596.289 rows=548310252 loops=3)
> Planning Time: 5.380 ms
> Execution Time: 745750.371 ms

Ok, so that's using a parallel query as well, just like REFRESH, so no 
help there.

At this point I think you'd need to do some OS-level investigation using 
a tool like perf to determine what the parent process is doing once the 
workers finish.

BTW, I did just come across a script[1] that will print the query plan 
for a running query. Based on what you've said I suspect it wouldn't 
tell us much here, but I wanted to mention it.

1: https://github.com/StarfishStorage/explain-running-query
-- 
Jim Nasby, Data Architect, Austin TX




В списке pgsql-general по дате отправления:

Предыдущее
От: Kiran K V
Дата:
Сообщение: Re: Issue with loading unicode characters with copy command
Следующее
От: David Ventimiglia
Дата:
Сообщение: How to redirect output from PostgreSQL pg_recvlogical to a file or a pipe?