Re: Death postgres

Поиск
Список
Период
Сортировка
От Marc Millas
Тема Re: Death postgres
Дата
Msg-id CADX_1aZTekarfBzFOK-3v_eEWsyrqU=bkfzMxH3Xhj6=wjLN-w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Death postgres  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Ответы Re: Death postgres
Список pgsql-general



On Thu, May 11, 2023 at 1:56 AM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
On 2023-05-10 22:52:47 +0200, Marc Millas wrote:
> On Wed, May 10, 2023 at 7:24 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
>     On 2023-05-10 16:35:04 +0200, Marc Millas wrote:
>     >  Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=
>     97)
>     >    ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=
>     1021522829864 width=97)
>     ...
>     >                ->  Parallel Hash Left Join  (cost=
>     604502.76..1276224253.51 rows=204304565973 width=97)
>     >                      Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
>     ...
>     >
>     > //so.. the planner guess that those 2 join will generate 1000 billions
>     rows...
>
>     Are some of the col_ano values very frequent? If say the value 42 occurs
>     1 million times in both table_a and table_b, the join will create 1
>     trillion rows for that value alone. That doesn't explain the crash or the
>     disk usage, but it would explain the crazy cost (and would probably be a
>     hint that this query is unlikely to finish in any reasonable time).
>
>
> good guess, even if a bit surprising: there is one (and only one) "value" which
> fit your supposition: NULL

But NULL doesn't equal NULL, so that would result in only one row in the
left join. So that's not it.


so, apo...

the 750000 lines in each tables are not NULLs but '' empty varchar, which, obviously is not the same thing.
and which perfectly generates 500 billions lines for the left join.
So, no planner or statistics pbs. apologies for the time wasted.
Back to the initial pb:
if, with temp_file_limit positioned to 210 GB, I try to run the select * from table_a left join table_b on the col_a (which contains the 750000 '' on both tables)
then postgres do crash, killed by oom, after having taken 1.1 TB of additional disk space. 
the explain plan guess 512 planned partitions. (obviously, I cannot provide an explain analyze...)

to my understanding, before postgres 13, hash aggregate did eat RAM limitless in such circumstances.
but in 14.2 ??
(I know, 14.8 is up...)


        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"



Marc MILLAS
Senior Architect
+33607850334


 

В списке pgsql-general по дате отправления:

Предыдущее
От: Marc Millas
Дата:
Сообщение: Re: gather merge
Следующее
От: Ron
Дата:
Сообщение: Re: order by