Re: [BUGS] json(b)_array_elements use causes very large memory usagewhen also referencing entire json document

Поиск
Список
Период
Сортировка
От Lucas Fairchild-Madar
Тема Re: [BUGS] json(b)_array_elements use causes very large memory usagewhen also referencing entire json document
Дата
Msg-id CAJmoq7PdrJFteGWtQFyb6fBzbZft2mkG50=RjZNOtGyNRW+=5g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [BUGS] json(b)_array_elements use causes very large memory usage when also referencing entire json document  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [BUGS] json(b)_array_elements use causes very large memory usage when also referencing entire json document
Список pgsql-bugs
On Fri, Oct 6, 2017 at 12:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lucas Fairchild-Madar <lucas.madar@gmail.com> writes:
> Sure, here's a bash script. Assumes you have PGDATABASE, etc set.
> Verified this also blows up on pg 10. I haven't verified the new patch.

Oh, I see the problem.  This is a different animal, because it's actually
an intra-row leak, as it were.  What you've got is

select data->'id', jsonb_array_elements(data->'items') from kaboom;

where the SRF jsonb_array_elements() emits a lot of values.  For each
of those values, data->'id' gets evaluated over again, and we can't
reclaim memory in the per-tuple context until we've finished the whole
cycle for the current row of "kaboom".  So a leak would occur in any
case ... but it's particularly awful in this case, because data->'id'
involves detoasting the rather wide value of "data", which is then
promptly leaked.  So the total memory consumption is more or less
proportional to O(N^2) in the length of "data".

This has been like this since forever, and it's probably impractical
to do anything about it pre-v10, given the unstructured way that
targetlist SRFs are handled.  You could dodge the problem by moving
the SRF to a lateral FROM item:

select data->'id', ja
from kaboom, lateral jsonb_array_elements(data->'items') as ja;

(The LATERAL keyword is optional here, but I like it because it
makes it clearer what's happening.)

As of v10, it might be possible to fix this for the tlist case
as well, by doing something like using a separate short-lived
context for the non-SRF tlist items.

 
Is there any sort of setting right now that can defend against this? A way to prevent a query from using 20+GB of memory? I'd prefer the query fail before the database system is kill -9'd.

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [BUGS] json(b)_array_elements use causes very large memory usage when also referencing entire json document
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [BUGS] json(b)_array_elements use causes very large memory usage when also referencing entire json document