Re: PATCH: decreasing memory needlessly consumed by array_agg

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: PATCH: decreasing memory needlessly consumed by array_agg
Дата
Msg-id 54B6E53A.5040808@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: PATCH: decreasing memory needlessly consumed by array_agg  (Ali Akbar <the.apaan@gmail.com>)
Ответы Re: PATCH: decreasing memory needlessly consumed by array_agg  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
On 12.1.2015 01:28, Ali Akbar wrote:
>
>     >     Or else we implement what you suggest below (more comments below):
>     >
>     >         Thinking about the 'release' flag a bit more - maybe we
>     could do
>     >         this
>     >         instead:
>     >
>     >             if (release && astate->private_cxt)
>     >                 MemoryContextDelete(astate->mcontext);
>     >             else if (release)
>     >             {
>     >                 pfree(astate->dvalues);
>     >                 pfree(astate->dnulls);
>     >                 pfree(astate);
>     >             }
>     >
>     >         i.e. either destroy the whole context if possible, and
>     just free the
>     >         memory when using a shared memory context. But I'm afraid
>     this would
>     >         penalize the shared memory context, because that's
>     intended for
>     >         cases where all the build states coexist in parallel and
>     then at some
>     >         point are all converted into a result and thrown away.
>     Adding pfree()
>     >         calls is no improvement here, and just wastes cycles.
>     >
>     >
>     >     As per Tom's comment, i'm using "parent memory context" instead of
>     >     "shared memory context" below.
>     >
>     >     In the future, if some code writer decided to use
>     subcontext=false,
>     >     to save memory in cases where there are many array
>     accumulation, and
>     >     the parent memory context is long-living, current code can cause
>     >     memory leak. So i think we should implement your suggestion
>     >     (pfreeing astate), and warn the implication in the API
>     comment. The
>     >     API user must choose between release=true, wasting cycles but
>     >     preventing memory leak, or managing memory from the parent memory
>     >     context.
>
>     I'm wondering whether this is necessary after fixing makeArrayResult to
>     use the privat_cxt flag. It's still possible to call makeMdArrayResult
>     directly (with the wrong 'release' value).
>
>     Another option might be to get rid of the 'release' flag altogether, and
>     just use the 'private_cxt' - I'm not aware of a code using release=false
>     with private_cxt=true (e.g. to build the same array twice from the same
>     astate). But maybe there's such code, and another downside is that it'd
>     break the existing API.
>
>     >     In one possible use case, for efficiency maybe the caller will
>     >     create a special parent memory context for all array accumulation.
>     >     Then uses makeArrayResult* with release=false, and in the end
>     >     releasing the parent memory context once for all.
>
>     Yeah, although I'd much rather not break the existing code at all. That
>     is - my goal is not to make it slower unless absolutely necessary (and
>     in that case the code may be fixed per your suggestion). But I'm not
>     convinced it's worth it.
>
>
> OK. Do you think we need to note this in the comments? Something like
> this: If using subcontext=false, the caller must be careful about memory
> usage, because makeArrayResult* will not free the memory used.

Yes, I think it's worth mentioning.

>     But I think it makes sense to move the error handling into
>     initArrayResultArr(), including the get_element_type() call, and remove
>     the element_type from the signature. This means initArrayResultAny()
>     will call the get_element_type() twice, but I guess that's negligible.
>     And everyone who calls initArrayResultArr() will get the error handling
>     for free.
>
>     Patch v7 attached, implementing those two changes, i.e.
>
>       * makeMdArrayResult(..., astate->private_cxt)
>       * move error handling into initArrayResultArr()
>       * remove element_type from initArrayResultArr() signature
>
>
> Reviewing the v7 patch:
> - applies cleanly to current master. patch format, whitespace, etc is good
> - make check runs without error
> - performance & memory usage still consistent
>
> If you think we don't have to add the comment (see above), i'll mark
> this as ready for committer

Attached is v8 patch, with a few comments added:

1) before initArrayResult() - explaining when it's better to use a
   single memory context, and when it's more efficient to use a
   separate memory context for each array build state

2) before makeArrayResult() - explaining that it won't free memory
   when allocated in a single memory context (and that a pfree()
   has to be used if necessary)

3) before makeMdArrayResult() - explaining that it's illegal to use
   release=true unless using a subcontext

Otherwise the v8 patch is exactly the same as v7. Assuming the comments
make it sufficiently clear, I agree with marking this as 'ready for
committer'.

regards

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: orangutan seizes up during isolation-check
Следующее
От: Robert Haas
Дата:
Сообщение: Re: pg_rewind in contrib