Re: Spilling hashed SetOps and aggregates to disk

From David Rowley
Subject Re: Spilling hashed SetOps and aggregates to disk
Msg-id CAKJS1f_yPD8M0SPSAqHBvf-SpXQu_CxsYDb-aV3k4mk1nzkFFw@mail.gmail.com
In reply to Re: Spilling hashed SetOps and aggregates to disk  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On 6 June 2018 at 01:17, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 6 June 2018 at 01:09, Andres Freund <andres@anarazel.de> wrote:
>> On 2018-06-06 01:06:39 +1200, David Rowley wrote:
>>> My concern is that only accounting memory for the group and not the
>>> state is only solving half the problem. It might be fine for
>>> aggregates that don't stray far from their aggtransspace, but for the
>>> other ones, we could still see OOM.
>>
>>> If solving the problem completely is too hard, then a half fix (maybe
>>> 3/4) is better than nothing, but if we can get a design for a full fix
>>> before too much work is done, then isn't that better?
>>
>> I don't think we actually disagree.  I was really primarily talking
>> about the case where we can't really do better because we don't have
>> serialization support.  I mean we could just rescan from scratch, using
>> a groupagg, but that obviously sucks.
>
> I don't think we do. To take yours to the 100% solution might just
> take adding the memory accounting to palloc that Jeff proposed a few
> years ago and use that accounting to decide when we should switch
> method.
>
> However, I don't quite fully recall how the patch accounted for memory
> consumed by sub-contexts and if getting the entire consumption
> required recursively looking at subcontexts. If that's the case then
> checking the consumption would likely cost too much if it was done
> after each tuple was aggregated.

I wonder if the whole internal state memory accounting problem could
be solved by just adding an aggregate supporting function for internal
state aggregates that returns the number of bytes consumed by the
state. It might be good enough to fall back on aggtransspace when the
function is not defined. Such a function would be about 3 lines long
for string_agg and array_agg, and these are the problem aggregates.
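
To make the idea concrete, here is a rough standalone sketch, not actual
PostgreSQL code. Everything in it apart from the name aggtransspace is
hypothetical: the StringAggState struct, the agg_state_size_fn signature,
the AggDef struct and its state_size field are all made up for
illustration, and size_t is used for the estimate purely for simplicity.
It only shows the shape of "ask the aggregate for its state size if it
can tell us, otherwise fall back on aggtransspace".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a string_agg-style transition state,
 * loosely modelled on a growable string buffer.
 */
typedef struct StringAggState
{
    char       *data;       /* buffer holding the concatenated values */
    size_t      len;        /* bytes currently used */
    size_t      maxlen;     /* bytes currently allocated for data */
} StringAggState;

/* Hypothetical support-function signature: bytes consumed by one state. */
typedef size_t (*agg_state_size_fn) (const void *state);

/* Hypothetical slice of an aggregate's definition. */
typedef struct AggDef
{
    size_t      aggtransspace;      /* static per-group estimate, in bytes */
    agg_state_size_fn state_size;   /* NULL when no function is defined */
} AggDef;

/* The "about 3 lines" function for a string_agg-like state. */
static size_t
string_agg_state_size(const void *state)
{
    const StringAggState *s = (const StringAggState *) state;

    return sizeof(StringAggState) + s->maxlen;
}

/*
 * Use the aggregate's own answer when it can report its size;
 * otherwise fall back on the static aggtransspace estimate.
 */
static size_t
agg_state_bytes(const AggDef *agg, const void *state)
{
    assert(agg != NULL);

    if (agg->state_size != NULL && state != NULL)
        return agg->state_size(state);
    return agg->aggtransspace;
}
```

The caller would add agg_state_bytes() for each group to whatever it
already accounts for the group itself. Reporting maxlen rather than len
is a deliberate choice in this sketch, since the allocated size is what
actually counts against memory.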


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

