Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations

From: John Naylor
Subject: Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations
Msg-id: CAFBsxsEeN2go4+ok00HV4Zx7Sr6OMpZ2-iQr+szFxprVfs7y0A@mail.gmail.com
In reply to: Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

On Tue, Feb 21, 2023 at 2:46 AM Andres Freund <andres@anarazel.de> wrote:

> On 2023-02-21 08:33:22 +1300, David Rowley wrote:
> > I am interested in a bump allocator for tuplesort.c. There it would be
> > used in isolation and all the code which would touch pointers
> > allocated by the bump allocator would be self-contained to the
> > tuplesorting code.
> >
> > What use case do you have in mind?
>
> E.g. the whole executor state tree (and likely also the plan tree) should be
> allocated that way. They're never individually freed. But we also allocate
> other things in the same context, and those do need to be individually
> freeable. We could use a separate memory context, but that'd increase memory
> usage in many cases, because there'd be two different blocks being allocated
> from at the same time.

That reminds me of this thread I recently stumbled across about memory management of prepared statements:

https://www.postgresql.org/message-id/20190726004124.prcb55bp43537vyw%40alap3.anarazel.de

I recently heard of a technique for relative pointers that could enable tree structures within a single allocation.

If "a" needs to store the location of "b" relative to "a", it would be calculated like

a = (char *) &b - (char *) &a;

...then to find b again, do

typeof_b* b_ptr;
b_ptr = (typeof_b* ) ((char *) &a + a);

One issue with this naive sketch is that a stored zero would mean "points to itself", and it would be better if zero still meant "invalid pointer" so that memset(0) does the right thing.

Using signed byte-sized offsets as an example, the range is -128 to 127, so we can call -128 the invalid pointer, or in binary 0b1000_0000.

To interpret a raw zero as invalid, we need an encoding, and here we can just XOR it:

#define Encode(a) ((int8) ((a) ^ 0x80))
#define Decode(a) ((int8) ((a) ^ 0x80))

Then, Encode(-128) == 0 and Decode(0) == -128, so memset(0) will do the right thing: the zeroed value will be decoded as invalid.

Conversely, this preserves the ability to point to self, if needed:

Encode(0) == -128 and Decode(-128) == 0

...so we can store any relative offset in the range -127..127, as well as "invalid offset". This extends to larger signed integer types in the obvious way.
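
As a rough illustration of the byte-sized case, here is a minimal sketch in standalone C. The names rel8_encode, rel8_decode, rel8_from_diff and REL8_INVALID are made up for this example and are not existing PostgreSQL functions. It also assumes the usual two's complement wraparound when converting an out-of-range value back to int8_t, which is what GCC and Clang do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only, not PostgreSQL APIs. */
#define REL8_INVALID INT8_MIN	/* decoded value that means "no target" */

/* Flip the sign bit, so the decoded value -128 is stored as a zero byte. */
static inline int8_t
rel8_encode(int8_t off)
{
	return (int8_t) ((uint8_t) off ^ 0x80);
}

/* XOR is its own inverse, so decoding is the same operation. */
static inline int8_t
rel8_decode(int8_t stored)
{
	return (int8_t) ((uint8_t) stored ^ 0x80);
}

/* Encode a byte distance; only -127..127 is representable as an offset. */
static inline int8_t
rel8_from_diff(ptrdiff_t diff)
{
	assert(diff >= -127 && diff <= 127);
	return rel8_encode((int8_t) diff);
}
```

With these helpers, a slot cleared by memset(0) decodes to REL8_INVALID, while an offset of zero (pointing to self) is stored as the byte 0x80.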

Putting the above two calculations together, the math ends up like this, which can be put into macros:

absolute to relative:
a = Encode((int32) ((char *) &b - (char *) &a));

relative to absolute:
typeof_b* b_ptr;
b_ptr = (typeof_b* ) ((char *) &a + Decode(a));
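
To make the 32-bit version concrete, here is a sketch of how the two conversions might look as inline functions rather than macros. The names relptr32, relptr32_set, relptr32_get and the Node struct are invented for this example; nothing like them exists in the tree as far as I know. It assumes that the field and its target live in the same allocation (so the pointer subtraction is well defined) and that the distance fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: an encoded 32-bit self-relative offset, 0 == invalid. */
typedef int32_t relptr32;

#define REL32_SIGN UINT32_C(0x80000000)

/* absolute to relative: store where target is, relative to the field */
static inline void
relptr32_set(relptr32 *field, const void *target)
{
	ptrdiff_t	diff;

	if (target == NULL)
	{
		*field = 0;				/* encoded form of "invalid" */
		return;
	}
	diff = (const char *) target - (const char *) field;
	/* INT32_MIN itself is reserved as the decoded "invalid" value */
	assert(diff > INT32_MIN && diff <= INT32_MAX);
	*field = (int32_t) ((uint32_t) (int32_t) diff ^ REL32_SIGN);
}

/* relative to absolute: returns NULL for the invalid encoding */
static inline void *
relptr32_get(relptr32 *field)
{
	int32_t		off = (int32_t) ((uint32_t) *field ^ REL32_SIGN);

	if (off == INT32_MIN)
		return NULL;
	return (char *) field + off;
}

/* Example payload: a list node that links to a sibling in the same buffer. */
typedef struct Node
{
	int			value;
	relptr32	next;
} Node;
```

Because each link is relative to the field that holds it, a buffer of such nodes can be copied with memcpy to another address and the links still resolve to the copied nodes, which is the property that would make a tree inside a single allocation relocatable.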

I'm not yet familiar enough with parse/plan/execute trees to know if this would work or not, but that might be a good thing to look into next cycle.

--
John Naylor
EDB: http://www.enterprisedb.com
