Variable-length FunctionCallInfoData

From: Andres Freund
Subject: Variable-length FunctionCallInfoData
Date:
Msg-id: 20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
Responses: RE: Variable-length FunctionCallInfoData  (serge@rielau.com)
Re: Variable-length FunctionCallInfoData  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Re: Variable-length FunctionCallInfoData  (Andres Freund <andres@anarazel.de>)
Re: Variable-length FunctionCallInfoData  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
Hi,

While prototyping codegen improvements for JITed expression evaluation,
I once more hit the issue that FunctionCallInfoData structs are really
large (936 bytes), despite arguments beyond the fourth barely ever
being used.  I think we should fix that.

What I think we should do is convert FunctionCallInfoData's arg and
argnull arrays into a single array of NullableDatum (a new type, a
struct of Datum and bool), and make that a variable-length array sized
to the actual number of arguments.  In the super common case of two
arguments that reduces the size of the struct from 936 to 64 bytes.
Besides the size reduction this also noticeably reduces the number of
cachelines accessed - before, a given argument's value and null flag
are guaranteed not to be on the same cacheline; afterwards they almost
always are.
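
For reference, here's a rough sketch of what that layout means.  The
fixed fields are today's FunctionCallInfoData fields; the args member
and the size macro just follow the description above, and aren't
necessarily what the prototype patch does verbatim:

    /* New type: a Datum together with its null flag. */
    typedef struct NullableDatum
    {
        Datum       value;
        bool        isnull;
    } NullableDatum;        /* 16 bytes with padding on 64-bit platforms */

    typedef struct FunctionCallInfoData
    {
        FmgrInfo   *flinfo;         /* ptr to lookup info used for this call */
        fmNodePtr   context;        /* pass info about context of call */
        fmNodePtr   resultinfo;     /* pass or return extra info about result */
        Oid         fncollation;    /* collation for function to use */
        bool        isnull;         /* function must set true if result is NULL */
        short       nargs;          /* # arguments actually passed */
        NullableDatum args[FLEXIBLE_ARRAY_MEMBER]; /* replaces the old
                                     * Datum arg[FUNC_MAX_ARGS] and
                                     * bool argnull[FUNC_MAX_ARGS] arrays */
    } FunctionCallInfoData;

    /*
     * Space needed for nargs arguments: ~32 bytes of fixed fields plus
     * 16 bytes per NullableDatum, i.e. 64 bytes for two arguments,
     * versus the constant 936 bytes of the fixed-size layout.
     */
    #define SizeForFunctionCallInfoData(nargs) \
        (offsetof(FunctionCallInfoData, args) + (nargs) * sizeof(NullableDatum))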

Attached is a *PROTOTYPE* patch doing so.  Note I was too lazy to fully
fix up the JIT code; I didn't want to do that legwork before we have
some agreement on this.  We could also get rid of FUNC_MAX_ARGS after
this, but there's surrounding code that still relies on it.

There's some added ugliness, which I hope we can polish a bit further.
Right now we allocate a good number of FunctionCallInfoData structs on
the stack - which doesn't quite work anymore with a variable-length
struct.  So the stack allocation, for the majority of cases where the
argument count is known, currently looks like:

    union {
        FunctionCallInfoData fcinfo;
        /* reserve enough space for a zero-argument call */
        char fcinfo_data[SizeForFunctionCallInfoData(0)];
    } fcinfodata;
    FunctionCallInfo fcinfo = &fcinfodata.fcinfo;

that's not pretty, but also not that bad.
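
If we want to hide that boilerplate, a small declaration macro would
do; a minimal sketch, with an invented name that's not part of the
prototype:

    /*
     * Illustrative only: declare an on-stack FunctionCallInfo with room
     * for "nargs" arguments, wrapping the union trick shown above.
     */
    #define DECLARE_FCINFO(name, nargs) \
        union \
        { \
            FunctionCallInfoData fcinfo; \
            char fcinfo_data[SizeForFunctionCallInfoData(nargs)]; \
        } name##_data; \
        FunctionCallInfo name = &name##_data.fcinfo

Call sites would then just write DECLARE_FCINFO(fcinfo, 2); and use
fcinfo as before.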

It's a bit unfortunate that this'll break some extensions, but I don't
really see a way around that.  The current approach, to me, clearly
doesn't have a future.  I wonder if we should add a bunch of accessor
macros / inline functions that we (or extension authors) can backpatch
to reduce the pain of maintaining different code paths.
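
To illustrate what such accessors could look like - the names and the
feature test below are purely hypothetical, nothing like this exists
yet:

    /*
     * Hypothetical compatibility wrappers.  If these were available in
     * all branches, extensions could use them instead of touching the
     * argument fields directly, and compile unchanged against both the
     * old and the new layout.
     */
    #ifdef HAVE_VARLEN_FCINFO               /* new layout (assumed symbol) */
    #define FC_ARG(fcinfo, n)       ((fcinfo)->args[n].value)
    #define FC_ARGISNULL(fcinfo, n) ((fcinfo)->args[n].isnull)
    #else                                   /* old fixed-size arrays */
    #define FC_ARG(fcinfo, n)       ((fcinfo)->arg[n])
    #define FC_ARGISNULL(fcinfo, n) ((fcinfo)->argnull[n])
    #endif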

Besides the change here, I think we should also go much further with
the conversion to NullableDatum.  There are two main areas of change:
First, I want to change the execExpr.c related code so that expression
steps return their results as NullableDatums - that removes a good
chunk of pointer dereferences and allocations.  Second, I think we
should move TupleTableSlot to this format - the issue with nulls and
datums being on separate cachelines is noticeable in profiles, but more
importantly the code looks more consistent with it.
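
To sketch the TupleTableSlot side of that - purely illustrative, with
most fields elided:

    /*
     * Today's slot keeps tts_values (Datum *) and tts_isnull (bool *) as
     * two separate arrays, so a column's value and its null flag live on
     * different cachelines; with NullableDatum they'd always be adjacent.
     */
    typedef struct TupleTableSlot
    {
        NodeTag     type;
        /* ... flags, tuple storage and other fields elided ... */
        TupleDesc   tts_tupleDescriptor;    /* slot's tuple descriptor */
        NullableDatum *tts_values;          /* per-column value + null flag */
    } TupleTableSlot;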


As an example for the difference in memory usage, here's the memory
consumption at ExecutorRun time, for TPCH's Q01:

master:
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
  PortalContext: 1024 total in 1 blocks; 576 free (0 chunks); 448 used:
    ExecutorState: 90744 total in 5 blocks; 31568 free (2 chunks); 59176 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 3488 free (0 chunks); 4704 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Grand total: 149112 bytes in 13 blocks; 82976 free (2 chunks); 66136 used

patch:
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
  PortalContext: 1024 total in 1 blocks; 576 free (0 chunks); 448 used:
    ExecutorState: 65536 total in 4 blocks; 33536 free (6 chunks); 32000 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 5408 free (0 chunks); 2784 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
      ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Grand total: 123904 bytes in 12 blocks; 86864 free (6 chunks); 37040 used

As you can see, the ExecutorState context uses about half as much
memory as before.  In a lot of cases a good chunk of the benefit is
going to be hidden by memory context sizing, but I'd expect that to
matter much less for more complex statements, plpgsql functions etc.

Comments?

Greetings,

Andres Freund

Attachments
