Обсуждение: [HACKERS] Tightly packing expressions

Поиск
Список
Период
Сортировка

[HACKERS] Tightly packing expressions

От
Douglas Doole
Дата:
Andres,
Back when you were getting ready to commit your faster expressions, I volunteered to look at switching from an array of fixed sized steps to tightly packed structures. (https://www.postgresql.org/message-id/20170314221648.jrcgh5n7ld4ej2o7@alap3.anarazel.de). I've finally found time to make it happen.

Using tightly packed structures makes the expressions a lot smaller. I instrumented some before and after builds and ran "make check" just to see how much memory expressions were using. What I found was:

There were ~104K expressions.

Memory - bytes needed to hold the steps of the expressions
Allocated Memory - memory is allocated in blocks, this is the bytes allocated to hold the expressions
Wasted Memory - the difference between the allocated and used memory

Original code:
Memory: min=64 max=9984 total=28232512 average=265
Allocated Memory: min=1024 max=16384 total=111171584 average=1045
Wasted Memory: min=0 max=6400 total=82939072 average=780

New code:
Memory: min=32 (50%) max=8128 (82%) total=18266712 (65%) average=175 (66%)
Allocated Memory: min=192 (19%) max=9408 (57%) total=29386176 (26%) average=282 (27%)
Wasted Memory: min=0 (0%) max=1280 (20%) total=11119464 (13%) average=106 (14%)

So on average expressions are about a third smaller than with the fixed size array. The new approach doesn't overallocate as badly as the array approach so the packed approach cuts memory consumption by over 70%. (Of course, the array code could be tuned to reduce the overhead as well.)

Another benefit of the way I did this is that the expression memory is never reallocated. This means that it is safe for one step to point directly to a field in another step without needing to separately palloc storage. That should allow us to simplify the code and reduce pointer traversals. (I haven't exploited this in the patch. I figured it would be a task for round 2 assuming you like what I've done here.)

The work was mostly mechanical. The only tricky bit was dealing with the places where you jump to step n+1 while building step n. Since we can't tell the address of step n+1 until it is actually built, I had to defer resolution of the jumps. So all the interesting bits of this patch are in ExprEvalPushStep().

Let me know what you think.

- Doug
Salesforce
Вложения

Re: [HACKERS] Tightly packing expressions

От
Andres Freund
Дата:
Hi Doug,

On 2017-08-03 18:01:06 +0000, Douglas Doole wrote:
> Back when you were getting ready to commit your faster expressions, I
> volunteered to look at switching from an array of fixed sized steps to
> tightly packed structures. (
> https://www.postgresql.org/message-id/20170314221648.jrcgh5n7ld4ej2o7@alap3.anarazel.de).
> I've finally found time to make it happen.

Thanks for working on it!


> Using tightly packed structures makes the expressions a lot smaller. I
> instrumented some before and after builds and ran "make check" just to see
> how much memory expressions were using. What I found was:
> 
> There were ~104K expressions.
> 
> Memory - bytes needed to hold the steps of the expressions
> Allocated Memory - memory is allocated in blocks, this is the bytes
> allocated to hold the expressions
> Wasted Memory - the difference between the allocated and used memory
> 
> Original code:
> Memory: min=64 max=9984 total=28232512 average=265
> Allocated Memory: min=1024 max=16384 total=111171584 average=1045
> Wasted Memory: min=0 max=6400 total=82939072 average=780
> 
> New code:
> Memory: min=32 (50%) max=8128 (82%) total=18266712 (65%) average=175 (66%)
> Allocated Memory: min=192 (19%) max=9408 (57%) total=29386176 (26%)
> average=282 (27%)
> Wasted Memory: min=0 (0%) max=1280 (20%) total=11119464 (13%) average=106
> (14%)

That's actually not *that* big of a saving...


> Another benefit of the way I did this is that the expression memory is
> never reallocated. This means that it is safe for one step to point
> directly to a field in another step without needing to separately palloc
> storage. That should allow us to simplify the code and reduce pointer
> traversals. (I haven't exploited this in the patch. I figured it would be a
> task for round 2 assuming you like what I've done here.)

Yes, that's a neat benefit. Although I think it'd even better if we
could work to the point where the mutable and the unchanging data is
separately allocated, so we can at some point avoid redundant expression
"compilations".


> The work was mostly mechanical. The only tricky bit was dealing with the
> places where you jump to step n+1 while building step n. Since we can't
> tell the address of step n+1 until it is actually built, I had to defer
> resolution of the jumps. So all the interesting bits of this patch are in
> ExprEvalPushStep().

I was wondering about a more general "symbol resolution" stage
anyway. Then we'd allocate individual steps during ExecInitExprRec, and
allocate the linear array after we know the exact size.


I think it'd be important to see some performance measurements,
especially for larger queries. It'd not be too surprising if the
different method of dereffing the next expression actually has a
negative performance effect, but I'm absolutely not sure of that.

Regards,

Andres