Thread: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

[PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
Hi,

While looking at binary COPY performance I forgot to add BINARY and was a bit 
shocked to see printf that high in the profile...

Setup:

CREATE TABLE convtest AS SELECT a.i ai, b.i bi, a.i*b.i aibi, (a.i*b.i)::text 
aibit FROM generate_series(1,1000) a(i), generate_series(1, 10000) b(i);

Profile with an unmodified pg:

speedtest=# COPY convtest(ai,bi,aibi) TO '/dev/null';
COPY 10000000
Time: 9192.476 ms

Profile:
# Events: 9K cycles
#
# Overhead          Command      Shared Object                        Symbol
# ........  ...............  .................  ............................
#
    18.24%  postgres_oldint  libc-2.12.1.so     [.] __GI_vfprintf
     8.90%  postgres_oldint  libc-2.12.1.so     [.] _itoa_word
     8.77%  postgres_oldint  postgres_oldint    [.] CopyOneRowTo
     8.19%  postgres_oldint  libc-2.12.1.so     [.] _IO_default_xsputn_internal
     3.67%  postgres_oldint  postgres_oldint    [.] AllocSetAlloc
     3.38%  postgres_oldint  libc-2.12.1.so     [.] __strchrnul
     3.24%  postgres_oldint  libc-2.12.1.so     [.] __GI___vsprintf_chk
     2.87%  postgres_oldint  postgres_oldint    [.] heap_deform_tuple
     2.49%  postgres_oldint  libc-2.12.1.so     [.] _IO_old_init
     2.25%  postgres_oldint  libc-2.12.1.so     [.] _IO_new_file_xsputn
     2.03%  postgres_oldint  postgres_oldint    [.] appendBinaryStringInfo
     1.89%  postgres_oldint  postgres_oldint    [.] heapgettup_pagemode
     1.86%  postgres_oldint  postgres_oldint    [.] FunctionCall1
     1.85%  postgres_oldint  postgres_oldint    [.] AllocSetCheck
     1.79%  postgres_oldint  postgres_oldint    [.] enlargeStringInfo

Timing after replacing those sprintf("%li", ...) calls with a quickly coded 
handrolled itoa:

speedtest=# COPY convtest(ai,bi,aibi) TO '/dev/null';
COPY 10000000
Time: 5309.928 ms

Profile:
# Events: 5K cycles
#
# Overhead   Command      Shared Object                       Symbol
# ........  ........  .................  ...........................
#
    14.96%  postgres  postgres           [.] pg_s32toa
    14.75%  postgres  postgres           [.] CopyOneRowTo
     5.97%  postgres  postgres           [.] AllocSetAlloc
     4.73%  postgres  postgres           [.] heap_deform_tuple
     4.54%  postgres  postgres           [.] AllocSetCheck
     4.01%  postgres  libc-2.12.1.so     [.] _IO_new_file_xsputn
     3.59%  postgres  postgres           [.] heapgettup_pagemode
     3.32%  postgres  postgres           [.] enlargeStringInfo
     3.25%  postgres  postgres           [.] appendBinaryStringInfo
     2.87%  postgres  postgres           [.] CopySendChar
     2.65%  postgres  postgres           [.] FunctionCall1
     2.44%  postgres  postgres           [.] int4out
     2.38%  postgres  [kernel.kallsyms]  [k] copy_user_generic_string
     2.30%  postgres  postgres           [.] AllocSetReset
     2.06%  postgres  postgres           [.] pg_server_to_client
     1.89%  postgres  libc-2.12.1.so     [.] __GI_memset
     1.87%  postgres  libc-2.12.1.so     [.] memcpy


A change from 9192.476ms to 5309.928ms seems to be a pretty good indication that a 
change in that area is warranted, given that integer columns are quite ubiquitous...

While at it:

* I removed the outdated
-- NOTE: int[24] operators never check for over/underflow!
-- Some of these answers are consequently numerically incorrect.
warnings in the regression tests.

* I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not 
sure if it's worth it.

* I added some tests for the border cases of 2^31-1 / -2^31

The 'after' profile shows obvious room for further improvement, but on a quick 
look I couldn't think of anything. Any ideas?



Andres


PS: Oh, that's with assertions enabled, but the results are comparable without them 
(8765.796ms vs 4561.673ms)
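
For readers without the attachment, the kind of hand-rolled conversion described above can be sketched roughly as follows. This is not the code from the patch: the function name `sketch_int32_to_text` is hypothetical, and the sketch only assumes what the thread states, namely that digits are generated in reverse order and then swapped in place, with the most negative value handled separately.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the patch's int32 conversion routine.
 * buf must have room for 12 bytes: sign, 10 digits, terminating NUL.
 */
void
sketch_int32_to_text(int32_t value, char *buf)
{
    char       *start;
    char       *end;

    assert(buf != NULL);

    /* The most negative value cannot be negated, so emit it directly. */
    if (value == (-2147483647 - 1))
    {
        memcpy(buf, "-2147483648", 12);
        return;
    }

    if (value < 0)
    {
        *buf++ = '-';
        value = -value;
    }
    start = buf;

    /* Emit digits least significant first. */
    do
    {
        *buf++ = (char) ('0' + value % 10);
        value /= 10;
    } while (value != 0);

    *buf = '\0';
    end = buf - 1;

    /* Reverse the digits in place; the sign and the NUL stay where they are. */
    while (start < end)
    {
        char        swap = *start;

        *start++ = *end;
        *end-- = swap;
    }
}
```

Compared with sprintf, this avoids format-string parsing and the stdio machinery that dominates the first profile, at the price of requiring the caller to supply a sufficiently large buffer.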

Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Itagaki Takahiro
Date:
On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund <andres@anarazel.de> wrote:
> While looking at binary COPY performance I forgot to add BINARY and was a bit
> shocked to see printf that high in the profile...
>
> A change from 9192.476ms to 5309.928ms seems to be a pretty good indication that a
> change in that area is warranted, given that integer columns are quite ubiquitous...

Good optimization. Here is the result on my machine:
* before: 13057.190 ms, 12429.092 ms, 12622.374 ms
* after: 8261.688 ms, 8427.024 ms, 8622.370 ms

> * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not
> sure if its worth it.

Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than 's'.
See also http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx

I have a couple of questions and comments:

* Why did you change "MAXINT8LEN + 1" to "+ 2"? Is there a possibility of a buffer overflow in the current code?
@@ -158,12 +159,9 @@ int8out(PG_FUNCTION_ARGS)
-    char        buf[MAXINT8LEN + 1];
+    char        buf[MAXINT8LEN + 2];

* The buffer reordering seems a bit messy.
//have to reorder the string, but not 0byte.
I'd suggest to fill a fixed-size local buffer from right to left
and copy it to the actual output.

* C++-style comments should be cleaned up.

-- 
Itagaki Takahiro
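
The right-to-left suggestion in this message can be sketched as below. The name `sketch_int32_rtl` and the buffer-length macro are hypothetical, and working on an unsigned magnitude (so that the most negative value needs no special case) is a choice made for this sketch, not something proposed in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_INT32_BUFLEN 12  /* sign + 10 digits + NUL */

/* out must have room for SKETCH_INT32_BUFLEN bytes. */
void
sketch_int32_rtl(int32_t value, char *out)
{
    char        tmp[SKETCH_INT32_BUFLEN];
    char       *p = tmp + sizeof(tmp) - 1;
    /* Unsigned magnitude: well defined even for the most negative value. */
    uint32_t    mag = (value < 0) ? 0u - (uint32_t) value : (uint32_t) value;

    assert(out != NULL);

    /* Fill the local buffer from its last byte towards the front. */
    *p = '\0';
    do
    {
        *--p = (char) ('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);

    if (value < 0)
        *--p = '-';

    /* Copy sign, digits and the NUL to the caller's buffer in one go. */
    memcpy(out, p, (size_t) (tmp + sizeof(tmp) - p));
}
```

No reversal step is needed, but every conversion pays for one extra copy out of the local buffer.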


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Sun, Oct 31, 2010 at 11:04 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund <andres@anarazel.de> wrote:
>> While looking at binary COPY performance I forgot to add BINARY and was a bit
>> shocked to see printf that high in the profile...
>>
>> A change from 9192.476ms to 5309.928ms seems to be a pretty good indication that a
>> change in that area is warranted, given that integer columns are quite ubiquitous...
>
> Good optimization. Here is the result on my machine:
> * before: 13057.190 ms, 12429.092 ms, 12622.374 ms
> * after: 8261.688 ms, 8427.024 ms, 8622.370 ms

Wow.  Nice stuff, Andres!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Monday 01 November 2010 04:04:51 Itagaki Takahiro wrote:
> On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund <andres@anarazel.de> wrote:
> > While looking at binary COPY performance I forgot to add BINARY and was a
> > bit shocked to see printf that high in the profile...
> > 
> > A change from 9192.476ms to 5309.928ms seems to be a pretty good indication
> > that a change in that area is warranted, given that integer columns are quite
> > ubiquitous...
> 
> Good optimization. Here is the result on my machine:
> * before: 13057.190 ms, 12429.092 ms, 12622.374 ms
> * after: 8261.688 ms, 8427.024 ms, 8622.370 ms
Thanks.

> > * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
> > confusing. Not sure if its worth it.
> 
> Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than
> 's'. See also
> http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx
I find itoa not as clear about signedness as stoa, but if you insist, I don't 
feel strongly about it.

> I have a couple of questions and comments:
> 
> * Why did you change "MAXINT8LEN + 1" to "+ 2" ?
>   Are there possibility of buffer overflow in the current code?
> @@ -158,12 +159,9 @@ int8out(PG_FUNCTION_ARGS)
> -    char        buf[MAXINT8LEN + 1];
> +    char        buf[MAXINT8LEN + 2];
Argh. That should have never gotten into the patch. I was playing around with 
another optimization which would have needed more buffer space (but was quite a  
bit slower).

> * The buffer reordering seems a bit messy.
> //have to reorder the string, but not 0byte.
> I'd suggest to fill a fixed-size local buffer from right to left
> and copy it to the actual output.
Hm.

    while (bufstart < buf)
    {
        char swap = *bufstart;

        *bufstart++ = *buf;
        *buf-- = swap;
    }

is a bit cleaner maybe, but I don't see much point in putting it into its own 
function... But again, I don't feel strongly.


> * C++-style comments should be cleaned up.
Will clean up.

Andres


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
Hi, 
On Monday 01 November 2010 10:15:01 Andres Freund wrote:
> On Monday 01 November 2010 04:04:51 Itagaki Takahiro wrote:
> > On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund <andres@anarazel.de> wrote:
> > > While looking at binary COPY performance I forgot to add BINARY and was
> > > a bit shocked to see printf that high in the profile...
> > > 
> > > A change from 9192.476ms to 5309.928ms seems to be a pretty good indication
> > > that a change in that area is warranted, given that integer columns are quite
> > > ubiquitous...
> > > * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
> > > confusing. Not sure if its worth it.
> > Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than
> > 's'. See also
> > http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx
> I find itoa not as clear about signedness as stoa, but if you insist, I
> dont feel strongly about it.
Let whoever commits it decide...

> > * The buffer reordering seems a bit messy.
> > //have to reorder the string, but not 0byte.
> > I'd suggest to fill a fixed-size local buffer from right to left
> > and copy it to the actual output.
> Is a bit cleaner maybe, but I dont see much point in putting it into its
> own function... But again, I don't feel strongly.
Using a separate buffer cost nearly 500ms... So I only changed the comments 
there.

The only way I could think of to make it faster was to fill the buffer from the 
end and then return a pointer to the starting point in the buffer. The speed 
benefits are small (around 80ms) and it makes the interface more cumbersome...


Revised version attached - I will submit this to the next commitfest now.

Andres
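
The alternative mentioned above (fill the buffer from the end and return a pointer to the first character) might look like the sketch below. The name `sketch_int32_from_end` and its signature are hypothetical; the explicit `buflen` parameter and the unsigned-magnitude trick are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Writes the text form of value at the END of buf and returns a pointer to
 * its first character.  The result generally does not start at buf itself.
 * buflen must be at least 12 (sign + 10 digits + NUL).
 */
char *
sketch_int32_from_end(int32_t value, char *buf, size_t buflen)
{
    char       *p;
    uint32_t    mag = (value < 0) ? 0u - (uint32_t) value : (uint32_t) value;

    assert(buf != NULL && buflen >= 12);

    p = buf + buflen - 1;
    *p = '\0';
    do
    {
        *--p = (char) ('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);

    if (value < 0)
        *--p = '-';

    return p;
}
```

This illustrates why the interface is more cumbersome: a caller has to keep both the buffer (for ownership) and the returned pointer (for the text), and can no longer assume the string starts at the address it passed in.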

Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Tuesday 02 November 2010 01:37:43 Andres Freund wrote:
> Revised version attached - I will submit this to the next commitfest now.
Context diff attached this time...

Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Peter Eisentraut
Date:
On Sun, 2010-10-31 at 22:41 +0100, Andres Freund wrote:
> * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
> confusing. Not sure if its worth it.

Given that there are widely established functions atoi() and atol(),
naming the reverse itoa() and ltoa() makes a lot of sense.  The changed
versions read like "string to ASCII".



Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On Sun, 2010-10-31 at 22:41 +0100, Andres Freund wrote:
>> * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
>> confusing. Not sure if its worth it.

> Given that there are widely established functions atoi() and atol(),
> naming the reverse itoa() and ltoa() makes a lot of sense.  The changed
> versions read like "string to ASCII".

Yeah, and "s32" makes no sense at all.  I think we should either leave
well enough alone (to avoid introducing a cross-version backpatch
hazard) or use pg_i32toa etc.
        regards, tom lane


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Sun, Oct 31, 2010 at 5:41 PM, Andres Freund <andres@anarazel.de> wrote:
> While at it:

These words always make me a bit frightened when reviewing a patch,
since it's generally simpler if a single patch only does one thing.
However, in this case...

> * I remove the outdated
> -- NOTE: int[24] operators never check for over/underflow!
> -- Some of these answers are consequently numerically incorrect.
> warnings in the regressions tests.

...this part looks obviously OK, so I have committed it.

The rest is attached as a residual patch, except that I reverted this change:

> * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not
> sure if its worth it.

I notice that int8out isn't terribly consistent with int2out and
int4out, in that it does an extra copy.   Maybe that's justified given
the greater potential memory wastage, but I'm not certain.  One
approach might be to pick some threshold value and allocate a buffer
in one of two sizes based on how large the value is relative to that
cutoff.  But that might also be a stupid idea, not sure.

It would speed things up for me if you or someone else could take a
quick pass over what remains here and fix the formatting and
whitespace to be consistent with our general project style, and make
the comment headers more consistent among the functions being
added/modified.

I think the new regression tests look good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Monday 15 November 2010 17:12:25 Robert Haas wrote:
> It would speed things up for me if you or someone else could take a
> quick pass over what remains here and fix the formatting and
> whitespace to be consistent with our general project style, and make
> the comment headers more consistent among the functions being
> added/modified.
will do.

Andres


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Monday 15 November 2010 17:12:25 Robert Haas wrote:
> I notice that int8out isn't terribly consistent with int2out and
> int4out, in that it does an extra copy.   Maybe that's justified given
> the greater potential memory wastage, but I'm not certain.  One
> approach might be to pick some threshold value and allocate a buffer
> in one of two sizes based on how large the value is relative to that
> cutoff.  But that might also be a stupid idea, not sure.
I removed the extra buffer - it's actually a tiny bit faster without it (I 
guess the allocation pattern is a bit nicer during copy as it will always take 
the same paths and eventually the same address).
I couldn't measure any difference memory-usage wise.

The code was that way before btw.

> It would speed things up for me if you or someone else could take a
> quick pass over what remains here and fix the formatting and
> whitespace to be consistent with our general project style, and make
> the comment headers more consistent among the functions being
> added/modified.
I think I did most of those - the function comments in numutils weren't 
consistent before - now it's consistent with the unchanged pg_atoi. 

Thanks for reviewing/applying the first part,

Andres

Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Fri, Nov 19, 2010 at 4:16 PM, Andres Freund <andres@anarazel.de> wrote:
> On Monday 15 November 2010 17:12:25 Robert Haas wrote:
>> I notice that int8out isn't terribly consistent with int2out and
>> int4out, in that it does an extra copy.   Maybe that's justified given
>> the greater potential memory wastage, but I'm not certain.  One
>> approach might be to pick some threshold value and allocate a buffer
>> in one of two sizes based on how large the value is relative to that
>> cutoff.  But that might also be a stupid idea, not sure.
> I removed the extra buffer - its actually a tiny bit faster without it  (I
> guess the allocation pattern is a bit nicer during copy as it will always take
> the same paths and eventually the same address).
> I couldn't measure any difference memory-usage wise.
>
> The code was that way before btw.

Yeah, I know.  After further thought I decided not to commit this
part, because using 32 bytes when you only need 8 is sort of sucky.
I'm not sure if it matters in real life, but if it's only a tiny
speedup I guess I might as well play it safe.

>> It would speed things up for me if you or someone else could take a
>> quick pass over what remains here and fix the formatting and
>> whitespace to be consistent with our general project style, and make
>> the comment headers more consistent among the functions being
>> added/modified.
> I think I did most of those - the function comments in numutils weren't
> consistent before - now its consistent with the unchanged pg_atoi.
>
> Thanks for reviewing/applying the first part,

Sure thing.  Thanks for taking time to do this - very nice speedup.
This part now committed, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Fri, Nov 19, 2010 at 10:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Sure thing.  Thanks for taking time to do this - very nice speedup.
> This part now committed, too.

It occurs to me belatedly that there might be a better way to do this.
Instead of flipping value from negative to positive, with a special
case for the smallest possible integer, we could do it the other way
round.  And actually, I think we can get rid of neg, too.

if (value < 0)
    *a++ = '-';
else
    value = -value;
start = a;

Then we could just adjust the calculation of the actual digit.

*a++ = '0' + (-remainder);

Good idea?  Bad idea?  Seems cleaner to me, assuming it'll actually work...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> It occurs to me belatedly that there might be a better way to do this.
>  Instead of flipping value from negative to positive, with a special
> case for the smallest possible integer, we could do it the other
> round.  And actually, I think we can rid of neg, too.

The trouble with that approach is that you have to depend on the
direction of rounding for negative quotients.  Which was unspecified
before C99, and it's precisely pre-C99 compilers that are posing a
hazard to the current coding.

FWIW, I find the code still pretty darn unsightly.  I think this change
is just wrong:
     * Avoid problems with the most negative integer not being representable
     * as a positive integer.
     */
-   if (value == INT32_MIN)
+   if (value == INT_MIN)
    {
        memcpy(a, "-2147483648", 12);

and even with INT32_MIN it was pretty silly, because there is exactly 0
hope of the code behaving sanely for some other value of the symbolic
constant.  I think it'd be much better to abandon the macros altogether
and write
    if (value == (-2147483647-1))
    {
        memcpy(a, "-2147483648", 12);

Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
than it was yesterday: LL is not the portable way to write int64
constants.
        regards, tom lane
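
Two of the portability points above can be illustrated with a small sketch. The macro and function names are hypothetical, and the example assumes a platform where int is 32 bits wide. The constant is written as an expression because the literal 2147483648 does not fit in a 32-bit int, so "-2147483648" would be unary minus applied to a wider or unsigned constant. For the rounding question, the standard library function div() is specified to truncate toward zero even in C89, unlike the / and % operators before C99; whether calling it would be fast enough for this code path is not something the thread measured.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Most negative 32-bit value, spelled without an out-of-range literal. */
#define SKETCH_INT32_MIN (-2147483647 - 1)

/*
 * Last decimal digit (0..9) of a non-positive value, independent of how
 * the '%' operator rounds for negative operands on a given compiler.
 */
int
sketch_last_digit_nonpositive(int value)
{
    div_t       qr;

    assert(value <= 0);

    /* div() always truncates toward zero, so qr.rem is in -9..0 here. */
    qr = div(value, 10);
    return -qr.rem;
}
```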


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Sat, Nov 20, 2010 at 10:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The trouble with that approach is that you have to depend on the
> direction of rounding for negative quotients.  Which was unspecified
> before C99, and it's precisely pre-C99 compilers that are posing a
> hazard to the current coding.

Interesting.  I wondered whether there might be compilers out there
that handled that inconsistently, but then I thought I was probably
being paranoid.

> Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
> than it was yesterday: LL is not the portable way to write int64
> constants.

Gah.  I wish we had some documentation of this stuff.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Tom Lane
Date:
BTW, while we're thinking about marginal improvements: instead of
constructing the string backwards and then reversing it in-place,
what about building it working backwards from the end of the buffer
and then memmove'ing it down to the start of the buffer?

I haven't tested this but it seems likely to be roughly a wash
speed-wise.  The reason I find the idea attractive is that it will
immediately expose any caller that is providing a buffer shorter
than the required length, whereas now such callers will appear to
work fine if they're only tested on small values.

A small downside is that pg_itoa would then need its own implementation
instead of just punting to pg_ltoa.
        regards, tom lane
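
A rough sketch of the memmove variant described above, under the assumption that every caller passes a buffer of the full required length. The name `sketch_int32_memmove` and the length macro are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_INT32_BUFLEN 12  /* sign + 10 digits + NUL */

/* buf must be SKETCH_INT32_BUFLEN bytes long, regardless of the value. */
void
sketch_int32_memmove(int32_t value, char *buf)
{
    char       *end;
    char       *p;
    uint32_t    mag = (value < 0) ? 0u - (uint32_t) value : (uint32_t) value;

    assert(buf != NULL);

    /* Always touches the last byte, even when converting a small value. */
    end = buf + SKETCH_INT32_BUFLEN - 1;
    p = end;
    *p = '\0';
    do
    {
        *--p = (char) ('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);

    if (value < 0)
        *--p = '-';

    /* Source and destination may overlap, hence memmove and not memcpy. */
    memmove(buf, p, (size_t) (end - p) + 1);
}
```

Because the last byte of the buffer is written on every call, a caller with an undersized buffer overruns it even for a value such as 1, which is the property that makes such bugs show up early in testing.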


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Robert Haas
Date:
On Sat, Nov 20, 2010 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> BTW, while we're thinking about marginal improvements: instead of
> constructing the string backwards and then reversing it in-place,
> what about building it working backwards from the end of the buffer
> and then memmove'ing it down to the start of the buffer?
>
> I haven't tested this but it seems likely to be roughly a wash
> speed-wise.  The reason I find the idea attractive is that it will
> immediately expose any caller that is providing a buffer shorter
> than the required length, whereas now such callers will appear to
> work fine if they're only tested on small values.
>
> A small downside is that pg_itoa would then need its own implementation
> instead of just punting to pg_ltoa.

I think that might be more clever than is really warranted.  I get
your point about buffer overrun, but I don't think it's that hard for
callers to do the right thing, so I'm inclined to think that's not
worth much in this case.  Of course, if memmove() can be implemented
as a single assembler instruction or something, that might be
appealing from a speed standpoint, but otherwise I think we may as
well stick with this.  There's less chance of needlessly touching an
extra cache line, less chance of being confused by leftover garbage in
memory after the end of the output string, and less duplicate code.

I had given some thought to whether it might make sense to try to
figure out how long the string will be before we actually start
generating it, so that we can just start in the exactly right space
and not have to clean up afterward.  But the obvious implementation seems
like it could be more expensive than just doing the copy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Nov 20, 2010 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> what about building it working backwards from the end of the buffer
>> and then memmove'ing it down to the start of the buffer?

> I think that might be more clever than is really warranted.  I get
> your point about buffer overrun, but I don't think it's that hard for
> callers to do the right thing, so I'm inclined to think that's not
> worth much in this case.

Fair enough --- it was just a passing thought.

> I had given some thought to whether it might make sense to try to
> figure out how long the string will be before we actually start
> generating it, so that we can just start in the exactly right space
> and not have to clean up afterward.  But the obvious implementation seems
> like it could be more expensive than just doing the copy.

Yeah.  You certainly don't want to do the division sequence twice,
and a log() call wouldn't be cheap either, and there don't seem to
be many other alternatives.  If we were going to get picky about
avoiding the reverse step, I'd go with Andres' idea of changing
the API to pass back an address instead of guaranteeing that the
result begins at the start of the buffer.  But I think that's much
more complicated for callers than it's worth.
        regards, tom lane


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Saturday 20 November 2010 18:34:04 Tom Lane wrote:
> BTW, while we're thinking about marginal improvements: instead of
> constructing the string backwards and then reversing it in-place,
> what about building it working backwards from the end of the buffer
> and then memmove'ing it down to the start of the buffer?
> 
> I haven't tested this but it seems likely to be roughly a wash
> speed-wise.  The reason I find the idea attractive is that it will
> immediately expose any caller that is providing a buffer shorter
> than the required length, whereas now such callers will appear to
> work fine if they're only tested on small values.
Tried that, the cost was measurable although not big (~3-5%)...

Greetings,

Andres


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Andres Freund
Date:
On Saturday 20 November 2010 18:18:32 Robert Haas wrote:
> > Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
> > than it was yesterday: LL is not the portable way to write int64
> > constants.
> Gah.  I wish we had some documentation of this stuff.
Ditto. I started doing Cish stuff quite a bit *after* C99 was mostly available 
in gcc...

Sorry btw, for not realizing those points (and the regression-expectation file) 
myself...

Andres


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Greg Stark
Date:
On Sat, Nov 20, 2010 at 6:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Nov 20, 2010 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> what about building it working backwards from the end of the buffer
>>> and then memmove'ing it down to the start of the buffer?
>
>> I think that might be more clever than is really warranted.  I get
>> your point about buffer overrun, but I don't think it's that hard for
>> callers to do the right thing, so I'm inclined to think that's not
>> worth much in this case.

It also seems wrong that a caller might happen to know that their
argument will never be more than n digits but still has to allocate a
buffer large enough to hold 2^64.



>
> Fair enough --- it was just a passing thought.
>
>> I had given some thought to whether it might make sense to try to
>> figure out how long the string will be before we actually start
>> generating it, so that we can just start in the exactly right space
>> and not have to clean up afterward.  But the obvious implementation seems
>> like it could be more expensive than just doing the copy.
>
> Yeah.  You certainly don't want to do the division sequence twice,
> and a log() call wouldn't be cheap either, and there don't seem to
> be many other alternatives.

There are bit-twiddling hacks for computing log base 2. I'm not sure
it's worth worrying about to this degree though.



--
greg


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> On Sat, Nov 20, 2010 at 6:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I had given some thought to whether it might make sense to try to
>>> figure out how long the string will be before we actually start
>>> generating it, so that we can just start in the exactly right space
>>> and not have to clean up afterward.  But the obvious implementation seems
>>> like it could be more expensive than just doing the copy.

>> Yeah.  You certainly don't want to do the division sequence twice,
>> and a log() call wouldn't be cheap either, and there don't seem to
>> be many other alternatives.

> There are bittwiddling hacks for computing log based 2. I'm not sure
> it's worth worrying about to this degree though.

I think converting log2 to log10 *exactly* would end up being not so
cheap, anyhow.
        regards, tom lane
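
For reference, one well-known bit-twiddling approach gets from log2 to an exact decimal digit count with a multiplication, a shift and a single table lookup. The sketch below uses hypothetical names, and computes log2 with a plain shift loop to stay portable; a real implementation would presumably use a count-leading-zeros instruction instead, which is an assumption outside this thread. Whether any of this beats the reversal step was not measured here.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)) for v > 0, by plain shifting. */
static int
sketch_floor_log2(uint32_t v)
{
    int         r = 0;

    while ((v >>= 1) != 0)
        r++;
    return r;
}

/* Number of decimal digits needed to print v. */
int
sketch_decimal_digits(uint32_t v)
{
    static const uint32_t powers_of_10[] = {
        1, 10, 100, 1000, 10000, 100000,
        1000000, 10000000, 100000000, 1000000000
    };
    int         t;

    if (v == 0)
        return 1;

    /*
     * 1233/4096 approximates log10(2).  t is either floor(log10(v)) or
     * one more than that; the table comparison corrects the second case.
     */
    t = ((sketch_floor_log2(v) + 1) * 1233) >> 12;
    assert(t >= 0 && t <= 9);

    return t - (v < powers_of_10[t]) + 1;
}
```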


Re: [PATCH] Custom code int(32|64) => text conversions out of performance reasons

From
Florian Weimer
Date:
* Tom Lane:

> Yeah.  You certainly don't want to do the division sequence twice,
> and a log() call wouldn't be cheap either, and there don't seem to
> be many other alternatives.

What about a sequence of comparisons, and unrolling the loop?  That
could avoid the final division, too.  It might also be helpful to
break down the dependency chain for large input values.

The int8 version should probably work in 1e9 chunks and use a
zero-padding variant of the 32-bit code.

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99