Re: Efficient output for integer types
От | David Fetter |
---|---|
Тема | Re: Efficient output for integer types |
Дата | |
Msg-id | 20190920191450.GD31596@fetter.org обсуждение исходный текст |
Ответ на | Re: Efficient output for integer types (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Ответы |
Re: Efficient output for integer types
(David Fetter <david@fetter.org>)
|
Список | pgsql-hackers |
On Wed, Sep 18, 2019 at 04:27:46PM +0900, Kyotaro Horiguchi wrote: > Hello. > > At Wed, 18 Sep 2019 05:42:01 +0200, David Fetter <david@fetter.org> wrote in <20190918034201.GX31596@fetter.org> > > On Tue, Sep 17, 2019 at 09:01:57AM +0200, David Fetter wrote: > > > On Tue, Sep 17, 2019 at 08:55:05AM +0200, David Fetter wrote: > > > > On Sun, Sep 15, 2019 at 09:18:49AM +0200, David Fetter wrote: > > > > > Folks, > > > > > > > > > > Please find attached a couple of patches intended to $subject. > > > > > > > > > > This patch set cut the time to copy ten million rows of randomly sized > > > > > int8s (10 of them) by about a third, so at least for that case, it's > > > > > pretty decent. > > > > > > > > Added int4 output, removed the sprintf stuff, as it didn't seem to > > > > help in any cases I was testing. > > > > > > Found a couple of "whiles" that should have been "ifs." > > > > Factored out some inefficient functions and made the guts use the more > > efficient function. > > I'm not sure this is on the KISS principle, but looked it and > have several random comments. > > +numutils.o: CFLAGS += $(PERMIT_DECLARATION_AFTER_STATEMENT) > > I don't think that we are allowing that as project coding > policy. It seems to have been introduced only to accept external > code as-is. Changed to fit current policy. > - char str[23]; /* sign, 21 digits and '\0' */ > + char str[MAXINT8LEN]; > > It's uneasy that MAXINT8LEN contains tailling NUL. MAXINT8BUFLEN > can be so. I think MAXINT8LEN should be 20 and the definition > should be str[MAXINT8LEN + 1]. Done. > +static const char DIGIT_TABLE[200] = { > + '0', '0', '0', '1', '0', '2', '0', '3', '0', '4', '0', '5', '0', '6', '0', '7', '0', '8', '0', '9', > > Wouldn't it be simpler if it were defined as a constant string? > > static const char DIGIT_TABLE[201] = > "000102030405....19" > "202122232425....39" > .. I thought this might be even clearer: "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59" "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71" "72" "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84" "85" "86" "87" "88" "89" "90" "91" "92" "93" "94" "95" "96" "97" "98" "99"; > +pg_ltoa_n(int32 value, char *a) > ... > + /* Compute the result string. */ > + while (value >= 100000000) > > We have only two degits above the value. Isn't the stuff inside > the while a waste of cycles? Changed the while to an if. > + /* Expensive 64-bit division. Optimize? */ > > I believe compiler treats such trivial optimizations. (concretely > converts into shifts and subtractons if faster.) Comments removed. > + memcpy(a + olength - i - 2, DIGIT_TABLE + c0, 2); > > Maybe it'd be easy to read if 'a + olength - i' is a single variable. Done. > + i += adjust; > + return i; > > If 'a + olength - i' is replaced with a variable, the return > statement is replacable with "return olength + adjust;". I'm not sure I understand this. > + return t + (v >= PowersOfTen[t]); > > I think it's better that if it were 't - (v < POT[t]) + 1; /* > log10(v) + 1 */'. At least we need an explanation of the > difference. (I'didn't checked the algorithm is truely right, > though.) Comments added. > > void > > pg_lltoa(int64 value, char *a) > > { > .. > > memcpy(a, "-9223372036854775808", 21); > .. > > memcpy(a, "0", 2); > > The lines need a comment like "/* length contains trailing '\0' > */" Comments added. > + if (value >= 0) > ... > + else > + { > + if (value == PG_INT32_MIN) > + memcpy(str, "-2147483648", 11); > + return str + 11; > > } > + *str++ = '-'; > + return pg_ltostr_zeropad(str, -value, minwidth - 1); > > If then block of the if statement were (values < 0), we won't > need to reenter the functaion. This is a tail-call recursion, so it's probably optimized already. > + len = pg_ltoa_n(value, str); > + if (minwidth <= len) > + return str + len; > + > + memmove(str + minwidth - len, str, len); > > If the function had the parameters str with the room only for two > digits and a NUL, 2 as minwidth but 1000 as value, the function > would overrun the buffer. The original function just ignores > overflowing digits. I believe the original was incorrect. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Вложения
В списке pgsql-hackers по дате отправления: