Re: [PATCHES] MemSet inline for newNode
От | Bruce Momjian |
---|---|
Тема | Re: [PATCHES] MemSet inline for newNode |
Дата | |
Msg-id | 200211102307.gAAN7Y611251@candle.pha.pa.us обсуждение исходный текст |
Ответ на | Re: [PATCHES] MemSet inline for newNode (Bruce Momjian <pgman@candle.pha.pa.us>) |
Список | pgsql-hackers |
Further update: I ran a test and compared the sizes of the postgres binary: 2926925 with palloc0, 2931293 without, a 0.14% decrease. Not very much, considering the 1-14% speedup of MemSet with a constant. Tom, I really didn't think those conditional tests would be significant, but clearly they are --- call me surprised. Back out palloc0()? --------------------------------------------------------------------------- Bruce Momjian wrote: > > OK, my compiler, gcc 2.95 does the optimization at -O2, so I ran the > tests here with the attached program, testing a constant of 256 and a > variable 'j' equal to 256. > > I found a 1.7% slowdown with the variable compared to the constant. > The 256 value is rather large. For a length of 32, the difference was > 14%! > > So, seems I should back out the palloc0 usage and let the MemSet inline > at each location. Does that seems like the direction to head? > > --------------------------------------------------------------------------- > > Bruce Momjian wrote: > > Tom Lane wrote: > > > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > > > I have applied this patch to inline MemSet for newNode. > > > > I will make another commit to make more general use of palloc0. > > > > > > Refresh my memory on why either of these was a good idea? > > > > You are talking about the use of palloc0 in place of palloc/MemSet(0), > > not the use of palloc0 in newNode, right? > > > > > > This was already discussed on hackers. > > > > > > And this was not the approach agreed to, IIRC. What you've done has > > > eliminated the possibility of optimizing away the controlling tests > > > in MemSet. > > > > I see now: > > > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=200210120006.g9C06i502964%40candle.pha.pa.us > > > > I thought someone had tested that there was little/no performance > > difference between these two statements: > > > > MemSet(ptr, 0, 256) > > > > vs. > > > > i = 256; > > MemSet(ptr, 0, i) > > > > However, seeing no email reply, I now assume no test was done. > > > > Neil, can you run such a test and let us know. It has to be with a > > compiler that optimizes out the MemSet length test when the len is a > > constant. I can't remember what compiler version/settings that was, but > > it may have been -O3 gcc 3.20. > > > > I can back out my changes, but it would be easier to see if there is a > > benefit before doing that. Personally, I am going to be surprised if a > > single conditional tests in MemSet (which is not in the assignment loop) > > will make any measurable difference. > > > > -- > > Bruce Momjian | http://candle.pha.pa.us > > pgman@candle.pha.pa.us | (610) 359-1001 > > + If your life is a hard drive, | 13 Roberts Road > > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > #define INT_ALIGN_MASK (sizeof(int) - 1) > #include <stdlib.h> > > /* > * MemSet > * Exactly the same as standard library function memset(), but considerably > * faster for zeroing small word-aligned structures (such as parsetree nodes). > * This has to be a macro because the main point is to avoid function-call > * overhead. However, we have also found that the loop is faster than > * native libc memset() on some platforms, even those with assembler > * memset() functions. More research needs to be done, perhaps with > * platform-specific MEMSET_LOOP_LIMIT values or tests in configure. > * > * bjm 2002-10-08 > */ > #define MemSet(start, val, len) \ > do \ > { \ > int * _start = (int *) (start); \ > int _val = (val); \ > size_t _len = (len); \ > \ > if ((((long) _start) & INT_ALIGN_MASK) == 0 && \ > (_len & INT_ALIGN_MASK) == 0 && \ > _val == 0 && \ > _len <= MEMSET_LOOP_LIMIT) \ > { \ > int * _stop = (int *) ((char *) _start + _len); \ > while (_start < _stop) \ > *_start++ = 0; \ > } \ > else \ > memset((char *) _start, _val, _len); \ > } while (0) > > #define MEMSET_LOOP_LIMIT 1024 > > int > main(int argc, char **argv) > { > int i,j; > char *buf; > > buf = malloc(256); > > j = 32; > for (i = 0; i < 1000000; i++) > MemSet(buf, 0, j /* choose 256 or j here */); > return 0; > } > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
В списке pgsql-hackers по дате отправления: