Re: Inlining comparators as a performance optimisation
From | karavelov@mail.bg |
---|---|
Subject | Re: Inlining comparators as a performance optimisation |
Date | |
Msg-id | 4f1fa2fe44a6bfd798dbe5af7a5f1524.mailbg@mail.bg |
In response to | Inlining comparators as a performance optimisation (Peter Geoghegan <peter@2ndquadrant.com>) |
Responses | Re: Inlining comparators as a performance optimisation |
List | pgsql-hackers |
----- Quote from Peter Geoghegan (peter@2ndquadrant.com), on 21.09.2011 at 02:53 -----

> On 20 September 2011 03:51, Tom Lane wrote:
>> Considering that -O2 is our standard optimization level, that
>> observation seems to translate to "this patch will be useless in
>> practice". I think you had better investigate that aspect in some
>> detail before spending more effort.
>
> I don't think that the fact that that happens is at all significant at
> this early stage, and it never even occurred to me that you'd think
> that it might be. I was simply disclosing a quirk of this POC patch.
> The workaround is probably to use a macro instead. For the benefit of
> those that didn't follow the other threads, the macro-based qsort
> implementation, which I found to perform significantly better than
> regular qsort(), runs like this on my laptop when I built at -O2 with
> GCC 4.6 just now:
>
> C stdlib quick-sort time elapsed: 2.092451 seconds
> Inline quick-sort time elapsed: 1.587651 seconds
>
> Does *that* look attractive to you? I've attached source code of the
> program that produced these figures, which has been ported to C from
> C++.
>
> When I #define LARGE_SIZE 100000000, here's what I see:
>
> [peter@peter inline_compar_test]$ ./a.out
> C stdlib quick-sort time elapsed: 23.659411 seconds
> Inline quick-sort time elapsed: 18.470611 seconds
>
> Here, sorting with the function pointer/stdlib version takes about
> 1.28 times as long. In the prior test (with the smaller LARGE_SIZE),
> it took about 1.32 times as long. Fairly predictable, linear, and not
> to be sniffed at.
>
> The variance I'm seeing across runs is low - a couple of hundredths of
> a second at most. This is a Fedora 15 "Intel(R) Core(TM) i5-2540M CPU
> @ 2.60GHz" machine. I'm not sure right now why the inline quick-sort
> is less of a win than on my old Fedora 14 desktop (where it was 3.24
> vs 2.01), but it's still a significant win. Perhaps others can build
> this simple program and tell me what they come up with.
>

I ran it here.

Intel(R) Core(TM)2 Duo CPU E8200 @ 2.66GHz
gcc version 4.6.1 (Debian 4.6.1-10)

g++ -O2 qsort-inline-benchmark.c
./a.out
C stdlib quick-sort time elapsed: 1.942686 seconds
Inline quick-sort time elapsed: 1.126508 seconds

With #define LARGE_SIZE 100000000

C stdlib quick-sort time elapsed: 22.158207 seconds
Inline quick-sort time elapsed: 12.861018 seconds

With g++ -O0

C stdlib quick-sort time elapsed: 2.736360 seconds
Inline quick-sort time elapsed: 2.045619 seconds

On server hardware:
Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
gcc version 4.4.5 (Debian 4.4.5-8)

./a.out
C stdlib quick-sort time elapsed: 2.610150 seconds
Inline quick-sort time elapsed: 1.494198 seconds

All -O2 builds show about a 42% reduction in elapsed time with the inlined qsort.
The -O0 build showed about 25%.

Best regards

--
Luben Karavelov