Re: B-Tree support function number 3 (strxfrm() optimization)
From | Robert Haas
---|---
Subject | Re: B-Tree support function number 3 (strxfrm() optimization)
Date |
Msg-id | CA+TgmoYh=dcUeHrRiO1aWma6-593DvB32gyqv2+VhB0EP-Pg3A@mail.gmail.com
In reply to | Re: B-Tree support function number 3 (strxfrm() optimization) (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses | Re: B-Tree support function number 3 (strxfrm() optimization) (Peter Geoghegan <pg@heroku.com>); Re: B-Tree support function number 3 (strxfrm() optimization) (Peter Geoghegan <pg@heroku.com>)
List | pgsql-hackers
On Fri, Sep 12, 2014 at 5:28 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 09/12/2014 12:46 AM, Peter Geoghegan wrote:
>> On Thu, Sep 11, 2014 at 1:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I think I said pretty clearly that it was.
>>
>> I agree that you did, but it wasn't clear exactly what factors you
>> were asking me to simulate.
>
> All factors.
>
>> Do you want me to compare the same string a million times in a loop,
>> both with a strcoll() and with a memcmp()?
>
> Yes.
>
>> Should I copy it into a buffer to add a NUL byte?
>
> Yes.
>
>> Or should it be a new string each time, with a cache miss expected
>> some proportion of the time?
>
> Yes.
>
> I'm being facetious - it's easy to ask for tests when you're not the one
> running them. But seriously, please do run all the tests that you think
> make sense.
>
> I'm particularly interested in the worst case. What is the worst case for
> the proposed memcmp() check? Test that. If the worst case regresses
> significantly, then we need to have a discussion of how likely that worst
> case is to happen in real life, what the performance is like in more
> realistic almost-worst-case scenarios, does it need to be tunable, is the
> trade-off worth it, etc. But if the worst case regresses less than, say, 1%,
> and there are some other cases where you get a 300% speed up, then I think
> it's safe to say that the optimization is worth it, without any more testing
> or discussion.

+1 to all that, including the facetious parts.

Based on discussion thus far it seems that there's a possibility that the trade-off may be different for short strings vs. long strings. If the string is small enough to fit in the L1 CPU cache, then it may be that memcmp() followed by strcoll() is not much more expensive than strcoll() alone. That should be easy to figure out: write a standalone C program that creates a bunch of arbitrary, fairly short strings, say 32 bytes, in a big array. Make the strings different near the end, but not at the beginning. Then have the program either do strcoll() on every pair (O(n^2)) or, with a #define, memcmp() followed by strcoll() on every pair. It should be easy to see whether the memcmp() adds 1% or 25% or 100%. Then, repeat the same thing with strings that are big enough to blow out the L1 cache, say 1MB in length. Some intermediate sizes (64kB?) might be worth testing, too. Again, it should be easy to see what the overhead is.

Once we know that, we can make intelligent decisions about whether this is a good idea or not, and when. If you attach the test programs, other people (e.g. me) can also try them on other systems (e.g. MacOS X) to see whether the characteristics there are different than what you saw on your system.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company