Re: profiling pgbench

From Andres Freund
Subject Re: profiling pgbench
Date
Msg-id 201011242219.09433.andres@anarazel.de
In reply to Re: profiling pgbench  (Andres Freund <andres@anarazel.de>)
Responses Re: profiling pgbench  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Wednesday 24 November 2010 22:14:04 Andres Freund wrote:
> On Wednesday 24 November 2010 21:24:43 Robert Haas wrote:
> > I'd like to get access to a box with (a lot) more cores, to see
> > whether the lock stuff moves up in the profile.  A big chunk of that
> > hash_search_with_hash_value overhead is coming from
> > LockAcquireExtended.  The __strcmp_sse2 is almost entirely parsing
> > overhead.  In general, I'm not sure there's much hope for reducing the
> > parsing overhead, although ScanKeywordLookup() can certainly be done
> > better.  XLogInsert() is spending a lot of time doing CRC's.
> > LWLockAcquire() is dropping cycles in many different places.
> 
> I can get you profiles of machines with up to 24 real cores; unfortunately
> I can't give access away.
> 
> Regarding CRCs:
> I spent some time optimizing these, as you might remember. The wall I hit
> optimizing it benefit-wise is that the single CRC calls (4 for a
> non-indexed single-row insert on a table with 1 column inside a
> transaction) are just too damn small to get more efficient. It's causing
> pipeline stalls all over... (21, 5, 1, 28 bytes).
> 
> I have a very preliminary patch calculating the CRC over the whole thing in
> one go if it can do so (no switch, no xl buffers wraparound), but it's
> highly ugly as it needs to read from the xl insert buffers and then
> reinsert the crc at the correct position.
> While it shows a noticeable improvement, that doesn't seem to be a good way
> to go. It could be made to work properly though.
> 
> I played around with some ideas to do that more nicely, but none were
> gratifying.
> 
> Regarding LWLockAcquire costs:
> Yes, it's pretty noticeable - on loads of different usages. On a bunch of
> production machines it's the second (behind XLogInsert), on some the most
> expensive function. Most of the time
AllocSetAlloc is the third, battling with hash_search_with_hash_value. To 
complete that sentence...

Andres
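
To illustrate the CRC point quoted above: a CRC is a running value, so feeding a record in as four short chunks (21, 5, 1 and 28 bytes) gives the same result as a single pass over the same 55 contiguous bytes. The sketch below is not code from the patch or from PostgreSQL; the function names crc32_update, crc_chunked and crc_one_pass are hypothetical, and it assumes the standard reflected CRC-32 polynomial (0xEDB88320) in a plain bitwise form rather than whatever table-driven code the WAL path actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Feed len bytes into a running CRC-32 value.
 * Assumption: standard reflected CRC-32 polynomial 0xEDB88320,
 * computed bit by bit for clarity, not for speed.
 */
static uint32_t
crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * CRC of a record made of several consecutive chunks:
 * one short crc32_update() call per chunk (hypothetical helper).
 */
static uint32_t
crc_chunked(const unsigned char *buf, const size_t *sizes, size_t nchunks)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t   off = 0;

    for (size_t i = 0; i < nchunks; i++)
    {
        crc = crc32_update(crc, buf + off, sizes[i]);
        off += sizes[i];
    }
    return crc ^ 0xFFFFFFFFu;
}

/* CRC of the same bytes in a single call (hypothetical helper). */
static uint32_t
crc_one_pass(const unsigned char *buf, size_t len)
{
    return crc32_update(0xFFFFFFFFu, buf, len) ^ 0xFFFFFFFFu;
}
```

Calling crc_chunked() with sizes {21, 5, 1, 28} and crc_one_pass() with length 55 on the same buffer returns the same value. The sketch only shows that the two forms are equivalent; it does not reproduce the per-call overhead and pipeline stalls described above, nor the awkward part of the real patch, which is reading back from the insert buffers and placing the CRC afterwards.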

