Re: Sun Donated a Sun Fire T2000 to the PostgreSQL

Поиск
Список
Период
Сортировка
От Robert Lor
Тема Re: Sun Donated a Sun Fire T2000 to the PostgreSQL
Дата
Msg-id 44C088C8.9050303@sun.com
обсуждение исходный текст
Ответ на Re: Sun Donated a Sun Fire T2000 to the PostgreSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Sun Donated a Sun Fire T2000 to the PostgreSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Sun Donated a Sun Fire T2000 to the PostgreSQL  ("Jim C. Nasby" <jnasby@pervasive.com>)
Список pgsql-performance
Tom Lane wrote:

>Tatsuo Ishii <ishii@sraoss.co.jp> writes:
>
>
>>>>18% in s_lock is definitely bad :-(.  Were you able to determine which
>>>>LWLock(s) are accounting for the contention?
>>>>
>>>>
>
>
>
>>Sorry for the delay. Finally I got the oprofile data. It's
>>huge(34MB). If you are interested, I can put somewhere. Please let me
>>know.
>>
>>
>
>I finally got a chance to look at this, and it seems clear that all the
>traffic is on the BufMappingLock.  This is essentially the same problem
>we were discussing with respect to Gavin Hamill's report of poor
>performance on an 8-way IBM PPC64 box (see hackers archives around
>2006-04-21).  If your database is fully cached in shared buffers, then
>you can do a whole lot of buffer accesses per unit time, and even though
>all the BufMappingLock acquisitions are in shared-LWLock mode, the
>LWLock's spinlock ends up being heavily contended on an SMP box.
>
>It's likely that CVS HEAD would show somewhat better performance because
>of the btree change to cache local copies of index metapages (which
>eliminates a fair fraction of buffer accesses, at least in Gavin's test
>case).   Getting much further than that seems to require partitioning
>the buffer mapping table.  The last discussion stalled on my concerns
>about unpredictable shared memory usage, but I have some ideas on that
>which I'll post separately.  In the meantime, thanks for sending along
>the oprofile data!
>
>            regards, tom lane
>
>
I ran pgbench and fired up a DTrace script using the lwlock probes we've
added, and it looks like BufMappingLock is the most contended lock, but
CheckpointStartLocks are held for longer duration!

             Lock Id            Mode           Count
     ControlFileLock       Exclusive               1
 SubtransControlLock       Exclusive               1
    BgWriterCommLock       Exclusive               6
       FreeSpaceLock       Exclusive               6
    FirstLockMgrLock       Exclusive              48
     BufFreelistLock       Exclusive              74
      BufMappingLock       Exclusive              74
     CLogControlLock       Exclusive             184
          XidGenLock       Exclusive             184
 CheckpointStartLock          Shared             185
        WALWriteLock       Exclusive             185
       ProcArrayLock       Exclusive             368
     CLogControlLock          Shared             552
 SubtransControlLock          Shared            1273
       WALInsertLock       Exclusive            1476
          XidGenLock          Shared            1842
       ProcArrayLock          Shared            3160
          SInvalLock          Shared            3684
      BufMappingLock          Shared           14578

             Lock Id   Combined Time (ns)
     ControlFileLock                 7915
    BgWriterCommLock                43438
       FreeSpaceLock               111139
     BufFreelistLock               448530
    FirstLockMgrLock              2879957
     CLogControlLock              4237750
 SubtransControlLock              6378042
          XidGenLock              9500422
       WALInsertLock             16372040
          SInvalLock             23284554
       ProcArrayLock             32188638
      BufMappingLock            113128512
        WALWriteLock            142391501
 CheckpointStartLock           4171106665


Regards,
-Robert



В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Sun Donated a Sun Fire T2000 to the PostgreSQL
Следующее
От:
Дата:
Сообщение: Forcing using index instead of sequential scan?