Re: ugly locking corner cases ...
От | Hans-Jürgen Schönig |
---|---|
Тема | Re: ugly locking corner cases ... |
Дата | |
Msg-id | 05FC981A-2BC9-474F-A9C4-5081E93CAF52@cybertec.at обсуждение исходный текст |
Ответ на | Re: ugly locking corner cases ... (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Список | pgsql-hackers |
On Oct 4, 2010, at 1:23 PM, Heikki Linnakangas wrote: > On 04.10.2010 14:02, Hans-Jürgen Schönig wrote: >> it seems we have found a fairly nasty problem. >> imagine a long transaction which piles up XX.XXX of locks (count on pg_locks) inside the same transaction by doing sometasty savepoints, with hold cursors and so on. >> in this case we see that a normal count issued in a second database connection will take ages. in a practical case wedid a plain seq_scan in connection 2. instead of 262 ms (cached case) it started to head north linearily with the numberof locks taken by connection 1. in an extreme case it took around 1.5 hours or so (on XXX.XXX pg_locks entries). >> >> i tracked down the issue quickly and make the following profile (in 10k locks or so): >> >> Flat profile: >> >> Each sample counts as 0.01 seconds. >> % cumulative self self total >> time seconds seconds calls s/call s/call name >> 32.49 6.01 6.01 98809118 0.00 0.00 SimpleLruReadPage_ReadOnly >> 26.97 11.00 4.99 98837761 0.00 0.00 LWLockAcquire >> 21.89 15.05 4.05 98837761 0.00 0.00 LWLockRelease >> 8.70 16.66 1.61 98789370 0.00 0.00 SubTransGetParent >> 4.38 17.47 0.81 19748 0.00 0.00 SubTransGetTopmostTransaction >> 2.41 17.92 0.45 98851951 0.00 0.00 TransactionIdPrecedes >> 0.59 18.03 0.11 LWLockAssign >> 0.54 18.13 0.10 LWLockConditionalAcquire >> 0.46 18.21 0.09 19748 0.00 0.00 TransactionLogFetch >> 0.38 18.28 0.07 SimpleLruReadPage >> 0.27 18.33 0.05 SubTransSetParent >> 0.05 18.34 0.01 136778 0.00 0.00 AllocSetAlloc >> 0.05 18.35 0.01 42996 0.00 0.00 slot_deform_tuple >> 0.05 18.36 0.01 42660 0.00 0.00 TransactionIdIsCurrentTransactionId >> >> it seems we are running into a nice shared buffer / locking contention here and the number of calls explodes (this profilinginfos is coming from a seq scan on a 500.000 rows table - 400 mb or so). > > That doesn't seem related to the lock manager. Is the long-running transaction inserting a lot of tuples (by INSERT orUPDATE) to the same table that the seqscan scans? With a lot of different subtransaction xids. That profile looks likethe seqscan is spending a lot of time swapping pg_subtrans pages in and out of the slru buffers. > > Increasing NUM_SUBTRANS_BUFFERS should help. A more sophisticated solution would be to allocate slru buffers (for clogand other slru caches as well) dynamically from shared_buffers. That's been discussed before but no-one has gotten aroundto it. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com hello ... yeah, it seems this solves the problem. i had a closer look at the SQL trace and did some more profiling. this was the case. many thanks for the quick hint. hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Craig RingerДата:
Сообщение: Re: Re: Proposed Windows-specific change: Enable crash dumps (like core files)
Следующее
От: Robert HaasДата:
Сообщение: Re: I: About "Our CLUSTER implementation is pessimal" patch