Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
From | Merlin Moncure
---|---
Subject | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
Date |
Msg-id | CAHyXU0wH-2L=DHOXQmiDHBGRXxODCnQCHXB9rxcU1ju-mqksFQ@mail.gmail.com
In reply to | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile (Robert Haas <robertmhaas@gmail.com>)
List | pgsql-hackers
On Thu, May 31, 2012 at 1:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 31, 2012 at 2:03 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Thu, May 31, 2012 at 11:54 AM, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:
>>> On Thu, 31 May 2012, Robert Haas wrote:
>>>
>>>> Oh, ho. So from this we can see that the problem is that we're
>>>> getting huge amounts of spinlock contention when pinning and
>>>> unpinning index pages.
>>>>
>>>> It would be nice to have a self-contained reproducible test case
>>>> for this, so that we could experiment with it on other systems.
>>>
>>> I created one a few days ago:
>>> http://archives.postgresql.org/pgsql-hackers/2012-05/msg01143.php
>>>
>>> It is still valid, and it is exactly what I'm using to test now.
>>> The only change needed is to create a two-column index and drop the
>>> other index. The scripts are precisely the ones I'm using now.
>>>
>>> The problem is that in order to see a really big slowdown (10 times
>>> slower than a single thread) I had to raise shared_buffers to 48 GB,
>>> but it was slow for smaller shared_buffers settings as well.
>>>
>>> I'm not sure how sensitive the test is to the hardware, though.
>>
>> It's not: high contention on spinlocks is going to suck no matter
>> what hardware you have. I think the problem is pretty obvious now:
>> any case where multiple backends are scanning the same sequence of
>> buffers in a very tight loop is going to display this behavior. It
>> doesn't come up that often: it takes a pretty unusual sequence of
>> events to get a bunch of backends hitting the same buffer like that.
>>
>> Hm, I wonder if you could alleviate the symptoms by making
>> Pin/UnpinBuffer smarter, so that frequently pinned buffers could
>> stay pinned longer -- kinda as if your private ref count were hacked
>> to be higher in that case. It would be a complex fix for a narrow
>> issue, though.
>
> This test case is unusual because it hits a whole series of buffers
> very hard. However, there are other cases where this happens on a
> single buffer that is just very, very hot, like the root block of a
> btree index, where the pin/unpin overhead hurts us. I've been
> thinking about this problem for a while, but it hasn't made it to the
> top of my priority list, because workloads where pin/unpin is the
> dominant cost are still relatively uncommon. I expect them to get
> more common as we fix other problems.
>
> Anyhow, I do have some vague thoughts on how to fix this. Buffer pins
> are a lot like weak relation locks, in that they are a type of lock
> that is taken frequently but rarely conflicts. And the fast-path
> locking in 9.2 provides a demonstration of how to handle this kind of
> problem efficiently: make the weak, rarely-conflicting locks cheaper,
> at the cost of some additional expense when a conflicting lock (in
> this case, a buffer cleanup lock) is taken. In particular, each
> backend has its own area to record weak relation locks, and a strong
> relation lock must scan all of those areas and migrate any locks
> found there to the main lock table. I don't think it would be
> feasible to adopt exactly this solution for buffer pins, because page
> eviction and buffer cleanup locks, while not exactly common, are
> common enough that we can't require a scan of N per-backend areas
> every time one of those operations occurs.
>
> But maybe we could have a system of this type that applies only to
> the very hottest buffers. Suppose we introduce two new buffer flags,
> BUF_NAILED and BUF_NAIL_REMOVAL.
> When we detect excessive contention on the buffer header spinlock,
> we set BUF_NAILED. Once we do that, the buffer can't be evicted until
> that flag is removed, and backends are permitted to record pins in a
> per-backend area protected by a per-backend spinlock or lwlock,
> rather than in the buffer header. When we want to un-nail the buffer,
> we set BUF_NAIL_REMOVAL.

Hm, a couple of questions: how do you determine if/when to un-nail a
buffer, and who makes that decision (the bgwriter?) Is there a limit to
how many buffers you are allowed to nail? It seems like a much stronger
idea, but one downside I see versus the 'pin for longer' idea I was
kicking around is how to deal with stale nailed buffers, and how to
keep their number from growing uncontrollably to the point where you
have to either stop nailing or start forcibly evicting them.

merlin
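To make the 'pin for longer' idea above concrete, here is a minimal,
self-contained sketch. It is not PostgreSQL source: BufferHeader,
PrivateRef, pin(), unpin(), and forget() are all invented names, real
code would take the buffer-header spinlock where the comments indicate,
and deciding when a buffer has "cooled off" is left out. The point is
only that a tight pin/unpin loop on a hot buffer touches the shared
refcount once on entry and once on exit, instead of twice per iteration.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    int shared_refcount;    /* normally protected by the header spinlock */
    int spinlock_acquires;  /* instrumentation: how often we'd take it */
} BufferHeader;

typedef struct
{
    BufferHeader *buf;
    int           local_refcount;   /* this backend's pins; lock-free */
    bool          holds_shared_pin; /* do we still hold one shared pin? */
} PrivateRef;

/* First pin pays for the spinlock; repeat pins are backend-local. */
static void
pin(PrivateRef *ref)
{
    ref->local_refcount++;
    if (!ref->holds_shared_pin)
    {
        /* take the buffer-header spinlock here ... */
        ref->buf->spinlock_acquires++;
        ref->buf->shared_refcount++;
        ref->holds_shared_pin = true;
        /* ... and release it */
    }
}

/* Unpin locally, but keep the shared pin: the buffer stays pinned longer. */
static void
unpin(PrivateRef *ref)
{
    ref->local_refcount--;
}

/* Called when the buffer ages out of the private cache (or is evicted). */
static void
forget(PrivateRef *ref)
{
    if (ref->local_refcount == 0 && ref->holds_shared_pin)
    {
        /* header spinlock again ... */
        ref->buf->spinlock_acquires++;
        ref->buf->shared_refcount--;
        ref->holds_shared_pin = false;
    }
}

int
main(void)
{
    BufferHeader hdr = {0, 0};
    PrivateRef   ref = {&hdr, 0, false};
    int          i;

    for (i = 0; i < 1000000; i++)   /* the tight index-scan loop */
    {
        pin(&ref);
        unpin(&ref);
    }
    forget(&ref);

    printf("spinlock acquisitions: %d (naive pin/unpin would take 2000000)\n",
           hdr.spinlock_acquires);
    return 0;
}
```

This prints 2: one acquisition when the buffer first becomes hot, one
when the backend finally lets go, which is the entire saving the idea
is after. The open cost is the one Merlin names: something has to
notice cooled-off buffers and call the forget() step, or pinned-longer
buffers pile up and block eviction.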
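The nailing scheme can be modeled the same way. Again a sketch under
stated assumptions: the flag names BUF_NAILED and BUF_NAIL_REMOVAL come
from the proposal above, but the structs, functions, and the fixed
per-backend counter array are invented, and the per-backend locking is
reduced to comments to keep the model single-threaded.

```c
/*
 * Illustrative model (not PostgreSQL source) of the proposed
 * BUF_NAILED / BUF_NAIL_REMOVAL scheme.  Pins on a nailed buffer are
 * recorded in a per-backend area; un-nailing migrates them back into
 * the shared header, like a strong relation lock migrating fast-path
 * locks into the main lock table.
 */
#include <stdbool.h>
#include <stdio.h>

#define NBACKENDS 4

typedef struct
{
    bool nailed;           /* BUF_NAILED: fast-path pins allowed */
    bool nail_removal;     /* BUF_NAIL_REMOVAL: fast path being shut off */
    int  shared_refcount;  /* protected by the buffer-header spinlock */
} BufferHeader;

/* Each backend's private pin count for this (nailed) buffer. */
static int fastpath_pins[NBACKENDS];

static void
pin_buffer(BufferHeader *buf, int backend)
{
    if (buf->nailed && !buf->nail_removal)
    {
        /* take this backend's per-backend lwlock ... */
        fastpath_pins[backend]++;
        /* ... and release it */
    }
    else
    {
        /* take the buffer-header spinlock ... */
        buf->shared_refcount++;
        /* ... and release it */
    }
}

static void
unpin_buffer(BufferHeader *buf, int backend)
{
    if (buf->nailed && !buf->nail_removal && fastpath_pins[backend] > 0)
        fastpath_pins[backend]--;   /* under the per-backend lwlock */
    else
        buf->shared_refcount--;     /* under the header spinlock */
}

/*
 * The rare "strong" operation: shut off the fast path, then migrate
 * every backend's private pins into the shared header so that eviction
 * or a cleanup lock sees the true pin count.
 */
static void
remove_nail(BufferHeader *buf)
{
    int b;

    buf->nail_removal = true;   /* new pins go to the header from now on */
    for (b = 0; b < NBACKENDS; b++)
    {
        /* take backend b's lwlock ... */
        buf->shared_refcount += fastpath_pins[b];
        fastpath_pins[b] = 0;
        /* ... and release it */
    }
    buf->nailed = false;
    buf->nail_removal = false;
}

int
main(void)
{
    BufferHeader buf = {true, false, 0};

    pin_buffer(&buf, 0);    /* fast path: header spinlock never taken */
    pin_buffer(&buf, 2);
    remove_nail(&buf);      /* migrates the two private pins */
    printf("shared refcount after migration: %d\n", buf.shared_refcount);

    unpin_buffer(&buf, 0);  /* slow path now that the nail is gone */
    unpin_buffer(&buf, 2);
    printf("shared refcount after unpins: %d\n", buf.shared_refcount);
    return 0;
}
```

remove_nail() is the analogue of taking a strong relation lock under
9.2's fast-path scheme: it has to scan every backend's area, which is
exactly the cost that must stay rare, and why Robert limits the scheme
to the very hottest buffers rather than applying it to every pin.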