Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
| From | Florian Pflug |
|---|---|
| Subject | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
| Date | |
| Msg-id | 2B38F631-C80E-4882-BBB5-4678891B21E9@phlo.org |
| In response to | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
| List | pgsql-hackers |

On Jun 1, 2012, at 15:45, Tom Lane wrote:
> Merlin Moncure <mmoncure@gmail.com> writes:
>> A potential issue with this line of thinking is that your pin delay
>> queue could get highly pressured by outer portions of the query (as in
>> the OP's case) that will get little or no benefit from the delayed
>> pin. But choosing a sufficiently sized drain queue would work for
>> most reasonable cases assuming 32 isn't enough? Why not something
>> much larger, for example the lesser of 1024, (NBuffers * .25) /
>> max_connections? In other words, for you to get much benefit, you
>> have to pin the buffer sufficiently more than 1/N times among all
>> buffers.
>
> Allowing each backend to pin a large fraction of shared buffers sounds
> like a seriously bad idea to me. That's just going to increase
> thrashing of what remains.

Right, that was one of the motivations for suggesting the small queue.
At least that way, the number of buffers optimistically pinned by each
backend is limited.

The other was that once the outer portions plough through more than a
few pages per iteration of the sub-plan, the cost of doing that should
dominate the cost of pinning and unpinning.

> More generally, I don't believe that we have any way to know which
> buffers would be good candidates to keep pinned for a long time.

I'd think that pinning a buffer which we've only recently unpinned is a
pretty good indication that the same thing will happen again.

My proposed algorithm could be made to use exactly that criterion by
tracking a little bit more state. We'd have to tag queue entries with a
flag indicating whether they are

  Unpinned (COLD)
  Pinned, and unpinning should be delayed (HOT)
  Waiting to be unpinned (LUKEWARM)

UnpinBuffer() would check if the buffer is HOT, and if so add it to the
queue with flag LUKEWARM. Otherwise, it'd get immediately unpinned and
flagged as COLD (adding it to the queue if necessary). PinBuffer() would
pin the buffer and mark it as HOT if it was COLD, and just mark it as
HOT if it was LUKEWARM. If the buffer isn't on the queue already,
PinBuffer() would simply pin it and be done.

This would give the following behaviour for a buffer that is pinned
repeatedly

  PinBuffer(): <not on queue> -> <not on queue> (refcount incremented)
  UnpinBuffer(): <not on queue> -> COLD (refcount decremented)
  ...
  PinBuffer(): COLD -> HOT (refcount incremented)
  UnpinBuffer(): HOT -> LUKEWARM (refcount *not* decremented)
  ...
  PinBuffer(): LUKEWARM -> HOT (refcount *not* incremented)
  UnpinBuffer(): HOT -> LUKEWARM (refcount *not* decremented)
  …

> Typically, we don't drop the pin in the first place if we know we're
> likely to touch that buffer again soon. btree root pages might be an
> exception, but I'm not even convinced of that one.

But Sergey's use-case pretty convincingly shows that, more generally,
inner sides of a nested loop join are also an exception, no? At least if
the inner side is either an index scan, or a seqscan of a really small
table.

best regards,
Florian Pflug
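
The COLD/HOT/LUKEWARM transitions described above can be modelled as a small state machine. What follows is only an illustrative simulation in Python, not PostgreSQL code: the class `DelayedUnpinQueue`, its methods, the FIFO eviction policy, and the rule that evicting a LUKEWARM entry performs the delayed unpin are all assumptions, since the message does not say how the queue is drained. It also assumes a backend never holds nested pins on the same buffer.

```python
from collections import OrderedDict
from enum import Enum


class State(Enum):
    COLD = "COLD"          # on the queue, not pinned
    HOT = "HOT"            # pinned; the next unpin should be delayed
    LUKEWARM = "LUKEWARM"  # pin still held, waiting to be unpinned


class DelayedUnpinQueue:
    """Hypothetical per-backend queue; names and eviction policy are assumed."""

    def __init__(self, capacity=32):
        self.capacity = capacity
        self.entries = OrderedDict()  # buffer id -> State, oldest entry first
        self.refcount = {}            # buffer id -> simulated pin count

    def state(self, buf):
        return self.entries.get(buf)  # None means "not on queue"

    def pins(self, buf):
        return self.refcount.get(buf, 0)

    def pin(self, buf):
        state = self.entries.get(buf)
        if state is None:
            # Not on the queue: plain pin, nothing is tracked yet.
            self.refcount[buf] = self.pins(buf) + 1
        elif state is State.COLD:
            # Recently unpinned and pinned again: take the pin, mark HOT.
            self.refcount[buf] = self.pins(buf) + 1
            self.entries[buf] = State.HOT
        elif state is State.LUKEWARM:
            # The pin was never dropped, so only the flag changes.
            self.entries[buf] = State.HOT
        else:
            raise ValueError("nested pins are outside this sketch")

    def unpin(self, buf):
        state = self.entries.get(buf)
        if state is State.HOT:
            # Delay the unpin: keep the refcount, flag as LUKEWARM.
            self.entries[buf] = State.LUKEWARM
            return
        if state is not None or self.pins(buf) == 0:
            raise ValueError("buffer is not pinned by this backend")
        # Not on the queue: unpin immediately and remember it as COLD.
        self.refcount[buf] -= 1
        self._make_room()
        self.entries[buf] = State.COLD

    def _make_room(self):
        # Assumed policy: drop the oldest entry; a LUKEWARM victim gets
        # its delayed unpin performed now. A HOT victim is just forgotten
        # and will be unpinned normally by its owner later.
        while len(self.entries) >= self.capacity:
            victim, state = self.entries.popitem(last=False)
            if state is State.LUKEWARM:
                self.refcount[victim] -= 1
```

Repeatedly calling `pin(7)` and `unpin(7)` on one instance walks through the same sequence as the trace in the message: the first pair leaves the buffer COLD, the second leaves it LUKEWARM with the pin retained, and later pairs only flip the flag. Because the queue is bounded, the number of pins a backend holds optimistically is bounded by `capacity`.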