Обсуждение: PrivateRefCount patch has got issues

Поиск
Список
Период
Сортировка

PrivateRefCount patch has got issues

От
Tom Lane
Дата:
I just happened to look into bufmgr.c for the first time in awhile, and
noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
look too well thought out to me.  In particular, PinBuffer_Locked calls
GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
seems completely unacceptable.  It's certainly a huge violation of our
design principle that spinlocks should be held for only a few
instructions; and I rather suspect that a palloc failure down inside the
hashtable entry-allocation code would leave things in a bad state.  It's
also depressing that the very common code path ReleaseBuffer->UnpinBuffer
results in a double search of the array/hashtable; that should be
refactored to avoid that.
        regards, tom lane



Re: PrivateRefCount patch has got issues

От
Andres Freund
Дата:
Hi,

On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
> I just happened to look into bufmgr.c for the first time in awhile, and
> noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
> look too well thought out to me.  In particular, PinBuffer_Locked calls
> GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
> seems completely unacceptable.

Argh, yes. That certainly isn't ok.

The easiest way to fix that seems to be to declare that PinBuffer_Locked
can only be used when we're guaranteed to not have pinned the
buffer. That happens to be true for all the existing users. In fact all
of them even seem to require the refcount to be zero across all
backends.  That prerequisite then allows to increase the buffer header
refcount before releasing the spinlock *and* before increasing the
private refcount.

> It's also depressing that the very common code path
> ReleaseBuffer->UnpinBuffer results in a double search of the
> array/hashtable; that should be refactored to avoid that.

Sounds like a good idea, will see how that works out to look.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: PrivateRefCount patch has got issues

От
Tom Lane
Дата:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-12-16 18:25:13 -0500, Tom Lane wrote:
>> I just happened to look into bufmgr.c for the first time in awhile, and
>> noticed the privaterefcount-is-no-longer-a-simple-array stuff.  It doesn't
>> look too well thought out to me.  In particular, PinBuffer_Locked calls
>> GetPrivateRefCountEntry while holding a buffer-header spinlock.  That
>> seems completely unacceptable.

> Argh, yes. That certainly isn't ok.

> The easiest way to fix that seems to be to declare that PinBuffer_Locked
> can only be used when we're guaranteed to not have pinned the
> buffer. That happens to be true for all the existing users. In fact all
> of them even seem to require the refcount to be zero across all
> backends.  That prerequisite then allows to increase the buffer header
> refcount before releasing the spinlock *and* before increasing the
> private refcount.

Hm, if you do it like that, what happens if we get a palloc failure while
trying to record the private refcount?  I think you must not bump the pin
count in shared memory unless you're certain you can record the fact that
you've done so.

The idea I'd been wondering about hinged on the same observation that we
know the buffer is not pinned (by our process) already, but the mechanics
would be closer to what we do in resource managers: reserve space first,
do the thing that needs to be remembered, bump the count using the
reserved space.  Given the way you've set this up, the idea boils down to
having a precheck call that forces there to be an empty slot in the local
fastpath array (by pushing something out to the hash table if necessary)
before we start trying to pin the buffer.  Then it's guaranteed that the
"record" step will succeed.  You could possibly even arrange it so that
it's known which array entry needs to be used and then the "record" part
is just a couple of inline instructions, so that it'd be OK to do that
while still holding the spinlock.  Otherwise it would still be a good idea
to do the "record" after releasing the spinlock IMO; but either way this
avoids the issue of possible state inconsistency due to a midflight
palloc failure.
        regards, tom lane



Re: PrivateRefCount patch has got issues

От
Andres Freund
Дата:
Hi,

Sorry for taking long to get back to this...

On 2014-12-21 13:21:56 -0500, Tom Lane wrote:
> The idea I'd been wondering about hinged on the same observation that we
> know the buffer is not pinned (by our process) already, but the mechanics
> would be closer to what we do in resource managers: reserve space first,
> do the thing that needs to be remembered, bump the count using the
> reserved space.  Given the way you've set this up, the idea boils down to
> having a precheck call that forces there to be an empty slot in the local
> fastpath array (by pushing something out to the hash table if necessary)
> before we start trying to pin the buffer.  Then it's guaranteed that the
> "record" step will succeed.

After pondering the problem for a while I agree that this is the best
approach. I've implemented it and it imo actually looks cleaner than
before because GetPrivateRefCountEntry() is much simpler.

> You could possibly even arrange it so that
> it's known which array entry needs to be used and then the "record" part
> is just a couple of inline instructions, so that it'd be OK to do that
> while still holding the spinlock.  Otherwise it would still be a good idea
> to do the "record" after releasing the spinlock IMO; but either way this
> avoids the issue of possible state inconsistency due to a midflight
> palloc failure.

I've indeed done it in a way that the reserved entry is
remembered. Primarily because that will, in many situations, allow us to
avoid searching the array for a new entry because we can setup a
reserved slot when forgetting a private refcount entry.  While the
previous reservation would make it safe to do the private tracking with
the spinlock held I've moved it to after the release; based on our
observation that it's never used for buffers already locally
pinned. Also added an assert checking that that actually still is the
case.  That observation also allows us to entirely forgo searching the
private refcount array in PinBuffer_locked(), we can just directly enter
the new entry. That is actually visible in profiles that are much larger
than s_b.


I've just written this patch down in one go, so I'll let the topic rest
for a couple days and commit it then after looking through it again.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения