Re: Small locking bugs in hs

From	Andres Freund
Subject	Re: Small locking bugs in hs
Date	20 January 2010, 09:13:26
Msg-id	201001201413.09953.andres@anarazel.de
In reply to	Re: Small locking bugs in hs (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: Small locking bugs in hs
List	pgsql-hackers

On Wednesday 20 January 2010 12:59:40 Simon Riggs wrote:
> On Wed, 2010-01-20 at 04:47 +0100, Andres Freund wrote:
> > On Saturday 16 January 2010 12:32:35 Simon Riggs wrote:
> > > No. As mentioned upthread, this is not a bug.
> > 
> > Could you also mention in a little bit more detail why not?
> 
> When a cleanup record arrives without a latestRemovedXid we are forced
> to assume that the xid could be as late as latestCompletedXid.
> Regrettably we aren't certain which of the xids are still there since it
> is possible that earlier xids in KnownAssignedXids are actually FATAL
> errors that did not write abort records. So we need to conflict with all
> current snapshots whose xmin is less than latestCompletedXid to be safe.
> This can cause false positives in our assessment of which vxids
> conflict.
> By using exclusive lock we prevent new snapshots from being taken while
> we work out which snapshots to conflict with. This protects those new
> snapshots from also being included in our conflict list.
> 
> After the lock is released, we allow snapshots again. It is possible
> that we arrive at a snapshot that is identical to one that we just
> decided we should conflict with. This is a case of false positives, not an
> actual problem.
> 
> There are two cases: (1) if we were correct in using latestCompletedXid
> then that means that all xids in the snapshot lower than that are FATAL
> errors, so not xids that ever commit. We can make no visibility errors
> if we allow such xids into the snapshot. (2) if we erred on the side of
> caution and in fact the latestRemovedXid should have been earlier than
> latestCompletedXid then we conflicted with a snapshot needlessly. Taking
> another identical snapshot is OK, because the earlier conflicted
> snapshot was a false positive.
> 
> In either case, a snapshot taken after conflict assessment will still be
> valid and non-conflicting even if an identical snapshot that existed
> before conflict assessment was assessed as conflicting.
> 
> If we allowed concurrent snapshots while we were deciding who to
> conflict with we would need to include all concurrent snapshotters in
> the conflict list as well. We'd have difficulty in working out exactly
> who that was, so it is happier for all concerned if we take an exclusive
> lock.
> 
> It also means that users waiting for a snapshot is a good thing, since
> it is more likely that they will live longer after having waited. So it's
> not a bug for us to use exclusive lock and is actually desirable.
> 
> We could reduce false positives by having the master calculate the exact
> xmin each time it issues an XLOG_BTREE_DELETE record. That would
> introduce more contention since that happens during btree split
> operations, so might be counterproductive.
Wow. Thanks for the extensive explanation!
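
To make sure I have the conflict assessment right, here is a minimal sketch of it as a toy model. It is not the actual procarray code: ToySnapshot and collect_conflicts are hypothetical names, xid wraparound is ignored, and the exclusive lock is only indicated in a comment rather than modelled.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* toy xid type; wraparound is ignored here */

typedef struct
{
    int vxid;                   /* hypothetical id of the snapshot's holder */
    Xid xmin;                   /* oldest xid the snapshot still considers running */
} ToySnapshot;

/*
 * Collect the holders of all snapshots whose xmin is below limit.
 * When the cleanup record carries no latestRemovedXid, limit would be
 * latestCompletedXid. In the real system the caller is assumed to hold
 * the lock exclusively, so snaps[] cannot gain entries while we scan.
 * out must have room for n entries. Returns the number of conflicts.
 */
static size_t
collect_conflicts(const ToySnapshot *snaps, size_t n, Xid limit, int *out)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (snaps[i].xmin < limit)
            out[found++] = snaps[i].vxid;
    }
    return found;
}
```

A snapshot taken after the scan, even one with the same xmin as a flagged entry, never appears in out, which is the false-positive case described above.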

Do I understand correctly that in CancelVirtualTransaction LW_SHARED is
taken only so that another transaction can finish during that time?


Andres
