RE: Assuming that TAS() will succeed the first time is verboten

Поиск

Список

Период

Сортировка

От	Mikheev, Vadim
Тема	RE: Assuming that TAS() will succeed the first time is verboten
Дата	28 декабря 2000 г. 16:56:20
Msg-id	8F4C99C66D04D4118F580090272A7A234D321D@sectorbase1.sectorbase.com обсуждение исходный текст
Ответ на	Assuming that TAS() will succeed the first time is verboten (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: Assuming that TAS() will succeed the first time is verboten
Список	pgsql-hackers

Дерево обсуждения

> The bottom line turns out to be that on the Alpha hardware, it is
> possible for TAS() to fail even when the lock is initially zero,
> because that hardware's locking protocol will fail to acquire the
> lock if the ldq_l/stq_c sequence is interrupted.  TAS() *must* be
> called in a retry loop on Alphas.  Thus, the coding presently in
> xlog.c,
> 
>     while (TAS(&(XLogCtl->chkp_lck)))
>     {
>         struct timeval delay = {2, 0};
> 
>         if (shutdown)
>             elog(STOP, "Checkpoint lock is busy while data 
> base is shutting down");
>         (void) select(0, NULL, NULL, NULL, &delay);
>     }
> 
> is no good because it does not allow for multiple retries.
> 
> Offhand I see no good reason why the above-quoted code isn't just
> 
>     S_LOCK(&(XLogCtl->chkp_lck));
> 
> and propose to fix this problem by reducing it to that.  If the lock
> is held when it shouldn't be, we'll fail with a stuck-spinlock error.

It was not test against stuck slock: postmaster can shutdown db only
when there is no backends running and this was just additional test
for that. Assuming that postmaster does right things I don't object
to get rid of this test, but I would just remove shutdown test itself
- checkpoints run long time and 2 sec timeout is better than ones
in s_lock_sleep();

> It also bothers me that xlog.c contains several places where 
> there is a potentially infinite wait for a lock.  It seems to me
> that these should time out with stuck-spinlock messages. Do you object
> to such a change?

I always considered time-based decision is slock stuck or not as hack -
in old times elog(ERROR/FATAL) didn't unlock slocks sometimes, isn't
it fixed now?

Anyway I don't object if it bothers you -:)
But please do not use S_LOCK - as you see WAL code try to do other
things if slock of "primary interest" is busy.

Vadim

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

RE: Assuming that TAS() will succeed the first time is verboten