RE: Assuming that TAS() will succeed the first time is verboten

Поиск
Список
Период
Сортировка
От Mikheev, Vadim
Тема RE: Assuming that TAS() will succeed the first time is verboten
Дата
Msg-id 8F4C99C66D04D4118F580090272A7A234D321D@sectorbase1.sectorbase.com
обсуждение исходный текст
Ответ на Assuming that TAS() will succeed the first time is verboten  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Assuming that TAS() will succeed the first time is verboten  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> The bottom line turns out to be that on the Alpha hardware, it is
> possible for TAS() to fail even when the lock is initially zero,
> because that hardware's locking protocol will fail to acquire the
> lock if the ldq_l/stq_c sequence is interrupted.  TAS() *must* be
> called in a retry loop on Alphas.  Thus, the coding presently in
> xlog.c,
> 
>     while (TAS(&(XLogCtl->chkp_lck)))
>     {
>         struct timeval delay = {2, 0};
> 
>         if (shutdown)
>             elog(STOP, "Checkpoint lock is busy while data 
> base is shutting down");
>         (void) select(0, NULL, NULL, NULL, &delay);
>     }
> 
> is no good because it does not allow for multiple retries.
> 
> Offhand I see no good reason why the above-quoted code isn't just
> 
>     S_LOCK(&(XLogCtl->chkp_lck));
> 
> and propose to fix this problem by reducing it to that.  If the lock
> is held when it shouldn't be, we'll fail with a stuck-spinlock error.

It was not test against stuck slock: postmaster can shutdown db only
when there is no backends running and this was just additional test
for that. Assuming that postmaster does right things I don't object
to get rid of this test, but I would just remove shutdown test itself
- checkpoints run long time and 2 sec timeout is better than ones
in s_lock_sleep();

> It also bothers me that xlog.c contains several places where 
> there is a potentially infinite wait for a lock.  It seems to me
> that these should time out with stuck-spinlock messages. Do you object
> to such a change?

I always considered time-based decision is slock stuck or not as hack -
in old times elog(ERROR/FATAL) didn't unlock slocks sometimes, isn't
it fixed now?

Anyway I don't object if it bothers you -:)
But please do not use S_LOCK - as you see WAL code try to do other
things if slock of "primary interest" is busy.

Vadim


В списке pgsql-hackers по дате отправления:

Предыдущее
От: ncm@zembu.com (Nathan Myers)
Дата:
Сообщение: Re: Assuming that TAS() will succeed the first time is verboten
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Assuming that TAS() will succeed the first time is verboten