Re: lock on object is already held

From Pavel Stehule
Subject Re: lock on object is already held
Date
Msg-id CAFj8pRAdN9UgWTtwp9ixH09+iD4pY_LB1Wc-haOA-e=G4pzn2Q@mail.gmail.com
In response to Re: lock on object is already held  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I tried to reproduce this error, but without success, so I prepared a patch that should help identify the issue. The important part is a backport of the process startup code from 9.2. After applying it, we never detected this issue again.

Regards

Pavel



2013/11/29 Tom Lane <tgl@sss.pgh.pa.us>
Daniel Wood <dwood@salesforce.com> writes:
> ... Presuming your fix is putting PG_SETMASK(&UnBlockSig)
> immediately before each of the 6 calls to ereport(ERROR,...) I've been
> running the stress test with both this fix and the lock already held fix.

I'm now planning to put it in error cleanup instead, but that's good
enough for proving that the problem is what I thought it was.

> I get plenty of lock timeout errors as expected.  However, once in a great
> while I get:  sqlcode = -400, sqlstate = 57014, sqlerrmc = canceling
> statement due to user request
> My stress test certainly doesn't do a user cancel.  Should this be expected?

I think I see what must be happening there: the lock timeout interrupt is
happening at some point after the lock has been granted, but before
ProcSleep reaches its disable_timeouts call.  QueryCancelPending gets set,
and will be honored next time something does CHECK_FOR_INTERRUPTS.
But because ProcSleep told disable_timeouts to clear the LOCK_TIMEOUT
indicator bit, ProcessInterrupts thinks the cancel must've been a plain
user SIGINT, and reports it that way.

What we should probably do about this is change ProcSleep to not clear the
LOCK_TIMEOUT indicator bit, same as we already do in LockErrorCleanup,
which is the less-race-condition-y path out of a lock timeout.
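
To make the race described above concrete, here is a minimal toy model in C. It is a sketch under stated assumptions, not PostgreSQL source: the two flags only stand in for QueryCancelPending and the LOCK_TIMEOUT indicator bit, and every function name (lock_timeout_fired, proc_sleep_finish, process_interrupts, replay_race) is hypothetical. The keep_indicator argument models the proposed change of having ProcSleep leave the indicator bit alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins for backend state; these are NOT the real PostgreSQL variables. */
static bool query_cancel_pending = false;    /* models QueryCancelPending */
static bool lock_timeout_indicator = false;  /* models the LOCK_TIMEOUT indicator bit */

/* Hypothetical timeout handler: the lock timeout expires. */
static void lock_timeout_fired(void)
{
    lock_timeout_indicator = true;
    query_cancel_pending = true;
}

/* Hypothetical model of ProcSleep turning its timeouts off after the lock
 * has been granted. keep_indicator = false is the behavior described as
 * buggy; keep_indicator = true is the proposed fix. */
static void proc_sleep_finish(bool keep_indicator)
{
    if (!keep_indicator)
        lock_timeout_indicator = false;
}

/* Hypothetical model of ProcessInterrupts: chooses the message for a
 * pending cancel based on the indicator bit. Returns NULL if nothing is pending. */
static const char *process_interrupts(void)
{
    if (!query_cancel_pending)
        return NULL;
    query_cancel_pending = false;
    if (lock_timeout_indicator)
    {
        lock_timeout_indicator = false;
        return "canceling statement due to lock timeout";
    }
    return "canceling statement due to user request";
}

/* Replays the interleaving: the lock is granted, the timeout fires before
 * the timeouts are disabled, and the cancel is honored at the next
 * interrupt check. */
static const char *replay_race(bool keep_indicator)
{
    query_cancel_pending = false;
    lock_timeout_indicator = false;

    lock_timeout_fired();              /* fires after the grant */
    assert(query_cancel_pending);      /* a cancel is now pending */
    proc_sleep_finish(keep_indicator); /* may or may not clear the bit */
    return process_interrupts();       /* next CHECK_FOR_INTERRUPTS */
}
```

In this model, replay_race(false) yields the "user request" message even though no user cancel was sent, which matches the symptom Daniel reported, while replay_race(true) yields the lock timeout message.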

(It would be cooler if the timeout handler had a way to realize that the
lock is already granted, and not issue a query cancel in the first place.
But having a signal handler poking at shared memory state is a little too
scary for my taste.)

It strikes me that this also means that places where we throw away pending
cancels by clearing QueryCancelPending, such as the sigsetjmp stanza in
postgres.c, had better reset the LOCK_TIMEOUT indicator bit.  Otherwise,
a thrown-away lock timeout cancel might cause a later SIGINT cancel to be
misreported.
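
The same kind of toy model can illustrate the thrown-away-cancel case from the paragraph above. Again this is only a sketch: the flags and all function names (discard_pending_cancel, user_sigint, report_cancel, replay_discarded_timeout) are hypothetical and merely mimic the behavior described, with reset_indicator standing for the suggested reset of the LOCK_TIMEOUT indicator bit wherever a pending cancel is discarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-ins; these are NOT the real PostgreSQL variables. */
static bool cancel_pending = false;     /* models QueryCancelPending */
static bool timeout_indicator = false;  /* models the LOCK_TIMEOUT indicator bit */

/* Hypothetical lock timeout handler. */
static void lock_timeout_fired(void)
{
    timeout_indicator = true;
    cancel_pending = true;
}

/* Hypothetical plain user cancel (SIGINT): sets only the pending flag. */
static void user_sigint(void)
{
    cancel_pending = true;
}

/* Models a place that throws away a pending cancel, such as an error
 * recovery stanza. reset_indicator = true is the suggested behavior. */
static void discard_pending_cancel(bool reset_indicator)
{
    cancel_pending = false;
    if (reset_indicator)
        timeout_indicator = false;
}

/* Models the reporting step: picks the message from the indicator bit. */
static const char *report_cancel(void)
{
    assert(cancel_pending);
    cancel_pending = false;
    if (timeout_indicator)
    {
        timeout_indicator = false;
        return "canceling statement due to lock timeout";
    }
    return "canceling statement due to user request";
}

/* A lock timeout cancel is discarded, then a genuine user cancel arrives. */
static const char *replay_discarded_timeout(bool reset_indicator)
{
    cancel_pending = false;
    timeout_indicator = false;

    lock_timeout_fired();                    /* timeout sets both flags */
    discard_pending_cancel(reset_indicator); /* cancel is thrown away */
    user_sigint();                           /* later, a real SIGINT */
    return report_cancel();
}
```

In this model, replay_discarded_timeout(false) reports the later SIGINT as a lock timeout because the stale bit survives, and replay_discarded_timeout(true) reports it as a user request.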

                        regards, tom lane

