Weird Assert failure in GetLockStatusData()

From Tom Lane
Subject Weird Assert failure in GetLockStatusData()
Date
Msg-id 8053.1357659565@sss.pgh.pa.us
Responses Re: Weird Assert failure in GetLockStatusData()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
This is a bit disturbing:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bushpig&dt=2013-01-07%2019%3A15%3A02

The key bit is

[50eb2156.651e:6] LOG:  execute isolationtester_waiting: SELECT 1 FROM pg_locks holder, pg_locks waiter WHERE NOT
waiter.granted AND waiter.pid = $1 AND holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND
holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN
ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN
ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareUpdateExclusiveLock' THEN
ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN
'ShareLock' THEN
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN
'ShareRowExclusiveLock' THEN
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
WHEN 'ExclusiveLock' THEN
ARRAY['RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
WHEN 'AccessExclusiveLock' THEN
ARRAY['AccessShareLock','RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
END) AND holder.locktype IS NOT DISTINCT FROM waiter.locktype AND holder.database IS NOT DISTINCT FROM waiter.database
AND holder.relation IS NOT DISTINCT FROM waiter.relation AND holder.page IS NOT DISTINCT FROM waiter.page AND
holder.tuple IS NOT DISTINCT FROM waiter.tuple AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid AND
holder.transactionid IS NOT DISTINCT FROM waiter.transactionid AND holder.classid IS NOT DISTINCT FROM waiter.classid
AND holder.objid IS NOT DISTINCT FROM waiter.objid AND holder.objsubid IS NOT DISTINCT FROM waiter.objsubid
 
[50eb2156.651e:7] DETAIL:  parameters: $1 = '25889'
TRAP: FailedAssertion("!(el == data->nelements)", File: "lock.c", Line: 3398)
[50eb2103.62ee:2] LOG:  server process (PID 25886) was terminated by signal 6: Aborted
[50eb2103.62ee:3] DETAIL:  Failed process was running: SELECT 1 FROM pg_locks holder, pg_locks waiter WHERE NOT
waiter.granted AND waiter.pid = $1 AND holder.granted AND holder.pid <> $1 AND holder.pid IN (25887, 25888, 25889) AND
holder.mode = ANY (CASE waiter.mode WHEN 'AccessShareLock' THEN ARRAY['AccessExclusiveLock'] WHEN 'RowShareLock' THEN
ARRAY['ExclusiveLock','AccessExclusiveLock'] WHEN 'RowExclusiveLock' THEN
ARRAY['ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN 'ShareUpdateExclusiveLock' THEN
ARRAY['ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN
'ShareLock' THEN
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock'] WHEN
'ShareRowExclusiveLock' THEN
ARRAY['RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','ExclusiveLock','AccessExclusiveLock']
WHEN 'ExclusiveLock' THEN
ARRAY['RowShareLock','RowExclusiveLock','ShareUpdateExclusiveLock','ShareLock','ShareRowExclusiveLock','E

The assertion failure seems to indicate that the number of
LockMethodProcLockHash entries found by hash_seq_search didn't match the
number that had been counted by hash_get_num_entries immediately before
that.  I don't see any bug in GetLockStatusData itself, so this suggests
that there's something wrong with dynahash's entry counting, or that
somebody somewhere is modifying the shared hash table without holding
the appropriate lock.  The latter seems a bit more likely, given that
this must be a very low-probability bug or we'd have seen it before.
An overlooked locking requirement in a seldom-taken code path would fit
the symptoms.
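
To make the suspected failure mode concrete, here is a minimal sketch of the
count-then-scan pattern the assertion guards.  It is not the lock.c code and
it does not use dynahash: ToyTable and every toy_* name are hypothetical
stand-ins, and the "between" hook merely simulates a writer that touches the
table without holding the lock the reader relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 8

/* Hypothetical stand-in for a shared hash table; this is NOT dynahash. */
typedef struct ToyTable
{
    bool used[TOY_CAPACITY];    /* which slots currently hold an entry */
    int  nentries;              /* maintained entry counter */
} ToyTable;

/* Callback type used to simulate activity between the count and the scan. */
typedef void (*ToyHook) (ToyTable *t);

void
toy_init(ToyTable *t)
{
    assert(t != NULL);
    for (int i = 0; i < TOY_CAPACITY; i++)
        t->used[i] = false;
    t->nentries = 0;
}

/* Insert one entry into the first free slot; returns false if full. */
bool
toy_insert(ToyTable *t)
{
    for (int i = 0; i < TOY_CAPACITY; i++)
    {
        if (!t->used[i])
        {
            t->used[i] = true;
            t->nentries++;
            return true;
        }
    }
    return false;
}

/* Plays the role of asking the table how many entries it has. */
int
toy_num_entries(const ToyTable *t)
{
    return t->nentries;
}

/*
 * Count the entries, then scan the table and count again by hand.
 * Returns true when both numbers agree, i.e. when a check of the form
 * "el == nelements" would hold.  "between" may be NULL.
 */
bool
toy_snapshot_consistent(ToyTable *t, ToyHook between)
{
    int nelements = toy_num_entries(t);
    int el = 0;

    if (between != NULL)
        between(t);             /* simulated writer that skipped the lock */

    for (int i = 0; i < TOY_CAPACITY; i++)
    {
        if (t->used[i])
            el++;
    }
    return el == nelements;
}

/* A writer that modifies the table behind the reader's back. */
void
toy_rogue_insert(ToyTable *t)
{
    (void) toy_insert(t);
}
```

With no interference, toy_snapshot_consistent(&t, NULL) reports agreement;
passing toy_rogue_insert as the hook makes the scan find one more entry than
was counted, which is the kind of mismatch the failed Assert reports.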

Or maybe bushpig just had some weird cosmic-ray hardware failure,
but I don't put a lot of faith in such explanations.

Thoughts?
        regards, tom lane


