Re: hung backends stuck in spinlock heavy endless loop

From Merlin Moncure
Subject Re: hung backends stuck in spinlock heavy endless loop
Date
Msg-id CAHyXU0yHgGHcwkS1HoHUQtgVsMvAM2pB_3qwc_OUNCm5efY3Lw@mail.gmail.com
In response to Re: hung backends stuck in spinlock heavy endless loop  (Peter Geoghegan <pg@heroku.com>)
Responses Re: hung backends stuck in spinlock heavy endless loop  (Merlin Moncure <mmoncure@gmail.com>)
Re: hung backends stuck in spinlock heavy endless loop  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Re: hung backends stuck in spinlock heavy endless loop  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Thu, Jan 15, 2015 at 5:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Jan 15, 2015 at 3:00 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Running this test on another set of hardware to verify -- if this
>> turns out to be a false alarm which it may very well be, I can only
>> offer my apologies!  I've never had a new drive fail like that, in
>> that manner.  I'll burn the other hardware in overnight and report
>> back.

Huh -- well, possibly not.  This is on a virtual machine attached to a
SAN.  It ran clean for several hours (this is 9.4 vanilla, asserts off,
checksums on), then started having issues:

[cds2 21952 2015-01-15 22:54:51.833 CST 5502]WARNING:  page
verification failed, calculated checksum 59143 but expected 59137 at
character 20
[cds2 21952 2015-01-15 22:54:51.852 CST 5502]QUERY:  DELETE FROM "onesitepmc"."propertyguestcard" t
WHERE EXISTS (
  SELECT 1 FROM "propertyguestcard_insert" d
  WHERE (t."prptyid", t."gcardid") = (d."prptyid", d."gcardid")
)
[cds2 21952 2015-01-15 22:54:51.852 CST 5502]CONTEXT:  PL/pgSQL
function cdsreconciletable(text,text,text,text,boolean) line 197 at
EXECUTE statement
SQL statement "SELECT * FROM CDSReconcileTable(t.CDSServer,
t.CDSDatabase, t.SchemaName, t.TableName)"
PL/pgSQL function cdsreconcileruntable(bigint) line 35 at SQL statement

After that, several hours of clean running, followed by:

[cds2 32353 2015-01-16 04:40:57.814 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:57.814 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.018 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.018 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]LOG:  could not send data
to client: Broken pipe
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]STATEMENT:  SELECT
CDSReconcileRunTable(1160)
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.026 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type ShareLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]CONTEXT:  PL/pgSQL
function cdsreconcileruntable(bigint) line 35 during exception cleanup
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]STATEMENT:  SELECT
CDSReconcileRunTable(1160)
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
AbortSubTransaction while in ABORT state
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type RowExclusiveLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type AccessExclusiveLock
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
AbortSubTransaction while in ABORT state
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:
AbortSubTransaction while in ABORT state
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:58.027 CST 7549]WARNING:  you don't own a
lock of type RowExclusiveLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type RowShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type ExclusiveLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
AbortSubTransaction while in ABORT state
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  did not find
subXID 7553 in MyProc
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type RowExclusiveLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type ShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type AccessExclusiveLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type ShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type AccessShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:  you don't own a
lock of type RowShareLock
[cds2 32353 2015-01-16 04:40:58.028 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.029 CST 7549]WARNING:  you don't own a
lock of type AccessExclusiveLock
[cds2 32353 2015-01-16 04:40:58.029 CST 7549]WARNING:
ReleaseLockIfHeld: failed??
[cds2 32353 2015-01-16 04:40:58.029 CST 7549]ERROR:  failed to re-find
shared lock object
[cds2 32353 2015-01-16 04:40:58.029 CST 7549]PANIC:
ERRORDATA_STACK_SIZE exceeded
[ 3093 2015-01-16 04:41:00.299 CST 0]LOG:  server process (PID 32353)
was terminated by signal 6: Aborted
[ 3093 2015-01-16 04:41:00.300 CST 0]LOG:  terminating any other
active server processes

After that, the server resumed processing without further incident.

merlin


