Обсуждение: deadlock error - version 8.4 on CentOS 6

Поиск
Список
Период
Сортировка

deadlock error - version 8.4 on CentOS 6

От
Steve Clark
Дата:
Hello List,

I am occasionally seeing the following error:
ALERT  3 sqlcode=-400 errmsg=deadlock detected on line 3351
from an application written using ecpg when trying an update to the table.
Can autovacuum be causing this,
since no one else is updating this database table.

Thanks,


--
Stephen Clark



Re: deadlock error - version 8.4 on CentOS 6

От
Adrian Klaver
Дата:
On 10/28/2016 05:28 AM, Steve Clark wrote:
> Hello List,
>
> I am occasionally seeing the following error:
> ALERT  3 sqlcode=-400 errmsg=deadlock detected on line 3351

So what exactly is it doing at line 3351?

> from an application written using ecpg when trying an update to the table.
> Can autovacuum be causing this,
> since no one else is updating this database table.

Is there more then one instance of the application running?

>
> Thanks,
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: deadlock error - version 8.4 on CentOS 6

От
Steve Clark
Дата:
On 10/28/2016 09:15 AM, Adrian Klaver wrote:
On 10/28/2016 05:28 AM, Steve Clark wrote:
Hello List,

I am occasionally seeing the following error:
ALERT  3 sqlcode=-400 errmsg=deadlock detected on line 3351
So what exactly is it doing at line 3351?

from an application written using ecpg when trying an update to the table.
Can autovacuum be causing this,
since no one else is updating this database table.
Is there more then one instance of the application running?

Thanks,



No. But I examined the pg_log/log_file and saw an error indicating it was autovacuum:


2016-10-27 09:47:02 EDT:srm2api:12968:LOG:  sending cancel to blocking autovacuum PID 12874
2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL:  Process 12968 waits for ExclusiveLock on relation 955454549 of database 955447411.
2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT:  lock table t_unit_status_log in exclusive mode
2016-10-27 09:47:02 EDT::12874:ERROR:  canceling autovacuum task
2016-10-27 09:47:02 EDT::12874:CONTEXT:  automatic vacuum of table "srm2.public.t_unit_status_log"
2016-10-27 09:47:02 EDT:srm2api:9189:ERROR:  deadlock detected at character 8
2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL:  Process 9189 waits for RowExclusiveLock on relation 955454549 of database 955447411; blocked by process 12968.
        Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process 9189.
        Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by = current_user , devices_down = $1  where unit_serial_no = $2
        Process 12968: lock table t_unit in exclusive mode

This is at the same time and same table that my application reported the error on.

So I feel pretty confident this is the issue. I guess I should retry the update in my application.

Thanks,



--
Stephen Clark

Re: deadlock error - version 8.4 on CentOS 6

От
Tom Lane
Дата:
Steve Clark <steve.clark@netwolves.com> writes:
> I am occasionally seeing the following error:
> ALERT  3 sqlcode=-400 errmsg=deadlock detected on line 3351

This isn't nearly enough information to determine what is going on.
But if that is a report of a server-detected deadlock error, there
should be more information about it in the postmaster log.

            regards, tom lane


Re: deadlock error - version 8.4 on CentOS 6

От
Scott Mead
Дата:

On 10/28/16 9:27 AM, Steve Clark wrote:
> On 10/28/2016 09:15 AM, Adrian Klaver wrote:
>> On 10/28/2016 05:28 AM, Steve Clark wrote:
>>> Hello List,
>>>
>>> I am occasionally seeing the following error:
>>> ALERT  3 sqlcode=-400 errmsg=deadlock detected on line 3351
>> So what exactly is it doing at line 3351?
>>
>>> from an application written using ecpg when trying an update to the table.
>>> Can autovacuum be causing this,
>>> since no one else is updating this database table.
>> Is there more then one instance of the application running?
>>
>>> Thanks,
>>>
>>>
> No. But I examined the pg_log/log_file and saw an error indicating it
> was autovacuum:
>
>
> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG:  sending cancel to blocking
> autovacuum PID 12874
> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL:  Process 12968 waits for
> ExclusiveLock on relation 955454549 of database 955447411.
> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT:  lock table
> t_unit_status_log in exclusive mode
> 2016-10-27 09:47:02 EDT::12874:ERROR:  canceling autovacuum task
> 2016-10-27 09:47:02 EDT::12874:CONTEXT:  automatic vacuum of table
> "srm2.public.t_unit_status_log"
> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR:  deadlock detected at
> character 8
> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL:  Process 9189 waits for
> RowExclusiveLock on relation 955454549 of database 955447411; blocked
> by process 12968.
>         Process 12968 waits for ExclusiveLock on relation 955454518 of
> database 955447411; blocked by process 9189.
>         Process 9189: update t_unit_status_log set status_date = now (
> ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by
> = current_user , devices_down = $1  where unit_serial_no = $2
>         Process 12968: lock table t_unit in exclusive mode
>
> This is at the same time and same table that my application reported
> the error on.
>
> So I feel pretty confident this is the issue. I guess I should retry
> the update in my application.
>
> Thanks,

The problem is that you're doing:
'LOCK TABLE t_unit_status_log'

  If you were executing normal updates, autovacuum would be fine.
Remove the exclusive lock from your code and you'll be fine.

--Scott


>
>
>
> --
> Stephen Clark
>

--
Scott Mead
Sr. Architect
OpenSCG



Re: deadlock error - version 8.4 on CentOS 6

От
Tom Lane
Дата:
Steve Clark <steve.clark@netwolves.com> writes:
> No. But I examined the pg_log/log_file and saw an error indicating it was autovacuum:

> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG:  sending cancel to blocking autovacuum PID 12874
> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL:  Process 12968 waits for ExclusiveLock on relation 955454549 of
database955447411. 
> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT:  lock table t_unit_status_log in exclusive mode
> 2016-10-27 09:47:02 EDT::12874:ERROR:  canceling autovacuum task
> 2016-10-27 09:47:02 EDT::12874:CONTEXT:  automatic vacuum of table "srm2.public.t_unit_status_log"

That kicked the autovacuum off the table, but it didn't help because you
still had a deadlock condition afterwards:

> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR:  deadlock detected at character 8
> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL:  Process 9189 waits for RowExclusiveLock on relation 955454549 of
database955447411; blocked by process 12968. 
>          Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process 9189.
>          Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date =
now( ) , last_updated_by = current_user , devices_down = $1  where unit_serial_no = $2 
>          Process 12968: lock table t_unit in exclusive mode

> So I feel pretty confident this is the issue. I guess I should retry the update in my application.

Retrying might be a usable band-aid, but really this is an application
logic error.  The code that is trying to do "lock table t_unit in
exclusive mode" must already hold some lower-level lock on t_unit, which
is blocking whatever the "update t_unit_status_log" command wants to do
with t_unit.  Looks like a classic lock-strength-upgrade mistake to me.

            regards, tom lane


Re: deadlock error - version 8.4 on CentOS 6

От
Steve Clark
Дата:
On 10/28/2016 09:48 AM, Tom Lane wrote:
> Steve Clark <steve.clark@netwolves.com> writes:
>> No. But I examined the pg_log/log_file and saw an error indicating it was autovacuum:
>> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG:  sending cancel to blocking autovacuum PID 12874
>> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL:  Process 12968 waits for ExclusiveLock on relation 955454549 of
database955447411. 
>> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT:  lock table t_unit_status_log in exclusive mode
>> 2016-10-27 09:47:02 EDT::12874:ERROR:  canceling autovacuum task
>> 2016-10-27 09:47:02 EDT::12874:CONTEXT:  automatic vacuum of table "srm2.public.t_unit_status_log"
> That kicked the autovacuum off the table, but it didn't help because you
> still had a deadlock condition afterwards:
>
>> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR:  deadlock detected at character 8
>> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL:  Process 9189 waits for RowExclusiveLock on relation 955454549 of
database955447411; blocked by process 12968. 
>>           Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process
9189.
>>           Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date =
now( ) , last_updated_by = current_user , devices_down = $1  where unit_serial_no = $2 
>>           Process 12968: lock table t_unit in exclusive mode
>> So I feel pretty confident this is the issue. I guess I should retry the update in my application.
> Retrying might be a usable band-aid, but really this is an application
> logic error.  The code that is trying to do "lock table t_unit in
> exclusive mode" must already hold some lower-level lock on t_unit, which
> is blocking whatever the "update t_unit_status_log" command wants to do
> with t_unit.  Looks like a classic lock-strength-upgrade mistake to me.
>
>             regards, tom lane
>
Oops - I forgot there is another process that runs every minute and takes about 1 second to run that does an
exclusive lock on t_unit and t_unit_status_log.

I only see this error maybe once or twice a day, so I am thinking of waiting 1 second and retrying when I see this
error.

Thoughts?

--
Stephen Clark



Re: deadlock error - version 8.4 on CentOS 6

От
Tom Lane
Дата:
Steve Clark <steve.clark@netwolves.com> writes:
> On 10/28/2016 09:48 AM, Tom Lane wrote:
>> Retrying might be a usable band-aid, but really this is an application
>> logic error.  The code that is trying to do "lock table t_unit in
>> exclusive mode" must already hold some lower-level lock on t_unit, which
>> is blocking whatever the "update t_unit_status_log" command wants to do
>> with t_unit.  Looks like a classic lock-strength-upgrade mistake to me.

> Oops - I forgot there is another process that runs every minute and
> takes about 1 second to run that does an exclusive lock on t_unit and
> t_unit_status_log.

The problem here doesn't seem to be that; it's that whatever transaction
is doing the "lock table" has *already* got a non-exclusive lock on
t_unit.  That's just bad programming.  Take the strongest lock you need
earliest in the transaction.

            regards, tom lane


Re: deadlock error - version 8.4 on CentOS 6

От
Steve Clark
Дата:
On 10/28/2016 10:25 AM, Tom Lane wrote:
> Steve Clark <steve.clark@netwolves.com> writes:
>> On 10/28/2016 09:48 AM, Tom Lane wrote:
>>> Retrying might be a usable band-aid, but really this is an application
>>> logic error.  The code that is trying to do "lock table t_unit in
>>> exclusive mode" must already hold some lower-level lock on t_unit, which
>>> is blocking whatever the "update t_unit_status_log" command wants to do
>>> with t_unit.  Looks like a classic lock-strength-upgrade mistake to me.
>> Oops - I forgot there is another process that runs every minute and
>> takes about 1 second to run that does an exclusive lock on t_unit and
>> t_unit_status_log.
> The problem here doesn't seem to be that; it's that whatever transaction
> is doing the "lock table" has *already* got a non-exclusive lock on
> t_unit.  That's just bad programming.  Take the strongest lock you need
> earliest in the transaction.
>
>             regards, tom lane
>
I want to thank all the people that took the time to provide some elucidation on my problem.

The original code that was doing the exclusive locks was written in 2003 on ver 7.x which according to
comments in the code did not provide declaring a cursor for update in ecpg, so the programmer at that
time opted to lock the two tables.

I just changed to code to remove the two exclusive locks and use "for update" on the cursor and haven't
seen a lock issue**in the**pg_log file since.

Regards,

--
Stephen Clark

Re: deadlock error - version 8.4 on CentOS 6

От
Steve Clark
Дата:
On 10/28/2016 10:25 AM, Tom Lane wrote:
Steve Clark <steve.clark@netwolves.com> writes:
On 10/28/2016 09:48 AM, Tom Lane wrote:
Retrying might be a usable band-aid, but really this is an application
logic error.  The code that is trying to do "lock table t_unit in
exclusive mode" must already hold some lower-level lock on t_unit, which
is blocking whatever the "update t_unit_status_log" command wants to do
with t_unit.  Looks like a classic lock-strength-upgrade mistake to me.
Oops - I forgot there is another process that runs every minute and
takes about 1 second to run that does an exclusive lock on t_unit and
t_unit_status_log.
The problem here doesn't seem to be that; it's that whatever transaction
is doing the "lock table" has *already* got a non-exclusive lock on
t_unit.  That's just bad programming.  Take the strongest lock you need
earliest in the transaction.
		regards, tom lane

I want to thank all the people that took the time to provide some elucidation on my problem.

The original code that was doing the exclusive locks was written in 2003 on ver 7.x which according to
comments in the code did not provide declaring a cursor for update in ecpg, so the programmer at that
time opted to lock the two tables.

I just changed to code to remove the two exclusive locks and use "for update" on the cursor and haven't
seen a lock issue in the pg_log file since.

Regards,

--
Stephen Clark