Discussion: deadlock error - version 8.4 on CentOS 6
Hello List,

I am occasionally seeing the following error:

ALERT 3 sqlcode=-400 errmsg=deadlock detected on line 3351

from an application written using ecpg when trying an update to the table.
Can autovacuum be causing this, since no one else is updating this
database table?

Thanks,

--
Stephen Clark
On 10/28/2016 05:28 AM, Steve Clark wrote:
> Hello List,
>
> I am occasionally seeing the following error:
> ALERT 3 sqlcode=-400 errmsg=deadlock detected on line 3351

So what exactly is it doing at line 3351?

> from an application written using ecpg when trying an update to the table.
> Can autovacuum be causing this,
> since no one else is updating this database table.

Is there more than one instance of the application running?

> Thanks,

--
Adrian Klaver
adrian.klaver@aklaver.com
On 10/28/2016 09:15 AM, Adrian Klaver wrote:
> On 10/28/2016 05:28 AM, Steve Clark wrote:
>> Hello List,
>>
>> I am occasionally seeing the following error:
>> ALERT 3 sqlcode=-400 errmsg=deadlock detected on line 3351
> So what exactly is it doing at line 3351?
>> from an application written using ecpg when trying an update to the table.
>> Can autovacuum be causing this,
>> since no one else is updating this database table.
> Is there more than one instance of the application running?

No. But I examined the pg_log/log_file and saw an error indicating it was
autovacuum:
2016-10-27 09:47:02 EDT:srm2api:12968:LOG: sending cancel to blocking autovacuum PID 12874
2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL: Process 12968 waits for ExclusiveLock on relation 955454549 of database 955447411.
2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT: lock table t_unit_status_log in exclusive mode
2016-10-27 09:47:02 EDT::12874:ERROR: canceling autovacuum task
2016-10-27 09:47:02 EDT::12874:CONTEXT: automatic vacuum of table "srm2.public.t_unit_status_log"
2016-10-27 09:47:02 EDT:srm2api:9189:ERROR: deadlock detected at character 8
2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL: Process 9189 waits for RowExclusiveLock on relation 955454549 of database 955447411; blocked by process 12968.
Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process 9189.
Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by = current_user , devices_down = $1 where unit_serial_no = $2
Process 12968: lock table t_unit in exclusive mode
This is at the same time and same table that my application reported the error on.
So I feel pretty confident this is the issue. I guess I should retry the update in my application.
Thanks,
--
Stephen Clark
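
For what it's worth, a retry along those lines might look roughly like
this in ecpg. This is a minimal sketch, not the poster's code: the
function name, the host-variable declarations, and the three-tries /
one-second back-off are invented here; only the UPDATE statement itself
comes from the log above. Note that ecpg reports server errors with the
generic sqlcode -400, so the SQLSTATE is the reliable thing to test for
a deadlock:

    EXEC SQL INCLUDE sqlca;
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical wrapper: retry the update a few times if the server
     * reports deadlock_detected (SQLSTATE 40P01). */
    int update_unit_status(int devices_down_in, const char *serial)
    {
        EXEC SQL BEGIN DECLARE SECTION;
        int  devices_down = devices_down_in;
        char serial_no[64];
        EXEC SQL END DECLARE SECTION;
        int  retries = 3;
        char state[6];

        snprintf(serial_no, sizeof(serial_no), "%s", serial);

        for (;;)
        {
            EXEC SQL UPDATE t_unit_status_log
                        SET status_date = now(), unit_active = 'y',
                            last_updated_date = now(),
                            last_updated_by = current_user,
                            devices_down = :devices_down
                      WHERE unit_serial_no = :serial_no;

            if (sqlca.sqlcode == 0)
            {
                EXEC SQL COMMIT;
                return 0;                 /* success */
            }

            /* Save the SQLSTATE before ROLLBACK overwrites sqlca. */
            memcpy(state, sqlca.sqlstate, 5);
            state[5] = '\0';
            EXEC SQL ROLLBACK;

            if (strcmp(state, "40P01") == 0 && retries-- > 0)
            {
                sleep(1);                 /* back off, then retry */
                continue;
            }
            return -1;                    /* other error, or out of retries */
        }
    }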
Steve Clark <steve.clark@netwolves.com> writes:
> I am occasionally seeing the following error:
> ALERT 3 sqlcode=-400 errmsg=deadlock detected on line 3351

This isn't nearly enough information to determine what is going on.
But if that is a report of a server-detected deadlock error, there
should be more information about it in the postmaster log.

			regards, tom lane
On 10/28/16 9:27 AM, Steve Clark wrote:
> On 10/28/2016 09:15 AM, Adrian Klaver wrote:
>> On 10/28/2016 05:28 AM, Steve Clark wrote:
>>> Hello List,
>>>
>>> I am occasionally seeing the following error:
>>> ALERT 3 sqlcode=-400 errmsg=deadlock detected on line 3351
>> So what exactly is it doing at line 3351?
>>
>>> from an application written using ecpg when trying an update to the table.
>>> Can autovacuum be causing this,
>>> since no one else is updating this database table.
>> Is there more than one instance of the application running?
>>
>>> Thanks,
>>>
> No. But I examined the pg_log/log_file and saw an error indicating it
> was autovacuum:
>
> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG: sending cancel to blocking
> autovacuum PID 12874
> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL: Process 12968 waits for
> ExclusiveLock on relation 955454549 of database 955447411.
> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT: lock table
> t_unit_status_log in exclusive mode
> 2016-10-27 09:47:02 EDT::12874:ERROR: canceling autovacuum task
> 2016-10-27 09:47:02 EDT::12874:CONTEXT: automatic vacuum of table
> "srm2.public.t_unit_status_log"
> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR: deadlock detected at
> character 8
> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL: Process 9189 waits for
> RowExclusiveLock on relation 955454549 of database 955447411; blocked
> by process 12968.
> Process 12968 waits for ExclusiveLock on relation 955454518 of
> database 955447411; blocked by process 9189.
> Process 9189: update t_unit_status_log set status_date = now (
> ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by
> = current_user , devices_down = $1 where unit_serial_no = $2
> Process 12968: lock table t_unit in exclusive mode
>
> This is at the same time and same table that my application reported
> the error on.
>
> So I feel pretty confident this is the issue. I guess I should retry
> the update in my application.
>
> Thanks,

The problem is that you're doing:

  'LOCK TABLE t_unit_status_log'

If you were executing normal updates, autovacuum would be fine. Remove
the exclusive lock from your code and you'll be fine.

--Scott

> --
> Stephen Clark

--
Scott Mead
Sr. Architect
OpenSCG
Steve Clark <steve.clark@netwolves.com> writes:
> No. But I examined the pg_log/log_file and saw an error indicating it was autovacuum:
> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG: sending cancel to blocking autovacuum PID 12874
> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL: Process 12968 waits for ExclusiveLock on relation 955454549 of database 955447411.
> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT: lock table t_unit_status_log in exclusive mode
> 2016-10-27 09:47:02 EDT::12874:ERROR: canceling autovacuum task
> 2016-10-27 09:47:02 EDT::12874:CONTEXT: automatic vacuum of table "srm2.public.t_unit_status_log"

That kicked the autovacuum off the table, but it didn't help because you
still had a deadlock condition afterwards:

> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR: deadlock detected at character 8
> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL: Process 9189 waits for RowExclusiveLock on relation 955454549 of database 955447411; blocked by process 12968.
> Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process 9189.
> Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by = current_user , devices_down = $1 where unit_serial_no = $2
> Process 12968: lock table t_unit in exclusive mode

> So I feel pretty confident this is the issue. I guess I should retry the update in my application.

Retrying might be a usable band-aid, but really this is an application
logic error. The code that is trying to do "lock table t_unit in
exclusive mode" must already hold some lower-level lock on t_unit, which
is blocking whatever the "update t_unit_status_log" command wants to do
with t_unit. Looks like a classic lock-strength-upgrade mistake to me.

			regards, tom lane
On 10/28/2016 09:48 AM, Tom Lane wrote:
> Steve Clark <steve.clark@netwolves.com> writes:
>> No. But I examined the pg_log/log_file and saw an error indicating it was autovacuum:
>> 2016-10-27 09:47:02 EDT:srm2api:12968:LOG: sending cancel to blocking autovacuum PID 12874
>> 2016-10-27 09:47:02 EDT:srm2api:12968:DETAIL: Process 12968 waits for ExclusiveLock on relation 955454549 of database 955447411.
>> 2016-10-27 09:47:02 EDT:srm2api:12968:STATEMENT: lock table t_unit_status_log in exclusive mode
>> 2016-10-27 09:47:02 EDT::12874:ERROR: canceling autovacuum task
>> 2016-10-27 09:47:02 EDT::12874:CONTEXT: automatic vacuum of table "srm2.public.t_unit_status_log"
> That kicked the autovacuum off the table, but it didn't help because you
> still had a deadlock condition afterwards:
>
>> 2016-10-27 09:47:02 EDT:srm2api:9189:ERROR: deadlock detected at character 8
>> 2016-10-27 09:47:02 EDT:srm2api:9189:DETAIL: Process 9189 waits for RowExclusiveLock on relation 955454549 of database 955447411; blocked by process 12968.
>> Process 12968 waits for ExclusiveLock on relation 955454518 of database 955447411; blocked by process 9189.
>> Process 9189: update t_unit_status_log set status_date = now ( ) , unit_active = 'y' , last_updated_date = now ( ) , last_updated_by = current_user , devices_down = $1 where unit_serial_no = $2
>> Process 12968: lock table t_unit in exclusive mode
>> So I feel pretty confident this is the issue. I guess I should retry the update in my application.
> Retrying might be a usable band-aid, but really this is an application
> logic error. The code that is trying to do "lock table t_unit in
> exclusive mode" must already hold some lower-level lock on t_unit, which
> is blocking whatever the "update t_unit_status_log" command wants to do
> with t_unit. Looks like a classic lock-strength-upgrade mistake to me.
>
> 			regards, tom lane

Oops - I forgot there is another process that runs every minute and takes
about 1 second to run that does an exclusive lock on t_unit and
t_unit_status_log. I only see this error maybe once or twice a day, so I
am thinking of waiting 1 second and retrying when I see this error.
Thoughts?

--
Stephen Clark
Steve Clark <steve.clark@netwolves.com> writes:
> On 10/28/2016 09:48 AM, Tom Lane wrote:
>> Retrying might be a usable band-aid, but really this is an application
>> logic error. The code that is trying to do "lock table t_unit in
>> exclusive mode" must already hold some lower-level lock on t_unit, which
>> is blocking whatever the "update t_unit_status_log" command wants to do
>> with t_unit. Looks like a classic lock-strength-upgrade mistake to me.

> Oops - I forgot there is another process that runs every minute and
> takes about 1 second to run that does an exclusive lock on t_unit and
> t_unit_status_log.

The problem here doesn't seem to be that; it's that whatever transaction
is doing the "lock table" has *already* got a non-exclusive lock on
t_unit. That's just bad programming. Take the strongest lock you need
earliest in the transaction.

			regards, tom lane
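
To illustrate the ordering Tom describes, here is a hypothetical pair of
transaction shapes in plain SQL. Only the table names come from this
thread; the SELECT is invented as a stand-in for whatever statement first
touches t_unit:

    -- Deadlock-prone shape: the transaction has already touched t_unit
    -- (implicitly taking a weak lock) before requesting the strong lock.
    BEGIN;
    SELECT count(*) FROM t_unit;                     -- acquires ACCESS SHARE
    LOCK TABLE t_unit IN EXCLUSIVE MODE;             -- lock-strength upgrade
    LOCK TABLE t_unit_status_log IN EXCLUSIVE MODE;
    -- ... work ...
    COMMIT;

    -- Safer shape: take the strongest locks first, always in the same order.
    BEGIN;
    LOCK TABLE t_unit IN EXCLUSIVE MODE;
    LOCK TABLE t_unit_status_log IN EXCLUSIVE MODE;
    SELECT count(*) FROM t_unit;
    -- ... work ...
    COMMIT;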
On 10/28/2016 10:25 AM, Tom Lane wrote:
> Steve Clark <steve.clark@netwolves.com> writes:
>> On 10/28/2016 09:48 AM, Tom Lane wrote:
>>> Retrying might be a usable band-aid, but really this is an application
>>> logic error. The code that is trying to do "lock table t_unit in
>>> exclusive mode" must already hold some lower-level lock on t_unit, which
>>> is blocking whatever the "update t_unit_status_log" command wants to do
>>> with t_unit. Looks like a classic lock-strength-upgrade mistake to me.
>> Oops - I forgot there is another process that runs every minute and
>> takes about 1 second to run that does an exclusive lock on t_unit and
>> t_unit_status_log.
> The problem here doesn't seem to be that; it's that whatever transaction
> is doing the "lock table" has *already* got a non-exclusive lock on
> t_unit. That's just bad programming. Take the strongest lock you need
> earliest in the transaction.
>
> 			regards, tom lane

I want to thank all the people that took the time to provide some
elucidation on my problem.

The original code that was doing the exclusive locks was written in 2003
on ver 7.x which, according to comments in the code, did not support
declaring a cursor for update in ecpg, so the programmer at that time
opted to lock the two tables.

I just changed the code to remove the two exclusive locks and use "for
update" on the cursor, and haven't seen a lock issue in the pg_log file
since.

Regards,

--
Stephen Clark
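
The fix he describes might be sketched in ecpg like this, with the same
caveats as before: the function name and column subset are invented, it
assumes t_unit carries a unit_serial_no column, and the real loop
presumably does more per row. The point is that FOR UPDATE row locks
replace the old table-level exclusive locks:

    EXEC SQL INCLUDE sqlca;

    /* Hypothetical shape of the every-minute job after the change. */
    void refresh_unit_status(void)
    {
        EXEC SQL BEGIN DECLARE SECTION;
        char serial_no[64];
        EXEC SQL END DECLARE SECTION;

        EXEC SQL DECLARE unit_cur CURSOR FOR
            SELECT unit_serial_no FROM t_unit FOR UPDATE;

        EXEC SQL OPEN unit_cur;
        for (;;)
        {
            EXEC SQL FETCH unit_cur INTO :serial_no;
            if (sqlca.sqlcode != 0)     /* 100 = no more rows; <0 = error */
                break;

            EXEC SQL UPDATE t_unit_status_log
                        SET status_date = now(), last_updated_date = now()
                      WHERE unit_serial_no = :serial_no;
        }
        EXEC SQL CLOSE unit_cur;
        EXEC SQL COMMIT;                /* releases the row locks */
    }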