Discussion: <IDLE> and waiting
Hi guys,

I saw strange behaviour on one of the production boxes. pg_stat_activity shows a process as <IDLE> and yet 'waiting'! On top of that (understandably, since it's IDLE), there are no entries for this pid in pg_locks.

Following are snapshots of the two system views.

 procpid |     current_query     | waiting |     duration     |         backend_start
---------+-----------------------+---------+------------------+-------------------------------
   20762 | <IDLE>                | f       |                  | 2008-01-31 13:38:30.848898-08
   19776 | <IDLE>                | t       | 00:38:34.76833   | 2008-01-31 12:51:29.005744-08
   20356 | <IDLE>                | f       | 00:38:29.971425  | 2008-01-31 13:17:37.617497-08
   19775 | <IDLE>                | f       | 00:38:27.187201  | 2008-01-31 12:51:28.999242-08
   19774 | <IDLE>                | f       | 00:38:27.187068  | 2008-01-31 12:51:28.90554-08
   20728 | <IDLE>                | f       | 00:14:03.913027  | 2008-01-31 13:36:11.345822-08
    9727 | <IDLE>                | f       | 00:03:07.444273  | 2008-01-24 22:25:00.289931-08
    9684 | <IDLE>                | f       | 00:00:07.704656  | 2008-01-24 22:22:00.007377-08
   19390 | <IDLE> in transaction | f       | 00:00:00.027585  | 2008-01-31 12:30:07.999246-08
   19389 | <IDLE> in transaction | t       | -00:00:00.000255 | 2008-01-31 12:30:07.973868-08

select * from pg_locks where pid in ( 19776, 19389 );

   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction |  pid  |       mode       | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+------------------+---------
 relation      |    16584 |    17070 |      |       |               |         |       |          |  3700350056 | 19389 | RowExclusiveLock | t
 relation      |    16584 |    17106 |      |       |               |         |       |          |  3700350056 | 19389 | RowExclusiveLock | t
 relation      |    16584 |    17068 |      |       |               |         |       |          |  3700350056 | 19389 | RowExclusiveLock | t
 transactionid |          |          |      |       |    3700350056 |         |       |          |  3700350056 | 19389 | ExclusiveLock    | t
 relation      |    16584 |    17108 |      |       |               |         |       |          |  3700350056 | 19389 | RowExclusiveLock | t
(5 rows)

The 'duration' column above is just now() - query_start. These are not just two instantaneous snapshots; we could see this output consistently for quite a long time.

I tracked the 'waiting' column a little in the source code, and saw that it is actually generated from PgBackendStatus.st_waiting. Is it possible that, for some reason, postgres forgot to update this for a backend?

select version();
                                          version
--------------------------------------------------------------------------------------------
 PostgreSQL 8.2.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.0 (SUSE Linux)

This issue has been seen twice now.

-- 
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *

http://gurjeet.frihost.net

Mail sent from my BlackLaptop device
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes: > I saw a strange behaviour on one of the production boxes. The > pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of > it (understandably, since its IDLE), there are no entries for this pid in > pg_locks! Hmm, I can reproduce something like this by aborting a wait for lock. It seems the problem is that WaitOnLock() is ignoring its own good advice, assuming that it can do cleanup work after waiting. regards, tom lane
The situation seems pretty bad!!

Here are the steps to reproduce in 'PostgreSQL 8.3beta2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)':

session 1: begin;
session 1: update test set a = 112 where a = 112;
session 2: update test set a = 113 where a = 112; -- waits
session 1: select * from pg_stat_activity; -- executed this a few times before executing 'select version()' and then the following:
session 1: <stat query1> -- see end of mail for this query

 procpid |             current_query              | waiting |     duration     |         backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
   12577 | update test set a = 113 where a = 112; | t       | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
   11975 | select * from pg_stat_activity ;       | f       | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)

session 1: select * from pg_locks

   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |     mode      | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------
 transactionid |          |          |      |       |            |           390 |         |       |          | 2/14               | 12577 | ShareLock     | f
 transactionid |          |          |      |       |            |           390 |         |       |          | 1/9                | 11975 | ExclusiveLock | t
 <snip irrelevant>

Then,
session 2: ^C
Cancel request sent
ERROR: canceling statement due to user request

session 1: <stat query1>

 procpid |             current_query              | waiting |     duration     |         backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
   12577 | update test set a = 113 where a = 112; | t       | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
   11975 | select * from pg_stat_activity ;       | f       | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)

session 1: select * from pg_locks ;

<no traces of pid 12577>

session 1: select pg_backend_pid();

 pg_backend_pid
----------------
          11975

The last mentioned output of <stat query> shows session 1 executing 'select * from p_s_a', whereas the <stat query> _is_ being executed in session 1!!! This result is consistently returned for a while, and later...

session 2: select pg_backend_pid();

 pg_backend_pid
----------------
          12577

session 1: <stat query1>

 procpid |     current_query     | waiting |    duration     |         backend_start
---------+-----------------------+---------+-----------------+-------------------------------
   11975 | <IDLE> in transaction | f       | 00:06:08.671029 | 2008-02-01 13:30:40.396392-08
(1 row)

After a while again:

session 1: <stat query2> -- notice 2 not 1; 'select *' comes back to haunt!!!

 procpid |             current_query              | waiting |     duration     |         backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
   12577 | update test set a = 113 where a = 112; | t       | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
   11975 | select * from pg_stat_activity ;       | f       | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)

session 1: <stat query1> -- 1 back in action

 procpid |             current_query              | waiting |     duration     |         backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
   12577 | update test set a = 113 where a = 112; | t       | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
   11975 | select * from pg_stat_activity ;       | f       | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)

The <stat query1> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query <> '<IDLE>'
 and current_query not like '%DONT COUNT ME1%'
order by duration desc
limit 10;

The <stat query2> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query not like '%DONT COUNT ME1 %'
order by duration desc
limit 10;

Found more bugs than I was looking for, to reproduce!!!

The reporter also made an observation (on 8.2.4) that there were deadlocks detected at around the same time. Looked at WaitOnLock(), and clearly there's a problem, but is it at the same/only place we are suspecting it to be?

Best regards,

PS: Ran the <stat query>ies 1 and 2 again, just before hitting 'send', and the result is the same:

 procpid |             current_query              | waiting |     duration     |         backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
   12577 | update test set a = 113 where a = 112; | t       | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
   11975 | select * from pg_stat_activity ;       | f       | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)

Clearly, there's something wrong.

On Feb 1, 2008 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> > I saw a strange behaviour on one of the production boxes. The
> > pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
> > it (understandably, since its IDLE), there are no entries for this pid in
> > pg_locks!
>
> Hmm, I can reproduce something like this by aborting a wait for lock.
> It seems the problem is that WaitOnLock() is ignoring its own good
> advice, assuming that it can do cleanup work after waiting.
>
> regards, tom lane
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes: > The situation seems pretty bad!! I think at least part of your problem is not understanding that a single transaction sees a frozen snapshot of pg_stat_activity. regards, tom lane
On Feb 1, 2008 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> > The situation seems pretty bad!!
>
> I think at least part of your problem is not understanding that a single
> transaction sees a frozen snapshot of pg_stat_activity.

It does! I assumed that pg_stat_activity produced a transaction-independent snapshot of internal memory structures! Is that the case with pg_locks too!? I hope not.

BTW, we cannot say that pg_stat_activity behaves in a consistent manner (transaction-wise). From what I could infer, this view's results are frozen when you first query the view, not when the transaction started (which is how other (normal) relations behave). It's a bit confusing, and should be documented if this is the way it is intended to work; something along the lines of: "In a transaction, this view will repeatedly show the same results that were returned by its first invocation in the transaction." in a less confusing way :)

So we are back to the original problem... Canceling a 'waiting' transaction does not revert the session's 'waiting' state back to 'false' (consistently reproducible).
I wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
>> I saw a strange behaviour on one of the production boxes. The
>> pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
>> it (understandably, since its IDLE), there are no entries for this pid in
>> pg_locks!

> Hmm, I can reproduce something like this by aborting a wait for lock.
> It seems the problem is that WaitOnLock() is ignoring its own good
> advice, assuming that it can do cleanup work after waiting.

I've committed a fix for this. (Too late for 8.3.0, unfortunately.)

			regards, tom lane
On Feb 2, 2008 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've committed a fix for this. (Too late for 8.3.0, unfortunately.)
Thanks. As with 8.2, can it not be back-patched to 8.3 too?
I just looked at the patch... Isn't PG_TRY() an expensive call to make in the lock.c code? I was thinking of registering a Xact callback using RegisterXactCallback() and performing 'waiting' reset in that callback if the Xact event is XACT_EVENT_ABORT.
That would have been compliant with the previous comments ('if we fail, any cleanup must happen in xact abort processing, not here').
Comments?
Best regards,
Gurjeet Singh wrote:
> I just looked at the patch... Isn't PG_TRY() an expensive call to make in
> the lock.c code? I was thinking of registering a Xact callback using
> RegisterXactCallback() and performing 'waiting' reset in that callback if
> the Xact event is XACT_EVENT_ABORT.

PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and
another stack frame.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
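
To make the cost argument concrete, here is a rough sketch of a try/catch built on sigsetjmp(), in the spirit of PG_TRY. It is a simplified stand-in, not the actual macro or the committed patch: `exception_stack`, `throw_error`, `wait_on_lock` and `run_wait` are hypothetical names, and the static `st_waiting` merely imitates the flag behind the 'waiting' column. The point is that guarding the wait costs one sigsetjmp() call and one saved pointer, and that the cleanup then runs on the error path too.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of this is PostgreSQL code. */
static sigjmp_buf *exception_stack = NULL;  /* innermost active handler */
static bool st_waiting = false;             /* imitates the 'waiting' flag */

/* Unwinds to the innermost handler. */
static void throw_error(void)
{
    assert(exception_stack != NULL);
    siglongjmp(*exception_stack, 1);
}

/* Simulates sleeping on a lock; a cancel request raises an error. */
static void sleep_on_lock(bool cancelled)
{
    if (cancelled)
        throw_error();
}

/* The wait is guarded so the flag is reset on both exit paths. */
static void wait_on_lock(bool cancelled)
{
    sigjmp_buf *save_stack = exception_stack;  /* the "try" setup */
    sigjmp_buf  local_buf;

    st_waiting = true;
    if (sigsetjmp(local_buf, 0) == 0)
    {
        exception_stack = &local_buf;
        sleep_on_lock(cancelled);
        exception_stack = save_stack;
    }
    else
    {
        /* "catch": undo our state, then pass the error to the outer handler */
        exception_stack = save_stack;
        st_waiting = false;
        throw_error();
    }
    st_waiting = false;                        /* normal path */
}

/* Returns true if the wait failed; *flag_after receives the flag value. */
bool run_wait(bool cancelled, bool *flag_after)
{
    sigjmp_buf outer;

    if (sigsetjmp(outer, 0) == 0)
    {
        exception_stack = &outer;
        wait_on_lock(cancelled);
        exception_stack = NULL;
        *flag_after = st_waiting;
        return false;
    }
    exception_stack = NULL;
    *flag_after = st_waiting;
    return true;
}
```

In this sketch a cancelled wait still reports failure to the caller, but the flag reads false afterwards.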
On Feb 2, 2008 3:27 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Gurjeet Singh wrote:
>
> > I just looked at the patch... Isn't PG_TRY() an expensive call to make in
> > the lock.c code? I was thinking of registering a Xact callback using
> > RegisterXactCallback() and performing 'waiting' reset in that callback if
> > the Xact event is XACT_EVENT_ABORT.
>
> PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and
> another stack frame.

That's why I asked. I assumed that creating stack frames was expensive. Isn't this the reason compilers came up with function inlining: to avoid stack frames, because they can be expensive? Or am I confusing two different types of stacks?

Moreover, calling a callback once in a while (only upon Xact abort) may prove to be much cheaper than setting up an additional stack frame on every lock-acquire call.

Really, my 2 cents.
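
The callback alternative argued for here could look roughly like the sketch below. It is a toy model in plain C, not PostgreSQL's transaction machinery: `register_xact_callback`, `fire_xact_callbacks`, the `xact_event` enum and `waiting_after_cancelled_wait` are hypothetical names that only mirror the shape of RegisterXactCallback() and XACT_EVENT_ABORT as mentioned in the thread. The assumption is that abort processing always runs after a cancelled wait, so nothing extra is set up on the lock-wait path itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an end-of-transaction callback registry. */
typedef enum { EVENT_COMMIT, EVENT_ABORT } xact_event;
typedef void (*xact_callback)(xact_event event, void *arg);

#define MAX_CALLBACKS 8
static struct { xact_callback fn; void *arg; } callbacks[MAX_CALLBACKS];
static size_t n_callbacks = 0;

/* Remembers a callback to be invoked at every end of transaction. */
void register_xact_callback(xact_callback fn, void *arg)
{
    assert(n_callbacks < MAX_CALLBACKS);
    callbacks[n_callbacks].fn = fn;
    callbacks[n_callbacks].arg = arg;
    n_callbacks++;
}

/* Invoked by the simulated transaction machinery on commit or abort. */
void fire_xact_callbacks(xact_event event)
{
    for (size_t i = 0; i < n_callbacks; i++)
        callbacks[i].fn(event, callbacks[i].arg);
}

static bool st_waiting = false;   /* imitates the 'waiting' flag */

/* The cleanup lives in abort processing rather than next to the wait. */
static void reset_waiting_on_abort(xact_event event, void *arg)
{
    (void) arg;                   /* unused in this sketch */
    if (event == EVENT_ABORT)
        st_waiting = false;
}

/* Simulates a wait that gets cancelled and the abort that follows it. */
bool waiting_after_cancelled_wait(void)
{
    static bool registered = false;

    if (!registered)
    {
        register_xact_callback(reset_waiting_on_abort, NULL);
        registered = true;
    }
    st_waiting = true;                 /* the backend starts waiting */
    fire_xact_callbacks(EVENT_ABORT);  /* the cancel aborts the transaction */
    return st_waiting;
}
```

The trade-off in this design is that any state the cleanup needs has to be reachable from the callback, not just from the function that did the waiting.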
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gurjeet Singh wrote:
>> I just looked at the patch... Isn't PG_TRY() an expensive call to make in
>> the lock.c code? I was thinking of registering a Xact callback using
>> RegisterXactCallback() and performing 'waiting' reset in that callback if
>> the Xact event is XACT_EVENT_ABORT.

> PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and
> another stack frame.

Also, since we're about to block here, shaving microseconds is not all
that important.

The reason I did it that way was to avoid having to export the saved
ps-display string out to someplace LockWaitCancel could find it.

			regards, tom lane