Обсуждение: and waiting
Hi guys,<br /><br /> I saw a strange behaviour on one of the production boxes. The pg_stat_activity shows a process
as<IDLE> and yet 'waiting' !!! On top of it (understandably, since its IDLE), there are no entries for this pid
inpg_locks!<br /><br /> Following are the snapshots of the two system views.<br /><br /><span style="font-family:
couriernew,monospace;"> procpid | current_query | waiting | duration | backend_start</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier
new,monospace;">---------+-----------------------+---------+------------------+-------------------------------</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> 20762 |
<IDLE> | f | | 2008-01-31 13:38:30.848898-08</span><br style="font-family:
couriernew,monospace;" /><span style="font-family: courier new,monospace;"> 19776 | <IDLE> |
t | 00:38:34.76833 | 2008-01-31 12:51:29.005744-08</span><br style="font-family: courier new,monospace;" /><span
style="font-family:courier new,monospace;"> 20356 | <IDLE> | f | 00:38:29.971425 |
2008-01-3113:17:37.617497-08</span><br style="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 19775 | <IDLE> | f | 00:38:27.187201 | 2008-01-31
12:51:28.999242-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 19774 | <IDLE> | f | 00:38:27.187068 | 2008-01-31
12:51:28.90554-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 20728 | <IDLE> | f | 00:14:03.913027 | 2008-01-31
13:36:11.345822-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 9727 | <IDLE> | f | 00:03:07.444273 | 2008-01-24
22:25:00.289931-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 9684 | <IDLE> | f | 00:00:07.704656 | 2008-01-24
22:22:00.007377-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 19390 | <IDLE> in transaction | f | 00:00:00.027585 | 2008-01-31
12:30:07.999246-08</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> 19389 | <IDLE> in transaction | t | -00:00:00.000255 | 2008-01-31
12:30:07.973868-08</span><brstyle="font-family: courier new,monospace;" /><br style="font-family: courier
new,monospace;"/><span style="font-family: courier new,monospace;">select * from pg_locks where pid in ( 19776, 19389
);</span><brstyle="font-family: courier new,monospace;" /><br style="font-family: courier new,monospace;" /><span
style="font-family:courier new,monospace;"> locktype | database | relation | page | tuple | transactionid |
classid| objid | objsubid | transaction | pid | mode | granted</span><br style="font-family: courier
new,monospace;"/><span style="font-family: courier
new,monospace;">---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+------------------+---------</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> relation |
16584| 17070 | | | | | | | 3700350056 | 19389 | RowExclusiveLock |
t</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> relation | 16584 | 17106 | | | | | | |
3700350056| 19389 | RowExclusiveLock | t</span><br style="font-family: courier new,monospace;" /><span
style="font-family:courier new,monospace;"> relation | 16584 | 17068 | | |
| | | | 3700350056 | 19389 | RowExclusiveLock | t</span><br style="font-family: courier
new,monospace;"/><span style="font-family: courier new,monospace;"> transactionid | | | |
| 3700350056 | | | | 3700350056 | 19389 | ExclusiveLock | t</span><br style="font-family:
couriernew,monospace;" /><span style="font-family: courier new,monospace;"> relation | 16584 | 17108 |
| | | | | | 3700350056 | 19389 | RowExclusiveLock | t</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;">(5 rows)</span><br
style="font-family:courier new,monospace;" /><br clear="all" style="font-family: courier new,monospace;" /><br />
The'duration' column above is just now()-query_start. These are not just two instant snapshots, but we could see this
outputconsistently for quite long.<br /><br /> I tracked the 'waiting' column a little bit in the source code, and
sawthat it is actually generated from PgBackendStatus.st_waiting . Is it possible that, for some reason, postgres
forgotto update this for a backend?<br /><br /><span style="font-family: courier new,monospace;">select
version();</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> version</span><br style="font-family: courier new,monospace;"
/><spanstyle="font-family: courier
new,monospace;">--------------------------------------------------------------------------------------------</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;"> PostgreSQL 8.2.4 on
x86_64-unknown-linux-gnu,compiled by GCC gcc (GCC) 4.1.0 (SUSE Linux)</span><br style="font-family: courier
new,monospace;"/><br /> This issue has been seen twice now.<br /><br />-- <br />gurjeet[.singh]@EnterpriseDB.com<br
/>singh.gurjeet@{gmail | hotmail | indiatimes | yahoo }.com<br /><br />EnterpriseDB <a
href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br/><br />17° 29' 34.37"N, 78° 30' 59.76"E -
Hyderabad<br/>18° 32' 57.25"N, 73° 56' 25.42"E - Pune<br />37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *<br
/><br/><a href="http://gurjeet.frihost.net">http://gurjeet.frihost.net</a><br /><br />Mail sent from my BlackLaptop
device
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> I saw a strange behaviour on one of the production boxes. The
> pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
> it (understandably, since its IDLE), there are no entries for this pid in
> pg_locks!
Hmm, I can reproduce something like this by aborting a wait for lock.
It seems the problem is that WaitOnLock() is ignoring its own good
advice, assuming that it can do cleanup work after waiting.
regards, tom lane
The situation seems pretty bad!!
Here are the steps to reproduce in 'PostgreSQL 8.3beta2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)':
session 1: begin;
session 1: update test set a = 112 where a = 112;
session 2: update test set a = 113 where a = 112; --waits
session 1: select * from pg_stat_activity; -- executed this a few times before executing 'select version()' and then following:
session 1: <stat query1> -- see end of mail for this query
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: select * from pg_locks
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+---------------
-----+-------+------------------+---------
transactionid | | | | | | 390 | | | | 2/14 | 12577 | ShareLock | f
transactionid | | | | | | 390 | | | | 1/9 | 11975 | ExclusiveLock | t
<snip irrelevant>
Then,
session 2: ^C
Cancel request sent
ERROR: canceling statement due to user request
session1: <stat query1>
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: select * from pg_locks ;
<no traces of pid 12577>
session 1: select pg_backend_pid();
pg_backend_pid
----------------
11975
The last mentioned output of <stat query> shows session 1 executing 'select * from p_s_a', whereas the <stat query> _is_ being executed in session 1!!! This result is consistently returned for a while, and later...
session 2: select pg_backend_pid();
pg_backend_pid
----------------
12577
session 1: <stat query1>
procpid | current_query | waiting | duration | backend_start
---------+-----------------------+---------+-----------------+-------------------------------
11975 | <IDLE> in transaction | f | 00:06:08.671029 | 2008-02-01 13:30:40.396392-08
(1 row)
After a while again:
session 1: <stat query2> -- notice 2 not 1; 'select *' comes back to haunt!!!
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: <stat query1> -- 1 back in action
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
The <stat query1> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query <> '<IDLE>'
and current_query not like '%DONT COUNT ME1 %'
order by duration desc
limit 10;
The <stat query2> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query not like '%DONT COUNT ME1 %'
order by duration desc
limit 10;
Found more bugs than I was looking for, to reproduce!!!
The reporter also made an observation (on 8.2.4) that there were deadlocks detected at around the same time. Looked at WaitOnLock(), and clearly there's a problem, but is it at the same/only place we are suspecting it to be?
Best regards,
PS: Ran the <stat query>ies 1 and 2 again, just before hitting 'send', and the result is the same:
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
Clearly, there's something wrong
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
Here are the steps to reproduce in 'PostgreSQL 8.3beta2 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux)':
session 1: begin;
session 1: update test set a = 112 where a = 112;
session 2: update test set a = 113 where a = 112; --waits
session 1: select * from pg_stat_activity; -- executed this a few times before executing 'select version()' and then following:
session 1: <stat query1> -- see end of mail for this query
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: select * from pg_locks
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+---------------
-----+-------+------------------+---------
transactionid | | | | | | 390 | | | | 2/14 | 12577 | ShareLock | f
transactionid | | | | | | 390 | | | | 1/9 | 11975 | ExclusiveLock | t
<snip irrelevant>
Then,
session 2: ^C
Cancel request sent
ERROR: canceling statement due to user request
session1: <stat query1>
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: select * from pg_locks ;
<no traces of pid 12577>
session 1: select pg_backend_pid();
pg_backend_pid
----------------
11975
The last mentioned output of <stat query> shows session 1 executing 'select * from p_s_a', whereas the <stat query> _is_ being executed in session 1!!! This result is consistently returned for a while, and later...
session 2: select pg_backend_pid();
pg_backend_pid
----------------
12577
session 1: <stat query1>
procpid | current_query | waiting | duration | backend_start
---------+-----------------------+---------+-----------------+-------------------------------
11975 | <IDLE> in transaction | f | 00:06:08.671029 | 2008-02-01 13:30:40.396392-08
(1 row)
After a while again:
session 1: <stat query2> -- notice 2 not 1; 'select *' comes back to haunt!!!
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
session 1: <stat query1> -- 1 back in action
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
The <stat query1> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query <> '<IDLE>'
and current_query not like '%DONT COUNT ME1 %'
order by duration desc
limit 10;
The <stat query2> is:
select
procpid, current_query::varchar(50), waiting, now() - query_start as duration, backend_start
from pg_stat_activity
where current_query not like '%DONT COUNT ME1 %'
order by duration desc
limit 10;
Found more bugs than I was looking for, to reproduce!!!
The reporter also made an observation (on 8.2.4) that there were deadlocks detected at around the same time. Looked at WaitOnLock(), and clearly there's a problem, but is it at the same/only place we are suspecting it to be?
Best regards,
PS: Ran the <stat query>ies 1 and 2 again, just before hitting 'send', and the result is the same:
procpid | current_query | waiting | duration | backend_start
---------+----------------------------------------+---------+------------------+-------------------------------
12577 | update test set a = 113 where a = 112; | t | -00:01:35.782881 | 2008-02-01 13:36:15.31027-08
11975 | select * from pg_stat_activity ; | f | -00:01:52.554697 | 2008-02-01 13:30:40.396392-08
(2 rows)
Clearly, there's something wrong
On Feb 1, 2008 8:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:Hmm, I can reproduce something like this by aborting a wait for lock.
> I saw a strange behaviour on one of the production boxes. The
> pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
> it (understandably, since its IDLE), there are no entries for this pid in
> pg_locks!
It seems the problem is that WaitOnLock() is ignoring its own good
advice, assuming that it can do cleanup work after waiting.
regards, tom lane
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> The situation seems pretty bad!!
I think at least part of your problem is not understanding that a single
transaction sees a frozen snapshot of pg_stat_activity.
regards, tom lane
On Feb 1, 2008 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
It does! I assumed that pg_stat_activity produced the transaction-independent snapshot of internal memory structures! Is that the case with pg_locks too!? I hope not.
BTW, we cannot say that the pg_stat_activity behaves in a consistent manner (transactions-wise). From what I could infer, this view's results are frozen when you first query the view, not when the transaction started (which is how other (normal) relations behave). It's a bit confusing, and should be documented if this is the way it is intended to work; Something along the lines of : "In a transaction, this view will repeatedly show the same results that were returned by it's first invocation in the transaction." in a less confusing way :)
So we are back to the original problem... Canceling a 'waiting' transaction does not revert the session's 'waiting' state back to 'false' (consistently reproducible).
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:> The situation seems pretty bad!!I think at least part of your problem is not understanding that a single
transaction sees a frozen snapshot of pg_stat_activity.
It does! I assumed that pg_stat_activity produced the transaction-independent snapshot of internal memory structures! Is that the case with pg_locks too!? I hope not.
BTW, we cannot say that the pg_stat_activity behaves in a consistent manner (transactions-wise). From what I could infer, this view's results are frozen when you first query the view, not when the transaction started (which is how other (normal) relations behave). It's a bit confusing, and should be documented if this is the way it is intended to work; Something along the lines of : "In a transaction, this view will repeatedly show the same results that were returned by it's first invocation in the transaction." in a less confusing way :)
So we are back to the original problem... Canceling a 'waiting' transaction does not revert the session's 'waiting' state back to 'false' (consistently reproducible).
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
I wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
>> I saw a strange behaviour on one of the production boxes. The
>> pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
>> it (understandably, since its IDLE), there are no entries for this pid in
>> pg_locks!
> Hmm, I can reproduce something like this by aborting a wait for lock.
> It seems the problem is that WaitOnLock() is ignoring its own good
> advice, assuming that it can do cleanup work after waiting.
I've committed a fix for this. (Too late for 8.3.0, unfortunately.)
regards, tom lane
On Feb 2, 2008 2:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thanks. Like 8.2, can it not be back-patched on 8.3 too?
I just looked at the patch... Isn't PG_TRY() an expensive call to make in the lock.c code? I was thinking of registering a Xact callback using RegisterXactCallback() and performing 'waiting' reset in that callback if the Xact event is XACT_EVENT_ABORT.
That would have been compliant with the previous comments ('if we fail, any cleanup must happen in xact abort processing, not here').
Comments.
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
I've committed a fix for this. (Too late for 8.3.0, unfortunately.)I wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
>> I saw a strange behaviour on one of the production boxes. The
>> pg_stat_activity shows a process as <IDLE> and yet 'waiting' !!! On top of
>> it (understandably, since its IDLE), there are no entries for this pid in
>> pg_locks!
> Hmm, I can reproduce something like this by aborting a wait for lock.
> It seems the problem is that WaitOnLock() is ignoring its own good
> advice, assuming that it can do cleanup work after waiting.
Thanks. Like 8.2, can it not be back-patched on 8.3 too?
I just looked at the patch... Isn't PG_TRY() an expensive call to make in the lock.c code? I was thinking of registering a Xact callback using RegisterXactCallback() and performing 'waiting' reset in that callback if the Xact event is XACT_EVENT_ABORT.
That would have been compliant with the previous comments ('if we fail, any cleanup must happen in xact abort processing, not here').
Comments.
Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
Gurjeet Singh escribió: > I just looked at the patch... Isn't PG_TRY() an expensive call to make in > the lock.c code? I was thinking of registering a Xact callback using > RegisterXactCallback() and performing 'waiting' reset in that callback if > the Xact event is XACT_EVENT_ABORT. PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and another stack frame. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Feb 2, 2008 3:27 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote:
Thats why I asked. I assumed that creating stacks was expensive. Isn't this the reason the compilers came up with the function inline capability; to avoid stacks, because they can be expensive. Or am I confusing two different types of stacks!
Moreover, calling a callback, once in a while (only upon XACT abort), may prove to be much cheaper than setting up an additional stack on every lock-acquire call.
Really, my 2 cents.
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
Gurjeet Singh escribió:PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and
> I just looked at the patch... Isn't PG_TRY() an expensive call to make in
> the lock.c code? I was thinking of registering a Xact callback using
> RegisterXactCallback() and performing 'waiting' reset in that callback if
> the Xact event is XACT_EVENT_ABORT.
another stack frame.
Thats why I asked. I assumed that creating stacks was expensive. Isn't this the reason the compilers came up with the function inline capability; to avoid stacks, because they can be expensive. Or am I confusing two different types of stacks!
Moreover, calling a callback, once in a while (only upon XACT abort), may prove to be much cheaper than setting up an additional stack on every lock-acquire call.
Really, my 2 cents.
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com
EnterpriseDB http://www.enterprisedb.com
17° 29' 34.37"N, 78° 30' 59.76"E - Hyderabad
18° 32' 57.25"N, 73° 56' 25.42"E - Pune
37° 47' 19.72"N, 122° 24' 1.69" W - San Francisco *
http://gurjeet.frihost.net
Mail sent from my BlackLaptop device
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Gurjeet Singh escribi�:
>> I just looked at the patch... Isn't PG_TRY() an expensive call to make in
>> the lock.c code? I was thinking of registering a Xact callback using
>> RegisterXactCallback() and performing 'waiting' reset in that callback if
>> the Xact event is XACT_EVENT_ABORT.
> PG_TRY is not expensive as all that -- it's just a sigsetjmp() call and
> another stack frame.
Also, since we're about to block here, shaving microseconds is not all
that important.
The reason I did it that way was to avoid having to export the saved
ps-display string out to someplace LockWaitCancel could find it.
regards, tom lane