Обсуждение: log_lock_waits to identify transaction's relation
When there is contention between concurrent transactions it shows up as with log_lock_waits as LOG: process %d acquired %s on %s transaction %u after %ld.%03d ms Which is mostly useless for identifying and eliminating the contention in the application since you can't tell which table is causing problems. This only occurs for lock waits on transactions, not on other lock types, such as tuples which are much more informative, for example LOG: process %d acquired %s on tuple(x,y) of relation %u of database %u after %ld.%03d ms Attached patch passes through further information about the lock wait, so we can display the following message instead LOG: process %d acquired %s on transaction %u on relation %u of database %u after %ld.%03d ms This should help identify benefit/loss from the FKlocks feature, since MultiXactIdWait() shows up in the log as contention of this type. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
Simon, * Simon Riggs (simon@2ndQuadrant.com) wrote: > Attached patch passes through further information about the lock wait, > so we can display the following message instead > LOG: process %d acquired %s on transaction %u on relation %u of > database %u after %ld.%03d ms I love this idea. Please take these comments as an initial/quick review because I'd really like to see this get in. A couple quick notes regarding the patch- what does GetXactLockTableRelid really provide..? This patch does use it outside of lmgr.c, where XactLockTableRelid is defined. Second, do we really need to describeXact bool to DescribeLockTag..? Strikes me as unnecessary- if the information is available, include it. Lastly, I really don't like the changes made to XactLockTableWait() and friends. They had a very clearly defined and simple goal previously and the additional parameter seems to muddy things a bit, particularly without any comments about what it's all about or why it's there. I understand that you're trying to pass the necessary information down to where the log is generated, but it doesn't feel quite right. Also, what about VirtualXactLockTableWait()..? Should that have a similar treatment? Thanks! Stephen
On 16 January 2013 03:47, Stephen Frost <sfrost@snowman.net> wrote: > Simon, > > * Simon Riggs (simon@2ndQuadrant.com) wrote: >> Attached patch passes through further information about the lock wait, >> so we can display the following message instead >> LOG: process %d acquired %s on transaction %u on relation %u of >> database %u after %ld.%03d ms > > I love this idea. Please take these comments as an initial/quick review > because I'd really like to see this get in. It's there as a way to prove FKlocks works, or not, and to help that get committed. It's possible we reject this in this CF, since I see it as low priority anyway. > A couple quick notes regarding the patch- what does > GetXactLockTableRelid really provide..? The ability to access a static variable in a different module. It doesn't provide anything other than that, > This patch does use it outside > of lmgr.c, where XactLockTableRelid is defined. Second, do we really > need to describeXact bool to DescribeLockTag..? Strikes me as > unnecessary- if the information is available, include it. The information isn't available, which is why I didn't include it. Comments explain that the additional information is only available within the backend that was waiting. That's sufficient for stats, but not enough for an extended multi-backend deadlock report. > Lastly, I really don't like the changes made to XactLockTableWait() and > friends. They had a very clearly defined and simple goal previously and > the additional parameter seems to muddy things a bit, particularly > without any comments about what it's all about or why it's there. I > understand that you're trying to pass the necessary information down to > where the log is generated, but it doesn't feel quite right. There is an API change to XactLockTableWait() to pass the relid. That part is essential to any solution. There are 2 possible designs for this... (1) Simple - we remember the relid we are waiting on in a backend-only variable so it can be used later when generating stats. (2) Complex - record the additional information within the unused fields of the locktag of a TRANSACTION lock, so they will be stored and accessible by all, within the shared lock table itself. This would require changing the hash function for transaction lock tags so that only the key and not the additional payload items get hashed. That seemed a much more contentious change. > Also, what > about VirtualXactLockTableWait()..? Should that have a similar > treatment? That isn't needed, IMHO. We wait on Xids whenever we see them on heap tuples. Vxids aren't recorded anywhere, so they don't offer the same blockage, which is basically for visibility checks on index creation. That is more obvious. One might argue its needed as well, but I would see it as a separate patch. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
* Simon Riggs (simon@2ndquadrant.com) wrote: > > A couple quick notes regarding the patch- what does > > GetXactLockTableRelid really provide..? > > The ability to access a static variable in a different module. It > doesn't provide anything other than that, It isn't actually necessary for that currently though, is it? All calls to it appeared to be inside the same module. Do you anticipate that needing to change in the future? > The information isn't available, which is why I didn't include it. > Comments explain that the additional information is only available > within the backend that was waiting. That's sufficient for stats, but > not enough for an extended multi-backend deadlock report. Right, sorry, to be clear, it seemed that we could simply detect if the information is available and print it out if it is- and not print it if it isn't. The additional boolean variable looked to be unnecessary, or is there a point where OidIsValid(GetXactLockTableRelid()) will be true but the additional information shouldn't be printed or isn't actually available? > There is an API change to XactLockTableWait() to pass the relid. That > part is essential to any solution. I'm wondering if we could pass in something both more generic and with more information- perhaps a simple string which is constructed when we have more information and is then included in the log message which is generated? Perhaps something which includes both the relname and the OID? Could that be used across more than just these kinds of locks? Of course, on the flip side, I understand that we don't want to spend a lot of time building strings and whatnot during all of these code paths; would be nice if we could defer that until we're about to log. Regardless, I do think it'd be good to update the comments to indicate that the relid is being passed in solely for the purpose of being available to be included in the log, if necessary. > (2) Complex - record the additional information within the unused > fields of the locktag of a TRANSACTION lock, so they will be stored > and accessible by all, within the shared lock table itself. This would > require changing the hash function for transaction lock tags so that > only the key and not the additional payload items get hashed. That > seemed a much more contentious change. As the lock table is itself contended for greatly, I agree that we probably don't want to go changing it. It would be nice if we had a way to associate additional meta-data with why a lock is being attempted though and for logging it. > That isn't needed, IMHO. We wait on Xids whenever we see them on heap > tuples. Vxids aren't recorded anywhere, so they don't offer the same > blockage, which is basically for visibility checks on index creation. > That is more obvious. One might argue its needed as well, but I would > see it as a separate patch. Alright. Thanks, Stephen
On 16 January 2013 16:12, Stephen Frost <sfrost@snowman.net> wrote: > * Simon Riggs (simon@2ndquadrant.com) wrote: >> > A couple quick notes regarding the patch- what does >> > GetXactLockTableRelid really provide..? >> >> The ability to access a static variable in a different module. It >> doesn't provide anything other than that, > > It isn't actually necessary for that currently though, is it? All calls > to it appeared to be inside the same module. Do you anticipate that > needing to change in the future? OK, will change. >> The information isn't available, which is why I didn't include it. >> Comments explain that the additional information is only available >> within the backend that was waiting. That's sufficient for stats, but >> not enough for an extended multi-backend deadlock report. > > Right, sorry, to be clear, it seemed that we could simply detect if the > information is available and print it out if it is- and not print it if > it isn't. The additional boolean variable looked to be unnecessary, or > is there a point where OidIsValid(GetXactLockTableRelid()) will be true > but the additional information shouldn't be printed or isn't actually > available? You might be reporting a deadlock in another backend when you are waiting on a transaction yourself. If we don't address that we'll end up reporting the wrong info in a way that makes it look real and probably never even notice. >> There is an API change to XactLockTableWait() to pass the relid. That >> part is essential to any solution. > > I'm wondering if we could pass in something both more generic and with > more information- perhaps a simple string which is constructed when we > have more information and is then included in the log message which is > generated? Perhaps something which includes both the relname and the > OID? Could that be used across more than just these kinds of locks? Of > course, on the flip side, I understand that we don't want to spend a lot > of time building strings and whatnot during all of these code paths; > would be nice if we could defer that until we're about to log. Keeping it as an Oid is better, since we don't always need the string. > Regardless, I do think it'd be good to update the comments to indicate > that the relid is being passed in solely for the purpose of being > available to be included in the log, if necessary. OK, will add. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services