On Mon, Jun 11, 2012 at 9:30 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>> Yes, that means the list has overflowed. Once it has overflowed, it
>> is invalid for the remainder of the life of the resource owner.
> Don't we need any logic to clear the reference to locallock in the
> owner->locks array?
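I don't think so. C doesn't reference-count its pointers, so a stale
entry left in owner->locks is harmless: once the list has overflowed,
nothing should consult the array again for the rest of that resource
owner's life.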
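To make the overflow behaviour concrete, here is a minimal sketch of the idea. It is not the patch itself: DemoLock, DemoOwner and the demo_* functions are made-up stand-ins for LOCALLOCK, ResourceOwner and the real remember/forget routines, and the convention that a count above MAX_RESOWNER_LOCKS means "overflowed" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RESOWNER_LOCKS 10

/* Hypothetical stand-ins for LOCALLOCK and ResourceOwner. */
typedef struct DemoLock
{
    int         id;
} DemoLock;

typedef struct DemoOwner
{
    int         nlocks;     /* MAX_RESOWNER_LOCKS + 1 means "overflowed" */
    DemoLock   *locks[MAX_RESOWNER_LOCKS];
} DemoOwner;

static bool
demo_overflowed(const DemoOwner *owner)
{
    return owner->nlocks > MAX_RESOWNER_LOCKS;
}

/* Record a lock in the small array, or mark the owner as overflowed. */
static void
demo_remember_lock(DemoOwner *owner, DemoLock *lock)
{
    assert(owner != NULL && lock != NULL);

    if (demo_overflowed(owner))
        return;             /* array is never written again */
    if (owner->nlocks < MAX_RESOWNER_LOCKS)
        owner->locks[owner->nlocks] = lock;
    owner->nlocks++;        /* reaching MAX + 1 is the overflow marker */
}

/*
 * Drop a lock from the array. Returns false only in the "can't happen"
 * case where a non-overflowed owner does not hold the lock.
 */
static bool
demo_forget_lock(DemoOwner *owner, DemoLock *lock)
{
    int         i;

    if (demo_overflowed(owner))
        return true;        /* stale pointers stay put and are never read */

    for (i = owner->nlocks - 1; i >= 0; i--)
    {
        if (owner->locks[i] == lock)
        {
            /* Fill the hole with the last entry; order does not matter. */
            owner->locks[i] = owner->locks[owner->nlocks - 1];
            owner->nlocks--;
            return true;
        }
    }
    return false;
}
```

Nothing in the sketch ever clears an array slot: a slot is either below nlocks and valid, or it is ignored.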
> MAX_RESOWNER_LOCKS - How did you arrive at the number 10 for it? Is there
> any specific reason for 10?
I instrumented the code to record the maximum number of locks held by
a resource owner, and report the max when it was destroyed. (That
code is not in this patch). During a large pg_dump, the vast majority
of the resource owners had maximum locks of 2, with some more at 4
and 6. Then there was one resource owner, for the top-level
transaction, at tens or hundreds of thousands (basically one for every
lockable object). There was little between 6 and this top-level
number, so I thought 10 was a good compromise, safely above 6 but not
so large that searching through the list itself was likely to bog
down.
Also, Tom independently suggested the same number.
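For illustration, instrumentation of that kind could look roughly like the sketch below. It is not the code used for the measurements above: LockStats, max_hist and the stats_* functions are invented names, and the histogram bucketing is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>

#define HIST_BUCKETS 16         /* last bucket collects every max >= 15 */

/* Hypothetical per-resource-owner counters. */
typedef struct LockStats
{
    int         cur;            /* locks currently held */
    int         max;            /* high-water mark over the owner's life */
} LockStats;

/* How many destroyed owners peaked at each lock count. */
static long max_hist[HIST_BUCKETS];

static void
stats_lock_acquired(LockStats *s)
{
    if (++s->cur > s->max)
        s->max = s->cur;
}

static void
stats_lock_released(LockStats *s)
{
    if (s->cur > 0)
        s->cur--;
}

/* Called when the owner is destroyed: file its peak into the histogram. */
static void
stats_owner_destroyed(const LockStats *s)
{
    int         bucket = s->max < HIST_BUCKETS - 1 ? s->max : HIST_BUCKETS - 1;

    max_hist[bucket]++;
}

static void
stats_report(FILE *out)
{
    int         i;

    for (i = 0; i < HIST_BUCKETS; i++)
    {
        if (max_hist[i] > 0)
            fprintf(out, "max locks %s%d: %ld owners\n",
                    i == HIST_BUCKETS - 1 ? ">= " : "", i, max_hist[i]);
    }
}
```

With a distribution like the one described, almost every owner would land in the low buckets and the top-level transaction alone in the last one.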
>> Should it emit a FATAL rather than an ERROR? I thought ERROR was
>> sufficient to make the backend quit, as it is not clear how it could
>> meaningfully recover.
>
> I am not able to visualize any valid scenario in which it can happen unless
> some corruption happens.
> If this happens, the user can close all statements and abort their
> transactions.
> In my opinion ERROR is okay. However, the message "Can't find lock to
> remove" would be better if it included information about the resource
> owner and the lock.
I think we might end up changing that entirely once someone more
familiar with the error handling mechanisms takes a look at it. I
don't think that lock tags have good human-readable formats, and just
a pointer dump probably wouldn't be much use when something that can
never happen has happened. But I'll at least add a reference to the
resource owner if this stays in.
Thanks,
Jeff