On 2020/02/05 17:13, Julien Rouhaud wrote:
> On Wed, Feb 05, 2020 at 02:43:49PM +0900, Fujii Masao wrote:
>> Hi,
>>
>> When I compiled PostgreSQL with -DLWLOCK_STATS and tried to check
>> the statistics of light-weight locks, I observed that more than one
>> statistics entries were output *for the same backend process and
>> the same lwlock*. For example, I got the following four statistics
>> when I checked how the process with PID 81141 processed ProcArrayLock.
>> This is strange, and IMO only one statistics should be output for
>> the same backend process and lwlock.
>>
>> $ grep "PID 81141 lwlock ProcArrayLock" data/log/postgresql-2020-02-05_141842.log
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 4000 exacq 0 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 2 exacq 0 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 6001 exacq 1 blk 0 spindelay 0 dequeue self 0
>> PID 81141 lwlock ProcArrayLock 0x111e87780: shacq 5 exacq 1 blk 0 spindelay 0 dequeue self 0
>>
>> The cause of this issue is that the key variable used for lwlock hash
>> table was not fully initialized. The key consists of two fields and
>> they are initialized as follows. But the following 4 bytes allocated
>> for the alignment was not initialized. So even if the same key was
>> specified, hash_search(HASH_ENTER) could not find the existing
>> entry for that key and created new one.
>>
>> key.tranche = lock->tranche;
>> key.instance = lock;
>>
>> Attached is the patch fixing this issue by initializing the key
>> variable with zero. In the patched version, I confirmed that only one
>> statistics is output for the same process and the same lwlock.
>> Also this patch would reduce the volume of lwlock statistics
>> very much.
>>
>> This issue was introduced by commit 3761fe3c20. So the patch needs
>> to be back-patch to v10.
Pushed.
> Good catch! The patch looks good to me. Just in case I looked at other users
> of HASH_BLOBS and AFAICT there's no other cases of key that can contain padding
> bytes that aren't memset first.
Thanks Julien and Horiguchi-san for reviewing the patch
and checking other cases!
Regards,
--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters