Re: BufferAlloc: don't take two simultaneous locks

Поиск
Список
Период
Сортировка
От Yura Sokolov
Тема Re: BufferAlloc: don't take two simultaneous locks
Дата
Msg-id 75fe5128f2550bf0d198ed222f1abd6f921aec77.camel@postgrespro.ru
обсуждение исходный текст
Ответ на Re: BufferAlloc: don't take two simultaneous locks  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы Re: BufferAlloc: don't take two simultaneous locks
Список pgsql-hackers
В Пн, 14/03/2022 в 14:31 +0900, Kyotaro Horiguchi пишет:
> At Mon, 14 Mar 2022 09:39:48 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > I'll examine the possibility to resolve this...
> 
> The existence of nfree and nalloc made me confused and I found the
> reason.
> 
> In the case where a parittion collects many REUSE-ASSIGN-REMOVEed
> elemetns from other paritiotns, nfree gets larger than nalloced.  This
> is a strange point of the two counters.  nalloced is only referred to
> as (sum(nalloced[])).  So we don't need nalloced per-partition basis
> and the formula to calculate the number of used elements would be as
> follows.
> 
>  sum(nalloced - nfree)
>  = <total_nalloced> - sum(nfree)
> 
> We rarely create fresh elements in shared hashes so I don't think
> there's additional contention on the <total_nalloced> even if it were
> a global atomic.
> 
> So, the remaining issue is the possible imbalancement among
> partitions.  On second thought, by the current way, if there's a bad
> deviation in partition-usage, a heavily hit partition finally collects
> elements via get_hash_entry().  By the patch's way, similar thing
> happens via the REUSE-ASSIGN-REMOVE sequence. But buffers once used
> for something won't be freed until buffer invalidation. But bulk
> buffer invalidation won't deviatedly distribute freed buffers among
> partitions.  So I conclude for now that is a non-issue.
> 
> So my opinion on the counters is:
> 
> I'd like to ask you to remove nalloced from partitions then add a
> global atomic for the same use?

I really believe it should be global. I made it per-partition to
not overcomplicate first versions. Glad you tell it.

I thought to protect it with freeList[0].mutex, but probably atomic
is better idea here. But which atomic to chose: uint64 or uint32?
Based on sizeof(long)?
Ok, I'll do in next version.

Whole get_hash_entry look strange.
Doesn't it better to cycle through partitions and only then go to
get_hash_entry?
May be there should be bitmap for non-empty free lists? 32bit for
32 partitions. But wouldn't bitmap became contention point itself?

> No need to do something for the possible deviation issue.

-------

regards
Yura Sokolov




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: BufferAlloc: don't take two simultaneous locks
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: [PATCH] Accept IP addresses in server certificate SANs