Re: Speedup truncations of temporary relation forks

From Fujii Masao
Subject Re: Speedup truncations of temporary relation forks
Msg-id 6c57ec58-f317-4f76-a45d-8e62d042595c@oss.nttdata.com
In response to Speedup truncations of temporary relation forks  (Daniil Davydov <3danissimo@gmail.com>)
Responses Re: Speedup truncations of temporary relation forks
List pgsql-hackers

On 2025/06/02 18:06, Yura Sokolov wrote:
> On 31.05.2025 17:23, Daniil Davydov wrote:
>> Hi,
>>
>> On Sat, May 31, 2025 at 7:41 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>
>>> Here are a few review comments on the patch:
>>>
>>> +               for (j = 0; j < nforks; j++)
>>>                  {
>>> -                       InvalidateLocalBuffer(bufHdr, true);
>>> +                       if ((buf_state & BM_TAG_VALID) &&
>>> +                               BufTagGetForkNum(&bufHdr->tag) == forkNum[j] &&
>>> +                               bufHdr->tag.blockNum >= firstDelBlock[j])
>>> +                       {
>>> +                               InvalidateLocalBuffer(bufHdr, true);
>>> +                       }
>>>
>>> It looks like the "buf_state & BM_TAG_VALID" check can be moved
>>> outside the loop, along with the BufTagMatchesRelFileLocator() check.
>>> That would avoid unnecessary looping.
>>>
>>> Also, should we add a "break" right after calling InvalidateLocalBuffer()?
>>> Since the buffer has already been invalidated, continuing the loop
>>> may not be necessary.
>>
>> Thanks for the review! I'll fix both remarks. Please see the v2 patch.
> 
> Excuse me for disturbing...
> Wouldn't it be more efficient to change the search data structure for local
> buffers?
> Instead of a hash table keyed by RelFileLocator+forknum+BlockNumber, it could
> be a hash table keyed by RelFileLocator+forknum with a nested data structure
> for BlockNumber (a hash table or radix tree). Then there would be no need to
> iterate over all local buffers for each relation.
> 
> Since local buffers are not subject to concurrent access, both hash tables
> could be implemented using simplehash. That would compensate for the
> two-stage lookup, given that dynahash is much slower than simplehash.

I'm not sure how much this approach would improve performance, but it might
be worth trying. If it proves effective, it would also make sense to
apply it to shared buffers, since they are typically larger and take longer
to scan than local buffers.
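
To make the two-stage lookup idea concrete, here is a minimal standalone
sketch. It is not PostgreSQL code and does not use simplehash: all names
(SketchIndex, sketch_insert, sketch_truncate, SKETCH_NSLOTS) are hypothetical,
the outer table is a small fixed-size open-addressing table, and the inner
per-fork structure is a plain linked list standing in for the hash table or
radix tree mentioned above. The point is only that truncating one fork
touches that fork's blocks rather than every local buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NSLOTS 64		/* hypothetical outer table size, power of two */

/* Inner level: one node per cached block of a given (rel, fork). */
typedef struct SketchBlock
{
	uint32_t	block;			/* models BlockNumber */
	int			buf_id;			/* models the local buffer id */
	struct SketchBlock *next;
} SketchBlock;

/* Outer level: one slot per (rel, fork); 'rel' models RelFileLocator. */
typedef struct SketchForkEntry
{
	int			used;
	uint32_t	rel;
	int			fork;
	SketchBlock *blocks;
} SketchForkEntry;

typedef struct SketchIndex
{
	SketchForkEntry slots[SKETCH_NSLOTS];
} SketchIndex;

/* Linear probing; slots are never removed, so no tombstones are needed. */
static SketchForkEntry *
sketch_find(SketchIndex *idx, uint32_t rel, int fork, int create)
{
	uint32_t	h = (rel * 2654435761u) ^ (uint32_t) fork;

	assert(idx != NULL);
	for (uint32_t probe = 0; probe < SKETCH_NSLOTS; probe++)
	{
		SketchForkEntry *e = &idx->slots[(h + probe) & (SKETCH_NSLOTS - 1)];

		if (e->used && e->rel == rel && e->fork == fork)
			return e;
		if (!e->used)
		{
			if (!create)
				return NULL;
			e->used = 1;
			e->rel = rel;
			e->fork = fork;
			e->blocks = NULL;
			return e;
		}
	}
	return NULL;				/* table full */
}

/* Register a cached block; returns 1 on success, 0 on failure. */
static int
sketch_insert(SketchIndex *idx, uint32_t rel, int fork,
			  uint32_t block, int buf_id)
{
	SketchForkEntry *e = sketch_find(idx, rel, fork, 1);
	SketchBlock *b;

	if (e == NULL || (b = malloc(sizeof(SketchBlock))) == NULL)
		return 0;
	b->block = block;
	b->buf_id = buf_id;
	b->next = e->blocks;
	e->blocks = b;
	return 1;
}

/* Drop blocks >= first_del_block of one fork; returns how many were dropped. */
static int
sketch_truncate(SketchIndex *idx, uint32_t rel, int fork,
				uint32_t first_del_block)
{
	SketchForkEntry *e = sketch_find(idx, rel, fork, 0);
	int			ndropped = 0;

	if (e == NULL)
		return 0;
	for (SketchBlock **link = &e->blocks; *link != NULL;)
	{
		SketchBlock *b = *link;

		if (b->block >= first_del_block)
		{
			*link = b->next;	/* unlink; a real version would invalidate buf_id */
			free(b);
			ndropped++;
		}
		else
			link = &b->next;
	}
	return ndropped;
}
```

Whether the extra lookup level pays off in practice would still need to be
measured; the sketch says nothing about that.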

Regardless, I think we should go ahead and apply the current patch.
If your approach shows a noticeable performance gain, we can consider
adding it as a follow-up.
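
For reference, the loop shape resulting from the two review comments quoted
above (fork-independent checks hoisted out of the fork loop, and a "break"
after invalidation) can be modeled roughly as follows. This is a simplified
standalone sketch, not the actual patch: SketchBuf and
sketch_drop_rel_buffers are hypothetical stand-ins, and clearing tag_valid
stands in for InvalidateLocalBuffer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a local buffer header; not the real struct. */
typedef struct SketchBuf
{
	bool		tag_valid;		/* models (buf_state & BM_TAG_VALID) */
	uint32_t	rel;			/* models the RelFileLocator in the tag */
	int			fork;			/* models BufTagGetForkNum(&bufHdr->tag) */
	uint32_t	block;			/* models bufHdr->tag.blockNum */
} SketchBuf;

/*
 * Invalidate every buffer of 'rel' whose fork is forks[j] and whose block
 * number is >= first_del_block[j]. Returns the number invalidated.
 */
static int
sketch_drop_rel_buffers(SketchBuf *bufs, int nbufs, uint32_t rel,
						const int *forks, int nforks,
						const uint32_t *first_del_block)
{
	int			ninvalidated = 0;

	assert(nforks >= 0);
	for (int i = 0; i < nbufs; i++)
	{
		SketchBuf  *buf = &bufs[i];

		/* Checks that do not depend on the fork run once per buffer. */
		if (!buf->tag_valid || buf->rel != rel)
			continue;

		for (int j = 0; j < nforks; j++)
		{
			if (buf->fork == forks[j] && buf->block >= first_del_block[j])
			{
				buf->tag_valid = false;	/* stands in for InvalidateLocalBuffer() */
				ninvalidated++;
				break;			/* already invalidated; skip remaining forks */
			}
		}
	}
	return ninvalidated;
}
```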

Regards,

-- 
Fujii Masao
NTT DATA Japan Corporation



