Re: [Patch] Optimize dropping of relation buffers using dlist

From: Konstantin Knizhnik
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date:
Msg-id: 63fad8ea-6e16-2212-63b8-781e874b6cc9@postgrespro.ru
In reply to: Re: [Patch] Optimize dropping of relation buffers using dlist  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: [Patch] Optimize dropping of relation buffers using dlist  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers

On 07.08.2020 00:33, Tomas Vondra wrote:
>
> Unfortunately Konstantin did not share any details about what workloads
> he tested, what config etc. But I find the "no regression" hypothesis
> rather hard to believe, because we're adding non-trivial amount of code
> to a place that can be quite hot.

Sorry that I have not explained my test scenarios.
Since Postgres is a pgbench-oriented database :) I also used pgbench:
a read-only case and one with some updates.
For this patch the most critical factor is the number of buffer allocations,
so I used a small enough database (scale=100), but shared buffers were set
to 1GB.
As a result, all data is cached in memory (in the file system cache), but
there is intensive buffer replacement at the Postgres buffer manager level.
I tested it with both a relatively small (100) and a large (1000)
number of clients.
I repeated these tests on my notebook (quad-core, 16GB RAM, SSD) and on an
IBM Power2 server with about 380 virtual cores and about 1TB of memory.
In the last case the results vary very much (I think because of the NUMA
architecture), but I failed to find any noticeable regression in the
patched version.
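
For reference, the setup was roughly along these lines (the database name,
thread count and run duration below are illustrative, not the exact
commands I used):

    # initialize a scale-100 database (~1.5GB, larger than shared_buffers)
    pgbench -i -s 100 bench

    # postgresql.conf: shared_buffers = 1GB

    # read-only run
    pgbench -S -c 100 -j 16 -T 300 bench

    # run with updates (default TPC-B-like script)
    pgbench -c 100 -j 16 -T 300 bench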


But I have to agree that adding a parallel hash (in addition to the existing
buffer manager hash) is not such a good idea.
This cache really does become a bottleneck quite frequently.
My explanation for why I have not observed any noticeable regression is
that this patch uses almost the same lock partitioning scheme as the
existing one, so it does not add that many new conflicts. Maybe in the case
of the Power2 server the overhead of NUMA is much higher than other factors
(although a shared hash is one of the main things suffering from a NUMA
architecture).
But in principle I agree that having two independent caches may decrease
speed by up to two times (or even more).
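
To make the lock-partitioning point concrete, here is a minimal sketch of
the idea (the names and partition count are illustrative, not the actual
buf_internals.h definitions or the patch itself): both the existing
(relnode, blockno) mapping table and an additional relnode-keyed table can
derive their LWLock from a hash of the key, so each lookup still touches
only one partition lock per table.

    #include <stdint.h>

    /* Minimal sketch of partitioned hash locking, in the spirit of
     * PostgreSQL's buffer-mapping partition locks; illustrative only. */
    #define NUM_PARTITIONS 128

    typedef struct LWLock LWLock;   /* stand-in for the real LWLock */

    static LWLock *mapping_partition_locks[NUM_PARTITIONS]; /* existing table */
    static LWLock *rel_partition_locks[NUM_PARTITIONS];     /* hypothetical relid table */

    /* The lock protecting an entry is chosen from the key's hash value,
     * so different keys mostly contend on different locks. */
    static inline LWLock *
    partition_lock(LWLock **locks, uint32_t hashcode)
    {
        return locks[hashcode % NUM_PARTITIONS];
    }

    /* A lookup in either table acquires exactly one partition lock; with
     * two tables the number of lock acquisitions per buffer allocation
     * roughly doubles, which is the overhead being discussed above. */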

I hope that everybody will agree that this problem is really critical.
It is certainly not the most common case to have hundreds of relations
which are frequently truncated. But having quadratic complexity in the
drop function is not acceptable from my point of view.
And it is not only a recovery-specific problem, which is why a solution
with a local cache is not enough.
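
To spell out where the quadratic behaviour comes from: dropping or
truncating a relation currently scans the entire buffer pool, so truncating
N relations costs N * NBuffers buffer-header checks even if each relation
occupies only a few buffers. Roughly (a simplified sketch of the shape of
DropRelFileNodeBuffers() in bufmgr.c, not the exact code):

    /* Simplified shape of the current full-pool scan; details (header
     * locks, fork numbers, pinned buffers) are omitted. */
    for (int i = 0; i < NBuffers; i++)
    {
        BufferDesc *bufHdr = GetBufferDescriptor(i);

        /* every buffer header is inspected, whether or not it belongs
         * to the relation being truncated */
        if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
            bufHdr->tag.blockNum >= firstDelBlock)
            InvalidateBuffer(bufHdr);
    }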

I do not know a good solution to the problem; just some thoughts:
- We can somehow combine the locking used for the main buffer manager cache
(keyed by relid/blockno) and the cache keyed by relid only. That would
eliminate the double-locking overhead.
- We can use something like a sorted tree (like std::map) instead of a
hash: it would allow locating buffers both by relid/blockno and by relid
alone (see the sketch below).
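
A minimal sketch of the second idea, assuming an ordered structure keyed by
(relid, blockno); the type and function names here are made up for
illustration. With this ordering, all buffers of one relation occupy a
contiguous key range, so dropping a relation becomes a range scan starting
at (relid, 0) rather than a scan of all NBuffers headers, while point
lookups by (relid, blockno) still work.

    #include <stdint.h>

    /* Illustrative key for an ordered index over shared buffers. */
    typedef struct BufTreeKey
    {
        uint32_t relid;     /* relation (RelFileNode) identifier */
        uint32_t blockno;   /* block number within the relation */
    } BufTreeKey;

    /* Order by relid first, then blockno: all blocks of a relation are
     * adjacent, so a search for the first key >= (relid, 0) followed by
     * a walk while key.relid == relid enumerates exactly that relation's
     * buffers, and exact (relid, blockno) lookups still work. */
    static int
    buftree_key_cmp(const BufTreeKey *a, const BufTreeKey *b)
    {
        if (a->relid != b->relid)
            return (a->relid < b->relid) ? -1 : 1;
        if (a->blockno != b->blockno)
            return (a->blockno < b->blockno) ? -1 : 1;
        return 0;
    }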


