Re: [HACKERS] Clock with Adaptive Replacement

From: Юрий Соколов
Subject: Re: [HACKERS] Clock with Adaptive Replacement
Msg-id: CAL-rCA1fB01OuK26DOHrCnysLyCmg+OvfnH=RupvdMva14o-Gg@mail.gmail.com
In reply to: Re: [HACKERS] Clock with Adaptive Replacement  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses: Re: [HACKERS] Clock with Adaptive Replacement  (Andrey Borodin <x4mmm@yandex-team.ru>)
List: pgsql-hackers
On Tue, 24 Apr 2018 at 8:04, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Hi, Thomas!

> On 24 Apr 2018, at 2:41, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Fri, Feb 12, 2016 at 10:02 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Are there any well-known drawbacks of this approach, or would it be
>> interesting to adapt this algorithm to PostgreSQL and measure its impact on
>> performance under different workloads?
>
> I'm not currently planning to work in this area and have done no real
> investigation, so please consider the following to be "water cooler
> talk".

I intend to make some prototypes in this area, but I still haven't found a block of time large enough to produce anything useful.

I think that replacing the current CS5 will:
1. allow the use of big shared buffers
2. make DIRECT_IO a realistic possibility
3. improve BgWriter
4. unify the different buffering strategies into a single buffer manager (there will be no need to place VACUUM into a special buffer ring)
5. finally allow AIO and more efficient prefetching like [0]

Here's what we currently know about the effect of shared buffer size [1] (taken from [2]). I believe the right hill must be big enough to reduce the central pit to zero and make the function monotonic: the OS page cache knows less about data blocks, so it is expected to be less efficient.


I'm not sure CART is the best possibility, though.
I think the right way is to implement many prototypes (LRU, ARC, CAR, CART, and 2Q) and then benchmark them well. Or even make this algorithm pluggable? But currently we have a lot of interdependent parts in the system. I do not even know where to start.
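
To make the "pluggable" idea concrete, here is a minimal sketch in Python rather than in PostgreSQL's C. It assumes every policy exposes a single hypothetical access() method that reports hit or miss. LRUPolicy, ClockSweepPolicy and hit_ratio are illustrative names, not existing PostgreSQL code. The usage cap of 5 mirrors BM_MAX_USAGE_COUNT, but the sweep is a simplification that ignores pinning, locking and dirty pages.

```python
from collections import OrderedDict


class LRUPolicy:
    """Evict the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> None, oldest first

    def access(self, block):
        """Record an access; return True on a hit, False on a miss."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # drop the oldest entry
        self.blocks[block] = None
        return False


class ClockSweepPolicy:
    """Clock sweep with a capped usage counter (simplified, no pinning)."""

    def __init__(self, capacity, max_usage=5):
        self.capacity = capacity
        self.max_usage = max_usage
        self.slots = [None] * capacity  # block held by each buffer slot
        self.usage = [0] * capacity     # usage counter of each slot
        self.where = {}                 # block id -> slot index
        self.hand = 0

    def access(self, block):
        slot = self.where.get(block)
        if slot is not None:
            self.usage[slot] = min(self.usage[slot] + 1, self.max_usage)
            return True
        # Miss: advance the hand, decrementing counters, until a free
        # slot or a slot with usage count zero is found.
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % self.capacity
            if self.slots[slot] is None or self.usage[slot] == 0:
                break
            self.usage[slot] -= 1
        if self.slots[slot] is not None:
            del self.where[self.slots[slot]]
        self.slots[slot] = block
        self.usage[slot] = 1
        self.where[block] = slot
        return False


def hit_ratio(policy, trace):
    """Replay a sequence of block ids through any policy object."""
    hits = sum(policy.access(block) for block in trace)
    return hits / len(trace)
```

With this shape, adding ARC, CAR, CART or 2Q means writing one more class with the same access() method, and the benchmark loop stays unchanged.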


Best regards, Andrey Borodin.


[0] http://diku.dk/forskning/Publikationer/tekniske_rapporter/2004/04-03.pdf
[1] https://4.bp.blogspot.com/-_Zz6X-e9_ok/WlaIvpStBmI/AAAAAAAAAA4/E1NwV-_82-oS5KfmyjoOff_IxUXiO96WwCLcBGAs/s1600/20180110-PTI.png
[2] http://blog.dataegret.com/2018/01/postgresql-performance-meltdown.html

Before implementing algorithms within PostgreSQL, it would be great to test them outside it with real traces.

I think there should be a mechanism to collect traces from real-world PostgreSQL installations: every read and write access. It has to be extremely efficient to be enabled in the real world. Something like a circular buffer in shared memory, and a separate worker to dump it to disk.
Instead of the full block address, a 64-bit hash could be used. Or even 63 bits of hash plus 1 bit to designate read/write access.
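
A rough sketch of that record format and ring, in Python for brevity (a real implementation would be C in shared memory). Everything named here is an assumption for illustration: the (relfilenode, fork, block_no) key is a simplified stand-in for the full block address, BLAKE2b is just a convenient 64-bit hash, and TraceRing ignores the locking that concurrent backends would need.

```python
import hashlib
import struct

MASK63 = (1 << 63) - 1


def encode_access(relfilenode, fork, block_no, is_write):
    """Pack one buffer access into 64 bits: 63-bit hash + 1 read/write bit."""
    # Hypothetical key layout; the real block address has more fields.
    key = struct.pack("<IIQ", relfilenode, fork, block_no)
    digest = hashlib.blake2b(key, digest_size=8).digest()
    h63 = int.from_bytes(digest, "little") & MASK63
    return (h63 << 1) | (1 if is_write else 0)


def decode_access(record):
    """Return (63-bit block hash, is_write) for one trace record."""
    return record >> 1, bool(record & 1)


class TraceRing:
    """Fixed-size ring of records; the oldest are lost if nobody drains it."""

    def __init__(self, size):
        self.size = size
        self.buf = [0] * size
        self.head = 0  # total number of records ever written
        self.tail = 0  # total number of records drained or dropped

    def push(self, record):
        self.buf[self.head % self.size] = record
        self.head += 1
        if self.head - self.tail > self.size:
            # The dumper fell behind: overwrite rather than block the writer.
            self.tail = self.head - self.size

    def drain(self):
        """What the separate worker would call before writing to disk."""
        out = [self.buf[i % self.size] for i in range(self.tail, self.head)]
        self.tail = self.head
        return out
```

Dropping the oldest records on overflow keeps push() cheap for the backend. The cost is gaps in the trace under heavy load, which the dumper would have to record somehow.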

Using these traces, it will be easy to choose a couple of the "theoretically" best algorithms and then attempt to implement them.
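
One thing a recorded trace makes possible is computing the offline optimum (Belady's MIN), which needs to know future accesses and so cannot run inside a live server. It gives a ceiling to judge each candidate against. The sketch below assumes the trace is simply a list of block hashes. opt_hits and opt_curve are hypothetical names, and the linear victim search is chosen for clarity, not speed.

```python
def opt_hits(trace, capacity):
    """Hits achieved by Belady's MIN: evict the block reused farthest ahead."""
    inf = float("inf")
    # next_use[i] is the index of the next access to trace[i], or inf.
    next_use = [inf] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], inf)
        last_seen[trace[i]] = i

    cache = {}  # cached block -> index of its next access
    hits = 0
    for i, block in enumerate(trace):
        if block in cache:
            hits += 1
        elif len(cache) >= capacity:
            # Linear scan for the block whose next use is farthest away.
            victim = max(cache, key=cache.get)
            del cache[victim]
        cache[block] = next_use[i]
    return hits


def opt_curve(trace, capacities):
    """Best achievable hit ratio for each cache size."""
    return {c: opt_hits(trace, c) / len(trace) for c in capacities}
```

An algorithm whose hit ratio on the same trace stays close to opt_curve across cache sizes is a reasonable candidate for a real prototype.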

With regards, 
Yura
