Re: Improvement of checkpoint IO scheduler for stable transaction responses

From	Jeff Janes
Subject	Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date	July 15, 2013, 01:08:20
Msg-id	CAMkU=1zi=+mqpcBjzeHcho8bsBENA6Fr+pujNti+js+RqwptNw@mail.gmail.com
In reply to	Re: Improvement of checkpoint IO scheduler for stable transaction responses (Greg Smith <greg@2ndQuadrant.com>)
List	pgsql-hackers

On Sunday, July 14, 2013, Greg Smith wrote:

> On 6/27/13 11:08 AM, Robert Haas wrote:
>> I'm pretty sure Greg Smith tried it the fixed-sleep thing before and
>> it didn't work that well.
>
> That's correct, I spent about a year whipping that particular horse and submitted improvements on it to the community. http://www.postgresql.org/message-id/4D4F9A3D.5070700@2ndquadrant.com and its updates downthread are good ones to compare this current work against.
>
> The important thing to realize about just delaying fsync calls is that it *cannot* increase TPS throughput. Not possible in theory, obviously doesn't happen in practice. The most efficient way to write things out is to delay those writes as long as possible. The longer you postpone a write, the more elevator sorting and write combining you get out of the OS. This is why operating systems like Linux come tuned for such delayed writes in the first place. Throughput and latency are linked; any patch that aims to decrease latency will probably slow throughput.

Do common low-level IO benchmarking tools cover this territory? I've looked at Bonnie, which seems to be the most famous one, and it doesn't look like it covers the effectiveness of write combining at all.

I've done my own casual benchmarking, and the results were astonishingly bad for the OS/FS. If I over-wrote 1024*1024 blocks of 8KB in random order and then fsynced the 8GB of data (divided into 8x1GB files, in deference to PG segment size), it took way longer than if I did the overwrite in block order and then fsynced that. This was a gift-horse machine not spec'd out to be a database server, but the Linux kernel is still the kernel regardless of the hardware it sits on, so I don't know how much that should matter. To be clear, the writes did not take longer; it was the fsyncs that took longer. All writes were successfully absorbed into memory promptly. Alas, I no longer have access to a machine which can absorb 8GB of writes into RAM without thinking twice and which I can use for casual experimentation.
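
For anyone who wants to repeat this kind of test, here is a minimal sketch in Python. It is not the program used for the numbers above, which was not posted; the function names, file names, and default sizes are all made up for illustration. It overwrites every 8KB block of a set of pre-created files once, either in block order or in random order, and times the write phase separately from the fsync phase. It relies on os.pwrite, so it assumes a Unix-like system.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 8192  # 8KB, matching the block size described above


def prepare_files(directory, n_files, blocks_per_file):
    """Create the files at full size so the test overwrites existing blocks."""
    paths = []
    zeros = bytes(BLOCK_SIZE)
    for i in range(n_files):
        path = os.path.join(directory, "seg%d" % i)  # hypothetical file name
        with open(path, "wb") as f:
            for _ in range(blocks_per_file):
                f.write(zeros)
            f.flush()
            os.fsync(f.fileno())
        paths.append(path)
    return paths


def overwrite_and_fsync(paths, blocks_per_file, shuffle, seed=0):
    """Overwrite every block once, then fsync each file.

    Returns (write_seconds, fsync_seconds, blocks_written).
    """
    order = [(f, b) for f in range(len(paths)) for b in range(blocks_per_file)]
    if shuffle:
        random.Random(seed).shuffle(order)
    payload = b"x" * BLOCK_SIZE
    fds = [os.open(p, os.O_WRONLY) for p in paths]
    try:
        t0 = time.perf_counter()
        for f, b in order:
            os.pwrite(fds[f], payload, b * BLOCK_SIZE)
        t1 = time.perf_counter()
        # The writes above only need to reach the OS cache; the fsyncs
        # are what force the data out to storage.
        for fd in fds:
            os.fsync(fd)
        t2 = time.perf_counter()
    finally:
        for fd in fds:
            os.close(fd)
    return t1 - t0, t2 - t1, len(order)


def run(n_files=8, blocks_per_file=1024, directory=None):
    """Time a block-order pass and a random-order pass over the same files."""
    with tempfile.TemporaryDirectory(dir=directory) as d:
        paths = prepare_files(d, n_files, blocks_per_file)
        results = {}
        for label, shuffle in (("sequential", False), ("random", True)):
            results[label] = overwrite_and_fsync(paths, blocks_per_file, shuffle)
        return results


if __name__ == "__main__":
    # Small default (8 files x 8MB). The test described above would
    # correspond to blocks_per_file=131072, i.e. 8 files x 1GB.
    for label, (w, s, n) in run().items():
        print("%-10s %d blocks: write %.2fs, fsync %.2fs" % (label, n, w, s))
```

With the small default sizes this mostly exercises the code path. The effect described above would only be expected to show up when the data set is large but still fits in the OS cache, and `directory` points at the file system under test rather than at a RAM-backed temporary directory.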

Cheers,

Jeff
