Re: Improvement of checkpoint IO scheduler for stable transaction responses

From: Amit Kapila
Subject: Re: Improvement of checkpoint IO scheduler for stable transaction responses
Date:
Msg-id 00a201ce82e4$76387950$62a96bf0$@kapila@huawei.com
In reply to: Re: Improvement of checkpoint IO scheduler for stable transaction responses  (Ants Aasma <ants@cybertec.at>)
List: pgsql-hackers
On Tuesday, July 16, 2013 10:16 PM Ants Aasma wrote:
> On Jul 14, 2013 9:46 PM, "Greg Smith" <greg@2ndquadrant.com> wrote:
> > I updated and re-reviewed that in 2011:
> > http://www.postgresql.org/message-id/4D31AE64.3000202@2ndquadrant.com
> > and commented on why I think the improvement was difficult to reproduce
> > back then.  The improvement didn't follow for me either.  It would take
> > a really amazing bit of data to get me to believe write sorting code is
> > worthwhile after that.  On large systems capable of dirtying enough
> > blocks to cause a problem, the operating system and RAID controllers
> > are already sorting block.  And *that* sorting is also considering
> > concurrent read requests, which are a lot more important to an
> > efficient schedule than anything the checkpoint process knows about.
> > The database doesn't have nearly enough information yet to compete
> > against OS level sorting.
> 
> That reasoning makes no sense. OS level sorting can only see the
> writes in the time window between the PostgreSQL write and the data
> being forced to disk. Spread checkpoints sprinkle the writes out over
> a long period, and the general tuning advice is to heavily bound the
> amount of memory the OS is willing to keep dirty. This makes the
> probability of scheduling adjacent writes together quite low, the
> merging window being limited either by dirty_bytes or
> dirty_expire_centisecs. The
> checkpointer has the best long term overview of the situation here, OS
> scheduling only has the short term view of outstanding read and write
> requests. By sorting checkpoint writes it is much more likely that
> adjacent blocks are visible to OS writeback at the same time and will
> be issued together.
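
The merging argument quoted above can be illustrated with a toy model. This is
a minimal sketch, not PostgreSQL or kernel code: it assumes the OS can only
merge writes that fall into the same fixed-size window of consecutive
submissions (a crude stand-in for the dirty_bytes / dirty_expire_centisecs
limit). The names Block, count_merged_requests and window are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of a dirty buffer: (file id, block number).
Block = Tuple[int, int]


def count_merged_requests(writes: List[Block], window: int) -> int:
    """Count I/O requests when only `window` consecutive writes can be merged.

    Assumption: writes are handed to the OS in list order, and the OS can
    merge adjacent blocks only if they sit in the same window.
    """
    requests = 0
    for start in range(0, len(writes), window):
        # Within one window the OS is free to reorder, so sort the batch.
        batch = sorted(writes[start:start + window])
        prev = None
        for blk in batch:
            # A new request starts unless this block directly follows the
            # previous one in the same file.
            if prev is None or blk[0] != prev[0] or blk[1] != prev[1] + 1:
                requests += 1
            prev = blk
    return requests


# Eight dirty blocks of one file, written in an interleaved order.
dirty = [(1, 0), (1, 4), (1, 1), (1, 5), (1, 2), (1, 6), (1, 3), (1, 7)]

unsorted_requests = count_merged_requests(dirty, window=2)
sorted_requests = count_merged_requests(sorted(dirty), window=2)
```

With this input and a window of two writes, the unsorted order yields 8
requests and the sorted order yields 4, because sorting puts adjacent blocks
into the same window. Real devices and schedulers are far more complicated, so
this only shows the direction of the effect, not its size.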

I think Oracle also uses a similar concept to make writes efficient, and
they also hold a patent on this technology, which you can find at the link
below:
http://www.google.com/patents/US7194589?dq=645987&hl=en&sa=X&ei=kn7mUZ-PIsWqrAe99oDgBw&sqi=2&pjf=1&ved=0CEcQ6AEwAw

Although Oracle has a different concept for performing checkpoint writes, I
thought I would share the above link with you so that we do not unknowingly
go down the wrong path.

AFAIK, instead of depending on OS buffers, they use direct I/O, and in fact
in the patent above they use a temporary buffer (Claim 3) to sort the
writes, which, as far as I can understand from reading the above thread, is
not the same idea.

With Regards,
Amit Kapila.