Re: Why we are going to have to go DirectIO

From Jim Nasby
Subject Re: Why we are going to have to go DirectIO
Date
Msg-id 52A4E0F5.1090008@nasby.net
In reply to Re: Why we are going to have to go DirectIO  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 12/5/13 9:59 AM, Tom Lane wrote:
> Greg Stark <stark@mit.edu> writes:
>> I think the way to use mmap would be to mmap very large chunks,
>> possibly whole tables. We would need some way to control page flushes
>> that doesn't involve splitting mappings and can be efficiently
>> controlled without having the kernel storing arbitrarily large tags on
>> page tables or searching through all the page tables to mark pages
>> flushable.
>
> I might be missing something, but AFAICS mmap's API is just fundamentally
> wrong for this.  The kernel is allowed to write-back a modified mmap'd
> page to the underlying file at any time, and will do so if say it's under
> memory pressure.  You can tell the kernel to sync now, but you can't tell
> it *not* to sync.  I suppose you are thinking that some wart could be
> grafted onto that API to reverse that, but I wouldn't have a lot of
> confidence in it.  Any VM bug that caused the kernel to sometimes write
> too soon would result in nigh unfindable data consistency hazards.
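
As an illustration of the asymmetry described in the quoted text, here is a minimal sketch using Python's standard `mmap` module. The helper names `modify_and_sync` and `demo` are made up for this example, and it assumes a POSIX-like system where `mmap.flush()` maps onto `msync()`. It shows that the API offers a "write back now" call; there is no counterpart for holding a dirty page back.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def modify_and_sync(path: str, payload: bytes) -> bytes:
    """Overwrite the start of a file through a shared mapping, force
    write-back, and return what a plain read() of the file then sees."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), PAGE) as m:  # shared, writable mapping
            # The page is dirty from here on; the kernel is free to write
            # it back to the file at any time, e.g. under memory pressure.
            m[:len(payload)] = payload
            # The only control available is "write back now" (msync).
            # Nothing in this API says "do not write this page yet".
            m.flush()
    with open(path, "rb") as f:
        return f.read(len(payload))


def demo() -> bytes:
    """Create a one-page scratch file, run the update, clean up."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * PAGE)
        os.close(fd)
        return modify_and_sync(path, b"new tuple")
    finally:
        os.remove(path)
```

A database that must get WAL to disk before the data page it describes would need the missing "not yet" control, which is the gap pointed out above.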

Something else to ponder: a Seagate researcher gave a talk on upcoming hard drive technology at RICON East this
spring. The interesting bit is that one or two generations down the road, HDs will start using "shingling": the write
head has to be bigger than the read head, so they're going to set it up so that you cannot modify a range of tracks
after they've been written. They'll do this by keeping a journal inside the HD. This is somewhat similar to how SSDs
work too (you can only erase large pages of data; you can't update individual bytes/sectors/filesystem blocks).
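
To make the cost of that scheme concrete, here is a toy model, not a description of any real drive's firmware. The class `ShingledZone`, the function `write_amplification`, and all the sizes are invented for illustration. It assumes updates go to a small journal first and that a full journal forces the whole zone to be rewritten sequentially.

```python
class ShingledZone:
    """Toy model of a zone whose blocks can only be rewritten as a whole."""

    def __init__(self, zone_blocks: int, journal_slots: int):
        self.blocks = [b""] * zone_blocks
        self.journal = {}             # block number -> pending data
        self.journal_slots = journal_slots
        self.physical_writes = 0      # blocks actually written to the medium

    def write(self, block_no: int, data: bytes) -> None:
        # In-place update is not allowed, so the new data goes to the journal.
        self.journal[block_no] = data
        self.physical_writes += 1
        if len(self.journal) >= self.journal_slots:
            self._rewrite_zone()

    def _rewrite_zone(self) -> None:
        # Merge the journal, then count a sequential rewrite of every block.
        for block_no, data in self.journal.items():
            self.blocks[block_no] = data
        self.physical_writes += len(self.blocks)
        self.journal.clear()

    def read(self, block_no: int) -> bytes:
        # The journal holds the newest copy, if there is one.
        return self.journal.get(block_no, self.blocks[block_no])


def write_amplification(zone_blocks: int = 256, journal_slots: int = 16,
                        updates: int = 64) -> float:
    """Physical block writes per logical write for scattered updates."""
    zone = ShingledZone(zone_blocks, journal_slots)
    for i in range(updates):
        zone.write((i * 37) % zone_blocks, b"x")  # scattered block numbers
    return zone.physical_writes / updates
```

With these made-up numbers, each logical block update costs many physical block writes once zone rewrites are counted, which is the sense in which random updates get more expensive.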
 

So long-term, random access updates to permanent storage will be less efficient than today. (Of course, non-volatile
memory could turn all this on its head.)
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


