Re: PERFORMANCE IMPROVEMENT by mapping WAL FILES

From Bruce Momjian
Subject Re: PERFORMANCE IMPROVEMENT by mapping WAL FILES
Date
Msg-id 200110121735.f9CHZPN09243@candle.pha.pa.us
In reply to Re: PERFORMANCE IMPROVEMENT by mapping WAL FILES  (Janardhana Reddy <jana-reddy@mediaring.com.sg>)
List pgsql-hackers
I have added this to TODO.detail/mmap.

> I have just completed functional testing of the WAL using mmap, and it is
> working fine. I tested it by commenting out the "CreateCheckPoint"
> functionality, so that when I kill postgres and restart it, it redoes all
> the records from the WAL log file, which is updated using mmap.
>
> I just need to clean up the code and do some stress testing. By the end
> of this week I should be able to complete the stress test and generate
> the patch file.
>
> As Tom Lane mentioned, I see the problem with portability to all
> platforms. What I propose is to use mmap only for WAL, and only on some
> platforms such as Linux, FreeBSD, etc. For other platforms we can use the
> existing method, slightly modifying the write() routine to write only
> the modified part of the page.
> 
> Regards
> jana
> 
> >
> >
> > OK, I have talked to Tom Lane about this on the phone and we have a few
> > ideas.
> >
> > Historically, we have avoided mmap() because of portability problems,
> > and because using mmap() to write to large tables could consume lots of
> > address space with little benefit.  However, I perhaps can see WAL as
> > being a good use of mmap.
> >
> > First, there is the issue of using mmap().  For OS's that have the
> > mmap() MAP_SHARED flag, different backends could mmap the same file and
> > each see the changes.  However, keep in mind we still have to fsync()
> > WAL, so we need to use msync().
> >
> > So, looking at the benefits of using mmap(), we have overhead of
> > different backends having to mmap something that now sits quite easily
> > in shared memory.  Now, I can see mmap reducing the copy from user to
> > kernel, but there are other ways to fix that.  We could modify the
> > write() routines to write() 8k on first WAL page write and later write
> > only the modified part of the page to the kernel buffers.  The old
> > kernel buffer is probably still around so it is unlikely to require a
> > read from the file system to read in the rest of the page.  This reduces
> > the write from 8k to something probably less than 4k which is better
> > than we can do with mmap.
> >
> > I will add a TODO item to this effect.
> >
> > As far as reducing the write to disk from 8k to 4k, if we have to
> > fsync/msync, we have to wait for the disk to spin to the proper location
> > and at that point writing 4k or 8k doesn't seem like much of a win.
> >
> > In summary, I think it would be nice to reduce the 8k transfer from user
> > to kernel on secondary page writes to only the modified part of the
> > page.  I am uncertain if mmap() or anything else will help the physical
> > write to the disk.
> >
> > --
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 853-3000
> >   +  If your life is a hard drive,     |  830 Blythe Avenue
> >   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org
> 
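
For reference, the MAP_SHARED plus msync() combination discussed in the quoted text might look roughly like the following. This is only a sketch under stated assumptions, not the proposed patch: the function name wal_mmap_append, the constant WAL_SEG_SIZE, and its tiny 16 kB value are made up for illustration, and real WAL code would keep the mapping open rather than remapping on every call.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical segment size, kept small for the demo. */
#define WAL_SEG_SIZE (16 * 1024)

/*
 * Copy one record into a file-backed MAP_SHARED mapping and force it to
 * disk with msync().  Returns 0 on success, -1 on any failure.
 * (wal_mmap_append is an illustrative name, not a PostgreSQL function.)
 */
static int
wal_mmap_append(const char *path, size_t offset, const void *rec, size_t len)
{
    int   fd;
    int   rc;
    char *seg;

    if (offset > WAL_SEG_SIZE || len > WAL_SEG_SIZE - offset)
        return -1;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least as large as the region we map. */
    if (ftruncate(fd, WAL_SEG_SIZE) != 0)
    {
        close(fd);
        return -1;
    }

    /* MAP_SHARED: stores go to the file and are visible to other mappers. */
    seg = mmap(NULL, WAL_SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (seg == MAP_FAILED)
        return -1;

    memcpy(seg + offset, rec, len);

    /*
     * msync() plays the role fsync() plays for write().  The address must
     * be page-aligned, so this sketch simply syncs the whole segment.
     */
    rc = msync(seg, WAL_SEG_SIZE, MS_SYNC);

    munmap(seg, WAL_SEG_SIZE);
    return rc;
}
```

A caller would use it as wal_mmap_append("segment.file", 100, "record", 6). Note that the msync() is still a synchronous wait for the disk, which is the cost pointed out in the quoted text.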

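The write()-side alternative from the quoted text (write the full 8k the first time a WAL page is written, and afterwards only the modified part) could be sketched as below. The WalPage struct, its field names, and wal_page_flush are hypothetical. The sketch also assumes that the "modified part" is just the bytes appended since the previous write, which seems reasonable for an append-only WAL page but is my assumption.

```c
#define _XOPEN_SOURCE 700
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define WAL_PAGE_SIZE 8192

/* Hypothetical bookkeeping for one in-memory WAL page. */
typedef struct
{
    char   data[WAL_PAGE_SIZE];
    off_t  file_off;         /* where this page starts in the WAL file */
    size_t used;             /* bytes of the page filled with records */
    size_t written_upto;     /* bytes already handed to the kernel */
    int    first_write_done; /* has the full 8k been written once? */
} WalPage;

/*
 * Hand the page to the kernel.  The first call writes the whole page;
 * later calls write only the bytes appended since the previous call.
 * Returns the number of bytes passed to pwrite(), or -1 on error.
 * A short write is treated as an error to keep the sketch small.
 */
static ssize_t
wal_page_flush(int fd, WalPage *pg)
{
    size_t  start;
    size_t  len;
    ssize_t n;

    if (!pg->first_write_done)
    {
        start = 0;
        len = WAL_PAGE_SIZE;
    }
    else
    {
        start = pg->written_upto;
        len = pg->used - start;
    }

    if (len == 0)
        return 0;               /* nothing new since the last write */

    n = pwrite(fd, pg->data + start, len, pg->file_off + (off_t) start);
    if (n != (ssize_t) len)
        return -1;

    pg->first_write_done = 1;
    pg->written_upto = pg->used;
    return n;
}
```

This only shrinks the user-to-kernel copy. The caller still has to fsync() the file at commit, so the wait for the disk is unchanged.
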
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

