Re: Experimental patch for inter-page delay in VACUUM

From Bruce Momjian
Subject Re: Experimental patch for inter-page delay in VACUUM
Date
Msg-id 200311101423.hAAENbv10754@candle.pha.pa.us
In response to Re: Experimental patch for inter-page delay in VACUUM  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: Experimental patch for inter-page delay in VACUUM
Re: Experimental patch for inter-page delay in VACUUM
List pgsql-hackers
Jan Wieck wrote:
> Bruce Momjian wrote:
> > I would be interested to know if you have the background write process
> > writing old dirty buffers to kernel buffers continually if the sync()
> > load is diminished.  What this does is to push more dirty buffers into
> > the kernel cache in hopes the OS will write those buffers on its own
> > before the checkpoint does its write/sync work.  This might allow us to
> > reduce sync() load while preventing the need for O_SYNC/fsync().
> 
> I tried that first. Linux 2.4 does not, as long as you don't tell it by 
> reducing the dirty data block aging time with update(8). So you have to 
> force it to utilize the write bandwidth in the meantime. For that you 
> have to call sync() or fsync() on something.
> 
> Maybe O_SYNC is not as bad an option as it seems. In my patch, the 
> checkpointer flushes the buffers in LRU order, meaning it flushes the 
> least recently used ones first. This has the side effect that buffers 
> returned for replacement (on a cache miss, when the backend needs to 
> read the block) are most likely to be flushed/clean. So it reduces the 
> write load of backends and thus the probability that a backend is ever 
> blocked waiting on an O_SYNC'd write().
> 
> I will add some counters and gather some statistics on how often the 
> backend, in comparison to the checkpointer, calls write().

OK, new idea.  How about if you write() the buffers, mark them as clean
and unlock them, then issue fsync().  The advantage here is that we can
allow the buffer to be reused while we wait for the fsync to complete. 
Obviously, O_SYNC is not going to allow that.  Another idea --- if
fsync() is slow because it can't find the dirty buffers, use write() to
write the buffers, copy the buffer to local memory, mark it as clean,
then open the file with O_SYNC and write it again.  Of course, I am just
throwing out ideas here.  My main concern is that reusing a buffer
should not take too long.
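
A minimal sketch of the first idea above (write() each dirty buffer, mark
it clean and release it, then issue a single fsync()), written in Python
against the POSIX calls only to show the ordering. The Buffer class,
BLOCK_SIZE, and write_then_fsync() are hypothetical names for
illustration and are not PostgreSQL's buffer manager.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed page size for this sketch


class Buffer:
    """Hypothetical stand-in for a shared buffer, not a real PostgreSQL struct."""

    def __init__(self, block_no, data):
        self.block_no = block_no
        self.data = data
        self.dirty = True


def write_then_fsync(fd, buffers):
    """write() every dirty buffer, mark it clean at once, then fsync() once."""
    written = 0
    for buf in buffers:
        if not buf.dirty:
            continue
        # The data goes to the kernel cache only; this does not wait for disk.
        os.pwrite(fd, buf.data, buf.block_no * BLOCK_SIZE)
        buf.dirty = False  # the buffer can be handed out for reuse from here on
        written += 1
    os.fsync(fd)  # the only call that waits for the data to reach the disk
    return written


def demo():
    fd, path = tempfile.mkstemp()
    try:
        buffers = [Buffer(i, bytes([65 + i]) * BLOCK_SIZE) for i in range(4)]
        written = write_then_fsync(fd, buffers)
        size = os.fstat(fd).st_size
        return written, size, [b.dirty for b in buffers]
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Note that in this ordering a buffer marked clean is not yet durable; the
checkpoint cannot be treated as complete until fsync() has returned.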

> > Perhaps sync() is bad partly because the checkpoint runs through all the
> > dirty shared buffers and writes them all to the kernel and then issues
> > sync() almost guaranteeing a flood of writes to the disk.  This method
> > would find fewer dirty buffers in the shared buffer cache, and therefore
> > fewer kernel writes needed by sync().
> 
> I don't understand this? How would what method reduce the number of page 
> buffers the backends modify?

What I was saying is that if we only write() just before a checkpoint,
we never give the kernel a chance to write the buffers on its own.  I
figured if we wrote them earlier, the kernel might write them for us and
sync() wouldn't need to do it.
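
As a rough illustration of that argument, here is a toy model (not a
measurement) of how many dirty buffers are still waiting in shared
memory when the checkpoint arrives, with and without a background
process trickling writes in between. The function name and all rates
are made-up assumptions. The model ignores buffers that get dirtied
again, and it says nothing about whether the kernel really flushes the
early writes on its own.

```python
def checkpoint_backlog(dirtied_per_tick, ticks, trickle_per_tick):
    """Count dirty buffers left for the checkpoint to write() itself.

    Each tick the backends dirty `dirtied_per_tick` buffers and the
    hypothetical background writer write()s up to `trickle_per_tick`.
    """
    dirty = 0
    for _ in range(ticks):
        dirty += dirtied_per_tick
        dirty -= min(dirty, trickle_per_tick)
    return dirty


if __name__ == "__main__":
    # No early writes: everything is left for the checkpoint.
    print(checkpoint_backlog(10, 100, 0))
    # Trickling 8 buffers per tick leaves far fewer for the checkpoint.
    print(checkpoint_backlog(10, 100, 8))
```

With no trickle, every dirtied buffer is written just before sync();
with a trickle close to the dirtying rate, most buffers have already
been handed to the kernel well before the checkpoint starts.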

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

