Re: Controlling Load Distributed Checkpoints

From: ITAGAKI Takahiro
Subject: Re: Controlling Load Distributed Checkpoints
Msg-id: 20070611141111.8B5D.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
In reply to: Re: Controlling Load Distributed Checkpoints  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses: Re: Controlling Load Distributed Checkpoints  (Greg Smith <gsmith@gregsmith.com>)
           Re: Controlling Load Distributed Checkpoints  (Heikki Linnakangas <heikki@enterprisedb.com>)
List: pgsql-hackers

Heikki Linnakangas <heikki@enterprisedb.com> wrote:

> True. On the other hand, if we issue writes in essentially random order, 
> we might fill the kernel buffers with random blocks and the kernel needs 
> to flush them to disk as almost random I/O. If we did the writes in 
> groups, the kernel has better chance at coalescing them.

If the kernel handles sequential writes better than random writes,
is it worth sorting the dirty buffers in block order per file at the
start of a checkpoint? Here is the pseudocode:
  buffers_to_be_written =
      SELECT buf_id, tag FROM BufferDescriptors
        WHERE (flags & BM_DIRTY) != 0 ORDER BY tag.rnode, tag.blockNum;
  for { buf_id, tag } in buffers_to_be_written:
      if BufferDescriptors[buf_id].tag == tag:
          FlushBuffer(&BufferDescriptors[buf_id])
 

With this method we can also avoid writing buffers that were newly
dirtied after the checkpoint started.
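
To make the idea concrete, here is a rough, runnable sketch of the same
sort-then-recheck loop in Python. It is only a toy model, not PostgreSQL
code: BufferTag, BufferDesc, checkpoint_sorted and the BM_DIRTY value are
hypothetical stand-ins, and the flush callback is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, NamedTuple

BM_DIRTY = 0x01  # hypothetical flag bit standing in for the real BM_DIRTY


class BufferTag(NamedTuple):
    rnode: int       # simplified file identifier (an assumption of this sketch)
    block_num: int   # block number within that file


@dataclass
class BufferDesc:
    tag: BufferTag
    flags: int = 0


def checkpoint_sorted(buffers: List[BufferDesc],
                      flush: Callable[[BufferDesc], None]) -> int:
    """Flush buffers that are dirty right now, in (file, block) order.

    Returns the number of buffers handed to `flush`.
    """
    # Snapshot (buf_id, tag) of every buffer that is dirty at checkpoint start.
    snapshot = [(buf_id, buf.tag)
                for buf_id, buf in enumerate(buffers)
                if buf.flags & BM_DIRTY]
    # Order by file first, then by block number within the file.
    snapshot.sort(key=lambda item: item[1])

    written = 0
    for buf_id, tag in snapshot:
        buf = buffers[buf_id]
        # Skip slots that were reused for a different page since the snapshot.
        if buf.tag != tag:
            continue
        flush(buf)
        written += 1
    return written


if __name__ == "__main__":
    pool = [BufferDesc(BufferTag(2, 7), BM_DIRTY),
            BufferDesc(BufferTag(1, 9), BM_DIRTY),
            BufferDesc(BufferTag(1, 3)),            # clean at checkpoint start
            BufferDesc(BufferTag(1, 4), BM_DIRTY)]
    checkpoint_sorted(pool, lambda b: print("flush", b.tag))
```

A buffer that becomes dirty after the snapshot is taken is not in the
sorted list, so this checkpoint does not write it; a later checkpoint
would pick it up.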


> I tend to agree that if the goal is to finish the checkpoint as quickly 
> as possible, the current approach is better. In the context of load 
> distributed checkpoints, however, it's unlikely the kernel can do any 
> significant overlapping since we're trickling the writes anyway.

Some kernels or storage subsystems treat all I/Os too fairly, so user
transactions waiting for reads are blocked by checkpoint writes. That
behavior is unavoidable, but we can split the writes into small batches.
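
As a rough illustration of splitting the writes into small batches, the
sketch below issues a few writes, pauses, and repeats, so that reads get a
chance to run between batches. The names write_in_batches, batch_size and
pause_sec are hypothetical and are not parameters of the patch.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def write_in_batches(items: Sequence[T],
                     write: Callable[[T], None],
                     batch_size: int,
                     pause_sec: float,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Write items in groups of batch_size, pausing between groups.

    Returns the number of batches issued. `sleep` is injectable so the
    pacing can be observed without actually waiting.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    batches = 0
    for start in range(0, len(items), batch_size):
        if start > 0:
            # Pause between batches (not after the last one) so that
            # other I/O can be serviced in the gap.
            sleep(pause_sec)
        for item in items[start:start + batch_size]:
            write(item)
        batches += 1
    return batches


if __name__ == "__main__":
    blocks = list(range(10))
    n = write_in_batches(blocks, lambda b: print("write block", b),
                         batch_size=4, pause_sec=0.05)
    print("batches:", n)
```

Smaller batches with longer pauses are gentler on concurrent reads but
stretch the checkpoint out; the right values depend on the storage.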


> I'm starting to feel we should give up on smoothing the fsyncs and 
> distribute the writes only, for 8.3. As we get more experience with that 
> and its shortcomings, we can enhance our checkpoints further in 8.4.

I agree with distributing only the writes for 8.3. The new parameters
it introduces (checkpoint_write_percent and checkpoint_write_min_rate)
should survive without major changes in the future, but the other
parameters seem to be volatile.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



