Re: Experimental patch for inter-page delay in VACUUM

From Jan Wieck
Subject Re: Experimental patch for inter-page delay in VACUUM
Date
Msg-id 3FA7D39A.8060502@Yahoo.com
In reply to Re: Experimental patch for inter-page delay in VACUUM  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Experimental patch for inter-page delay in VACUUM
List pgsql-hackers
Tom Lane wrote:

> Jan Wieck <JanWieck@Yahoo.com> writes:
>> Tom Lane wrote:
>>> I have never been happy with the fact that we use sync(2) at all.
> 
>> Sure does it do too much. But together with the other layer of 
>> indirection, the virtual file descriptor pool, what is the exact 
>> guaranteed behaviour of
>>      write(); close(); open(); fsync();
>> cross platform?
> 
> That isn't guaranteed, which is why we have to use sync() at the
> moment.  To go over to fsync or O_SYNC we'd need more control over which
> file descriptors are used to issue writes.  Which is why I was thinking
> about moving the writes to a centralized writer process.
> 
>>> Actually, once you build it this way, you could make all writes
>>> synchronous (open the files O_SYNC) so that there is never any need for
>>> explicit fsync at checkpoint time.
> 
>> Yes, but then the configuration leans more towards "take over the RAM" 
> 
> Why?  The idea is to try to issue writes at a fairly steady rate, which
> strikes me as much better than the current behavior.  I don't see why it
> would force you to have large numbers of buffers available.  You'd want
> a few thousand, no doubt, but that's not a large number.

That is part of the idea. The whole idea is to issue "physical" writes 
at a fairly steady rate without increasing their number substantially 
or interfering too much with the drive's opinion about their order. I 
think O_SYNC for random access can conflict with write reordering.
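
To make the O_SYNC concern concrete, here is a minimal sketch of the two
ways to get a set of pages onto disk. The function names, the file
paths passed in, and SKETCH_PAGE_SIZE are assumptions made up for this
illustration, not anything from the backend. With O_SYNC each write
completes before the next one is issued, so the pages go out in the
order the caller happened to pick. With plain writes followed by one
fsync, the kernel and the drive see the whole batch and are free to
order it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 8192   /* hypothetical page size for this sketch */

/* Variant 1: descriptor opened with O_SYNC. Every pwrite() returns only
 * after that page has been handed to stable storage, so the pages are
 * issued strictly one after another in the order of pagenos[]. */
int
write_pages_osync(const char *path, const long *pagenos, int npages,
                  const char *page)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);

    assert(npages >= 0);
    if (fd < 0)
        return -1;
    for (int i = 0; i < npages; i++) {
        off_t off = (off_t) pagenos[i] * SKETCH_PAGE_SIZE;

        if (pwrite(fd, page, SKETCH_PAGE_SIZE, off) != SKETCH_PAGE_SIZE) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}

/* Variant 2: plain buffered writes, then a single fsync() on the same
 * descriptor. The OS holds the whole batch before anything must hit the
 * platter, which leaves it room to reorder the physical writes. */
int
write_pages_batched(const char *path, const long *pagenos, int npages,
                    const char *page)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);

    assert(npages >= 0);
    if (fd < 0)
        return -1;
    for (int i = 0; i < npages; i++) {
        off_t off = (off_t) pagenos[i] * SKETCH_PAGE_SIZE;

        if (pwrite(fd, page, SKETCH_PAGE_SIZE, off) != SKETCH_PAGE_SIZE) {
            close(fd);
            return -1;
        }
    }
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Both variants leave the same bytes in the file; the difference is only
in how much freedom the layers below have to schedule the I/O.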

The way I see the background writer operating is that it keeps the 
buffers clean in the order of the LRU chain(s), because those are the 
buffers most likely to be replaced soon. In my experimental ARC code 
it would traverse the T1 and T2 queues from LRU to MRU, write out n1 and 
n2 dirty buffers (n1+n2 configurable), then fsync all files that have 
been involved in that, nap for a time that depends on how far down the 
queues it got (to increase the write rate when running low on clean 
buffers), and do it all over again.
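
One pass of that loop could be sketched roughly as below. This is an
in-memory simulation only, not the experimental ARC code: SketchBuffer,
PassResult, bgwriter_pass, MAX_FILES and in particular the nap formula
are assumptions invented for the illustration. Real write() and fsync()
calls are replaced by counters, and each queue is an array with the LRU
end at index 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILES 16            /* hypothetical limit for this sketch */

typedef struct {
    int  file_id;               /* file the page belongs to, 0..MAX_FILES-1 */
    bool dirty;
} SketchBuffer;

typedef struct {
    int writes;                 /* buffers written in this pass */
    int fsyncs;                 /* distinct files synced in this pass */
    int nap_ms;                 /* suggested sleep before the next pass */
} PassResult;

/* Scan one queue from LRU (index 0) towards MRU and "write" at most
 * limit dirty buffers. Returns how many buffers were examined. */
static size_t
clean_queue(SketchBuffer *q, size_t len, int limit, bool *touched,
            int *writes)
{
    size_t i = 0;
    int written = 0;

    for (; i < len && written < limit; i++) {
        if (!q[i].dirty)
            continue;
        assert(q[i].file_id >= 0 && q[i].file_id < MAX_FILES);
        /* a real writer would write() the page here */
        q[i].dirty = false;
        touched[q[i].file_id] = true;
        written++;
    }
    *writes += written;
    return i;
}

/* One pass: clean up to n1 buffers of T1 and n2 of T2, sync every file
 * touched exactly once, then derive a nap time. */
PassResult
bgwriter_pass(SketchBuffer *t1, size_t len1, int n1,
              SketchBuffer *t2, size_t len2, int n2,
              int max_nap_ms)
{
    bool touched[MAX_FILES] = { false };
    PassResult r = { 0, 0, 0 };
    size_t depth1 = clean_queue(t1, len1, n1, touched, &r.writes);
    size_t depth2 = clean_queue(t2, len2, n2, touched, &r.writes);
    size_t total = len1 + len2;

    for (int f = 0; f < MAX_FILES; f++)
        if (touched[f])
            r.fsyncs++;         /* a real writer would fsync() file f here */

    /* Assumed heuristic: if the quota was filled close to the LRU end,
     * clean buffers are scarce, so nap briefly; if the scan had to go
     * deep to find dirty buffers, nap longer. */
    r.nap_ms = total
        ? (int) ((depth1 + depth2) * (size_t) max_nap_ms / total)
        : max_nap_ms;
    return r;
}
```

The point of collecting the touched files first is that each file gets
one fsync per pass no matter how many of its pages were written.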

That way, everyone else doing a write must issue an fsync too, because 
it's not guaranteed that an fsync by one process flushes the writes of 
another. But as you said, if that is a relatively rare operation for a 
regular backend, it won't hurt.
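
For the regular backend that has to evict a dirty buffer on its own, the
rule above boils down to something like the following sketch. The helper
name backend_flush_page and SKETCH_PAGE_SIZE are hypothetical; the only
point is that the fsync goes through the very descriptor that issued the
write, before that descriptor can be closed or recycled.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 8192   /* hypothetical page size for this sketch */

/* Hypothetical helper: a backend flushing one dirty page by itself.
 * Returns 0 on success, -1 if the write was short or the sync failed. */
int
backend_flush_page(int fd, long pageno, const char *page)
{
    off_t off = (off_t) pageno * SKETCH_PAGE_SIZE;

    assert(fd >= 0 && pageno >= 0);
    if (pwrite(fd, page, SKETCH_PAGE_SIZE, off) != SKETCH_PAGE_SIZE)
        return -1;
    /* Sync on the same descriptor that did the write; do not count on
     * some other process's fsync to cover this page. */
    return fsync(fd);
}
```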


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


