Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance

From Claudio Freire
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date
Msg-id CAGTBQpaPnsX3LpCA+jbF72TmMw3mby15OKLUVL5fyrf=Yf=MoQ@mail.gmail.com
In response to Re: Linux kernel impact on PostgreSQL performance  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Tue, Jan 14, 2014 at 12:42 PM, Trond Myklebust <trondmy@gmail.com> wrote:
>> James Bottomley <James.Bottomley@HansenPartnership.com> writes:
>>> The current mechanism for coherency between a userspace cache and the
>>> in-kernel page cache is mmap ... that's the only way you get the same
>>> page in both currently.
>>
>> Right.
>>
>>> glibc used to have an implementation of read/write in terms of mmap, so
>>> it should be possible to insert it into your current implementation
>>> without a major rewrite.  The problem I think this brings you is
>>> uncontrolled writeback: you don't want dirty pages to go to disk until
>>> you issue a write()
>>
>> Exactly.
>>
>>> I think we could fix this with another madvise():
>>> something like MADV_WILLUPDATE telling the page cache we expect to alter
>>> the pages again, so don't be aggressive about cleaning them.
>>
>> "Don't be aggressive" isn't good enough.  The prohibition on early write
>> has to be absolute, because writing a dirty page before we've done
>> whatever else we need to do results in a corrupt database.  It has to
>> be treated like a write barrier.
>
> Then why are you dirtying the page at all? It makes no sense to tell
> the kernel “we’re changing this page in the page cache, but we don’t
> want you to change it on disk”: that’s not consistent with the function
> of a page cache.


PG doesn't currently.

All that dirtying happens in anonymous shared memory, in pg-specific buffers.

The proposal is to use mmap instead of anonymous shared memory as
pg-specific buffers to avoid the extra copy (mmap would share the page
with both kernel and user space). But that would dirty the page when
written to, because now the kernel has the correspondence between that
specific memory region and the file, and that's forbidden for PG's
usage.
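
To make the difference concrete, here is a minimal sketch (not PG code)
contrasting the two buffer strategies. It assumes Linux, where a
MAP_SHARED file mapping and read() go through the same page-cache page;
the helper name demo() and the temporary file are hypothetical and only
there for illustration.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def demo():
    """Return the first byte read() sees after dirtying each kind of buffer."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * PAGE)        # file contents: all 'A'

        # Current approach: anonymous shared memory, unrelated to the file.
        anon = mmap.mmap(-1, PAGE)       # anonymous, shared by default on Unix
        anon[:] = os.pread(fd, PAGE, 0)  # the extra copy: page cache -> buffer
        anon[0:1] = b"B"                 # dirty only the userspace buffer
        after_anon = os.pread(fd, 1, 0)  # the kernel knows nothing about 'B'

        # Proposed approach: a shared mapping of the file itself.
        shared = mmap.mmap(fd, PAGE, mmap.MAP_SHARED,
                           mmap.PROT_READ | mmap.PROT_WRITE)
        shared[0:1] = b"C"               # dirties the page-cache page directly
        after_mmap = os.pread(fd, 1, 0)  # visible without any write()/msync()

        anon.close()
        shared.close()
        return after_anon, after_mmap
    finally:
        os.close(fd)
        os.unlink(path)
```

On Linux, demo() is expected to return (b"A", b"C"): the anonymous
buffer stays invisible to the kernel until an explicit write(), while
the store into the file mapping is already a dirty page-cache page that
writeback may flush whenever it likes.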

I believe the only option here is for the kernel to implement
zero-copy reads. But that implementation is doomed for the performance
reasons I outlined in an earlier mail. So...


