Re: WAL prefetch

From Andres Freund
Subject Re: WAL prefetch
Date
Msg-id 20180616194120.x4gsw2np5jhm7xni@alap3.anarazel.de
In response to Re: WAL prefetch  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: WAL prefetch  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2018-06-16 21:34:30 +0200, Tomas Vondra wrote:
> > - it leads to guaranteed double buffering, in a way that's just about
> >   guaranteed to *never* be useful. Because we'd only prefetch whenever
> >   there's an upcoming write, there's simply no benefit in the page
> >   staying in the page cache - we'll write out the whole page back to the
> >   OS.
> 
> How does reading directly into shared buffers substantially change the
> behavior? The only difference is that we end up with the double
> buffering after performing the write. Which is expected to happen pretty
> quick after the read request.

Random reads issued directly in response to a read() request can be
cached differently than pages pulled in by posix_fadvise(WILLNEED) -
and we could trivially force that with another fadvise().  There's
pretty much no other case - so far - where we know as clearly as here
that we won't re-read the page until it is written.
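
To make the distinction concrete, here is a minimal sketch of the two
calls being contrasted, assuming a Linux/POSIX system. It is not
PostgreSQL code: the helper names and BLOCK_SIZE are hypothetical, and
reading "another fadvise()" as POSIX_FADV_DONTNEED is an assumption on
my part rather than something stated above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes posix_fadvise() and pread() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192   /* assumption: mirrors PostgreSQL's default 8kB block */

/*
 * Hypothetical helper: hint that one block will be read soon, so the
 * kernel may start pulling it into the OS page cache. Returns 0 on
 * success or an errno value. There is no completion notification.
 */
static int
prefetch_block(int fd, off_t blockno)
{
    assert(blockno >= 0);
    return posix_fadvise(fd, blockno * BLOCK_SIZE, BLOCK_SIZE,
                         POSIX_FADV_WILLNEED);
}

/*
 * Hypothetical helper: read one block into the caller's buffer, then
 * tell the kernel its cached copy is no longer needed. DONTNEED is only
 * advisory; the kernel may ignore it.
 */
static ssize_t
read_block_no_cache(int fd, off_t blockno, char *buf)
{
    off_t   offset = blockno * BLOCK_SIZE;
    ssize_t nread;

    assert(blockno >= 0);
    nread = pread(fd, buf, BLOCK_SIZE, offset);
    if (nread > 0)
        (void) posix_fadvise(fd, offset, BLOCK_SIZE, POSIX_FADV_DONTNEED);
    return nread;
}
```

With prefetch_block() the page stays in the OS cache after it has been
copied into the caller's buffer, which is the double buffering
discussed in this thread; read_block_no_cache() asks the kernel to drop
that copy once the caller has the data.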


> > - you don't have any sort of completion notification, so you basically
> >   just have to guess how far ahead you want to read. If you read a bit
> >   too much you suddenly get into synchronous blocking land.
> > - The OS page cache is actually not particularly scalable to large
> >   amounts of data either. Nor are the decisions about what to keep
> >   cached likely to be particularly useful.
> 
> The posix_fadvise approach is not perfect, no doubt about that. But it
> works pretty well for bitmap heap scans, and it's about 13249x better
> (rough estimate) than the current solution (no prefetching).

Sure, but investing in an architecture we know might not live long also
has its cost. Especially if it's not that complicated to do better.


> My point was that I don't think this actually adds a significant amount
> of work to the direct IO patch, as we already do prefetch for bitmap
> heap scans. So this needs to be written anyway, and I'd expect those two
> places to share most of the code. So where's the additional work?

I think it's largely separate from what we'd do for bitmap index
scans.

Greetings,

Andres Freund

