Re: finding changed blocks using WAL scanning

From Andres Freund
Subject Re: finding changed blocks using WAL scanning
Date
Msg-id 20190423170939.gqehdh6ncu5bw4nq@alap3.anarazel.de
In reply to Re: finding changed blocks using WAL scanning  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: finding changed blocks using WAL scanning  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2019-04-23 19:01:29 +0200, Tomas Vondra wrote:
> On Tue, Apr 23, 2019 at 09:34:54AM -0700, Andres Freund wrote:
> > Hi,
> > 
> > On 2019-04-23 18:07:40 +0200, Tomas Vondra wrote:
> > > Well, the thing is that for prefetching to be possible you actually have
> > > to be a bit behind. Otherwise you can't really look forward which blocks
> > > will be needed, right?
> > > 
> > > IMHO the main use case for prefetching is when there's a spike of activity
> > > on the primary, making the standby fall behind, and then it takes
> > > hours to catch up. I don't think the cases with just a couple of MBs of
> > > lag are the issue prefetching is meant to improve (if it does, great).
> > 
> > I'd be surprised if a good implementation didn't. Even just some smarter
> > IO scheduling in the startup process could help a good bit. E.g. no need
> > to sequentially read the first and then the second block for an update
> > record, if you can issue both at the same time - just about every
> > storage system these days can do a number of IO requests in parallel,
> > and it nearly halves latency effects. And reading a few records (as in a
> > few hundred bytes commonly) ahead allows doing much more than that.
> > 
> 
> I don't disagree with that - prefetching certainly can improve utilization
> of the storage system. The question is whether it can meaningfully improve
> performance of the recovery process in cases when it does not lag. And I
> think it can't (perhaps with remote_apply being an exception).

Well. I think a few dozen records behind doesn't really count as "lag",
and I think that's where it'd start to help (and for some record types
like updates it'd start to help even for single records). It'd convert
scenarios where we'd currently fall behind slowly into scenarios where
we can keep up - but where there's no meaningful lag while we keep up.
What's your argument for me being wrong?
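
As a back-of-the-envelope illustration of the update-record case, here is a toy model of the time replay spends stalled on reads. It is a sketch only, not PostgreSQL code: the function name, the batch-style model, and the 1 ms read latency are all assumptions made for illustration.

```python
def stall_time(latencies_ms, max_in_flight):
    """Time spent waiting on reads when up to max_in_flight are issued at once.

    Toy model (assumption): reads are issued in batches, and a batch
    completes when its slowest read does. Real storage overlaps requests
    more freely, so this is only a rough bound.
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    total = 0.0
    for i in range(0, len(latencies_ms), max_in_flight):
        total += max(latencies_ms[i:i + max_in_flight])
    return total


# An update record touching two blocks, each an assumed 1 ms random read.
update_record = [1.0, 1.0]
serial = stall_time(update_record, max_in_flight=1)    # one read after the other
parallel = stall_time(update_record, max_in_flight=2)  # both issued together
```

Under these assumptions the stall for a single two-block record drops from two read latencies to one, which is the "nearly halves" effect mentioned in the quoted text, and it needs no lookahead beyond the record being replayed.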

And even if we'd keep up without any prefetching, issuing requests in a
more efficient manner allows for more efficient concurrent use of the
storage system. It'll often effectively reduce the amount of random
iops.
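
One way to picture "issuing requests in a more efficient manner" is the sketch below. It is not how the startup process works today and not a proposed patch: the record representation (a list of `(path, block_number)` pairs per record), both function names, and the use of `posix_fadvise` as the prefetch mechanism are assumptions chosen to keep the example short.

```python
import os

BLOCK_SIZE = 8192  # PostgreSQL's default block size


def blocks_to_prefetch(records, window):
    """Distinct (path, block_number) pairs referenced by the next `window` records.

    Each record is assumed to be a list of (path, block_number) pairs.
    Duplicates are dropped and the result is sorted, so that hints for the
    same file are issued together and in block order.
    """
    seen = set()
    for rec in records[:window]:
        seen.update(rec)
    return sorted(seen)


def issue_prefetch(blocks):
    """Ask the kernel to start reading each block; returns the number of hints issued."""
    issued = 0
    for path, blkno in blocks:
        # Opening the file per block keeps the sketch short; a real
        # implementation would keep descriptors open.
        fd = os.open(path, os.O_RDONLY)
        try:
            if hasattr(os, "posix_fadvise"):  # not available on every platform
                os.posix_fadvise(fd, blkno * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
                issued += 1
        finally:
            os.close(fd)
    return issued
```

The hints return immediately, so replay can continue with the current record while the storage system works on several reads at once. Deduplicating and ordering the block list is one plausible reason the number of random I/O operations could go down, though how much depends on the workload.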

Greetings,

Andres Freund


