Re: WAL prefetch
From | Andres Freund |
---|---|
Subject | Re: WAL prefetch |
Date | |
Msg-id | 20180619164415.ta6q47vwvyzcjwjo@alap3.anarazel.de |
In response to | Re: WAL prefetch (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
List | pgsql-hackers |
On 2018-06-19 19:34:22 +0300, Konstantin Knizhnik wrote:
> On 19.06.2018 18:50, Andres Freund wrote:
> > On 2018-06-19 12:08:27 +0300, Konstantin Knizhnik wrote:
> > > I do not think that prefetching in shared buffers requires much more efforts
> > > and make patch more envasive...
> > > It even somehow simplify it, because there is no to maintain own cache of
> > > prefetched pages...
> > > But it will definitely have much more impact on Postgres performance:
> > > contention for buffer locks, throwing away pages accessed by read-only
> > > queries,...
> > These arguments seem bogus to me. Otherwise the startup process is going
> > to do that work.
>
> There is just one process replaying WAL. Certainly it has some impact on hot
> standby query execution.
> But if there will be several prefetch workers (128???) then this impact will
> be dramatically increased.

Hence me suggesting how you can do that with one process (re locking). I
still entirely fail to see how "throwing away pages accessed by
read-only queries" is meaningful here - the startup process is going to
read the data anyway, and we *do not* want to use a ringbuffer as that'd
make the situation dramatically worse.

> Well, originally it was proposed by Sean - the author of pg-prefaulter. I
> just ported it from GO to C using standard PostgreSQL WAL iterator.
> Then I performed some measurements and didn't find some dramatic improvement
> in performance (in case of synchronous replication) or reducing replication
> lag for asynchronous replication neither at my desktop (SSD, 16Gb RAM, local
> replication within same computer, pgbench scale 1000), neither at pair of
> two powerful servers connected by
> InfiniBand and 3Tb NVME (pgbench with scale 100000).
> Also I noticed that read rate at replica is almost zero.
> What does it mean:
> 1. I am doing something wrong.
> 2. posix_prefetch is not so efficient.
> 3. pgbench is not right workload to demonstrate effect of prefetch.
> 4. Hardware which I am using is not typical.

I think it's probably largely a mix of 3 and 4. pgbench with random
distribution probably indeed is a bad testcase, because either
everything is in cache or just about every write ends up as a full page
write because of the scale. You might want to try a) turn off full page
writes b) use a less random distribution.

> So it make me think when such prefetch may be needed... And it caused new
> questions:
> I wonder how frequently checkpoint interval is much larger than OS
> cache?

Extremely common.

> If we enforce full pages writes (let's say each after each 1Gb), how it
> affect wal size and performance?

Extremely badly. If you look at stats of production servers (using
pg_waldump) you can see that a large percentage of the total WAL volume
is FPWs, that FPWs are a storage / bandwidth / write issue, and that
higher FPW rates after a checkpoint correlate strongly negatively with
performance.

Greetings,

Andres Freund
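
The remark that with a uniformly random pgbench run "just about every
write ends up as a full page write because of the scale" can be
illustrated with a back-of-the-envelope model. The sketch below is not
PostgreSQL code and measures nothing; it assumes each update touches
exactly one page chosen uniformly at random, and that the first touch of
a page after a checkpoint is the one that produces a full page write.
The function name and the page and update counts are hypothetical,
chosen only for illustration.

```python
import math


def expected_fpw_fraction(num_pages: int, num_updates: int) -> float:
    """Expected share of updates that are the first touch of their page.

    Simplified model (an assumption, not PostgreSQL's actual logic):
    every update hits one page picked uniformly at random, and only the
    first update of a page since the checkpoint emits a full page write.
    """
    if num_pages < 1 or num_updates < 1:
        raise ValueError("num_pages and num_updates must be positive")
    if num_pages == 1:
        distinct = 1.0
    else:
        # A page stays untouched after k updates with probability
        # (1 - 1/P)^k; log1p/expm1 keep this accurate for large P.
        untouched_log = num_updates * math.log1p(-1.0 / num_pages)
        distinct = num_pages * (-math.expm1(untouched_log))
    return distinct / num_updates


if __name__ == "__main__":
    # Hypothetical inputs: a small table that fits in cache versus a
    # large one where few pages are hit twice between checkpoints.
    for pages, updates in [(1_000, 100_000), (10_000_000, 1_000_000)]:
        share = expected_fpw_fraction(pages, updates)
        print(f"pages={pages} updates={updates} fpw_share={share:.3f}")
```

Under these assumptions the share stays close to 1 whenever the number
of pages is much larger than the number of updates per checkpoint cycle,
and drops toward zero when the working set is small. A skewed (less
random) access distribution would lower the share for the same table
size, which is the motivation for suggestion b) above.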