Re: postgresql latency & bgwriter not doing its job
From | Andres Freund |
---|---|
Subject | Re: postgresql latency & bgwriter not doing its job |
Date | |
Msg-id | 20140830180405.GB25523@awork2.anarazel.de |
In reply to | Re: postgresql latency & bgwriter not doing its job (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: postgresql latency & bgwriter not doing its job
(Tom Lane <tgl@sss.pgh.pa.us>)
Re: postgresql latency & bgwriter not doing its job (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On 2014-08-30 13:50:40 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-08-27 19:23:04 +0300, Heikki Linnakangas wrote:
> >> A long time ago, Itagaki Takahiro wrote a patch sort the buffers and write
> >> them out in order (http://www.postgresql.org/message-id/flat/20070614153758.6A62.ITAGAKI.TAKAHIRO@oss.ntt.co.jp).
> >> The performance impact of that was inconclusive, but one thing that it
> >> allows nicely is to interleave the fsyncs, so that you write all the buffers
> >> for one file, then fsync it, then next file and so on.
>
> > ...
> > So, *very* clearly sorting is a benefit.
>
> pg_bench alone doesn't convince me on this. The original thread found
> cases where it was a loss, IIRC; you will need to test many more than
> one scenario to prove the point.

Sure. And I'm not claiming Itagaki/your old patch is immediately going
to be ready for commit. But our checkpoint performance has sucked for
years in the field. I don't think we can wave that away.

I think the primary reason it wasn't easily visible as being beneficial
back then was that only the throughput, not the latency and such, was
looked at.

> Also, it does not matter how good it looks in test cases if it causes
> outright failures due to OOM; unlike you, I am not prepared to just "wave
> away" that risk.

I'm not "waving away" any risks.

If the sort buffer is allocated when the checkpointer is started, not
every time we sort, as you've done in your version of the patch, I think
that risk is pretty manageable. If we really want to be sure nothing is
happening at runtime, even if the checkpointer was restarted, we can put
the sort array in shared memory.

We're talking about (sizeof(BufferTag) + sizeof(int))/8192 ~= 0.3 %
overhead over shared_buffers here. If we decide to go that way, it's a
pretty darn small price to pay to not regularly have stalls that last
minutes.
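The 0.3 % figure above can be checked with a few lines of arithmetic. The following is only a sketch: the 20-byte BufferTag and 4-byte int are assumed sizes that the message does not state, and sort_array_overhead is a hypothetical helper name, not anything in PostgreSQL.

```python
# Assumed sizes (not stated in the message): a 20-byte BufferTag and a
# 4-byte int per entry in the sort array, with the default 8192-byte block.
BUFFER_TAG_BYTES = 20
INT_BYTES = 4
BLOCK_BYTES = 8192


def sort_array_overhead(shared_buffers_bytes):
    """Bytes needed for one sort entry per buffer in shared_buffers.

    Hypothetical helper for illustration only.
    """
    n_buffers = shared_buffers_bytes // BLOCK_BYTES
    return n_buffers * (BUFFER_TAG_BYTES + INT_BYTES)


# Per-buffer overhead relative to the buffer itself: 24 / 8192.
ratio = (BUFFER_TAG_BYTES + INT_BYTES) / BLOCK_BYTES

if __name__ == "__main__":
    print(f"overhead ratio: {ratio:.4%}")
    print(f"8 GiB shared_buffers: {sort_array_overhead(8 * 1024**3)} bytes")
```

Under those assumed sizes, 8 GiB of shared_buffers would need a 24 MiB sort array, which is the order of magnitude the estimate implies.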
> A possible compromise is to sort a limited number of
> buffers --- say, collect a few thousand dirty buffers then sort, dump and
> fsync them, repeat as needed.

Yea, that's what I suggested nearby. But I don't really like it, because
it robs us of the chance to fsync() a relfilenode immediately after
having synced all its buffers. Doing so is rather beneficial because
then fewer independently dirtied pages have to be flushed out - reducing
the impact of the fsync().

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
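The ordering discussed in this message (sort all dirty buffers, write one file's buffers, fsync that file, then move to the next) can be illustrated roughly as below. This is a toy sketch on ordinary files, not the patch or the checkpointer code: checkpoint_sorted, the (file_id, block_no, data) tuples and the paths mapping are all hypothetical names invented for the illustration, and os.pwrite is available on Unix only.

```python
import os
from itertools import groupby
from operator import itemgetter

BLOCK_SIZE = 8192  # assumed block size


def checkpoint_sorted(dirty, paths):
    """Write dirty blocks file by file and fsync each file right after.

    dirty: list of (file_id, block_no, data) tuples (hypothetical layout).
    paths: dict mapping file_id to a filesystem path.
    Returns a log of the operations in the order they were issued.
    """
    log = []
    # Sort by file first, then by block number within the file.
    ordered = sorted(dirty, key=itemgetter(0, 1))
    for file_id, group in groupby(ordered, key=itemgetter(0)):
        fd = os.open(paths[file_id], os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            for _, block_no, data in group:
                os.pwrite(fd, data, block_no * BLOCK_SIZE)
                log.append(("write", file_id, block_no))
            # fsync immediately after this file's last write, before any
            # other file is touched.
            os.fsync(fd)
            log.append(("fsync", file_id))
        finally:
            os.close(fd)
    return log
```

Sorting only a few thousand buffers at a time, as in the quoted compromise, would mean a file's blocks can be spread over several batches, so the fsync for that file could not be issued once right after its last write.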