Re: Checkpoint cost, looks like it is WAL/CRC
| From | Simon Riggs |
|---|---|
| Subject | Re: Checkpoint cost, looks like it is WAL/CRC |
| Date | |
| Msg-id | 1122502691.3670.219.camel@localhost.localdomain |
| In response to | Re: Checkpoint cost, looks like it is WAL/CRC (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Checkpoint cost, looks like it is WAL/CRC |
| List | pgsql-hackers |
On Tue, 2005-07-26 at 19:15 -0400, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> >> We should run tests with much higher wal_buffers numbers to nullify the
> >> effect described above and reduce contention. That way we will move
> >> towards the log disk speed being the limiting factor, patch or no patch.
>
> > I've run such tests, at a glance they do seem to improve performance. I
> > need some time to collate the results.
>
> With larger wal_buffers values it might also be interesting to take some
> measures to put a larger share of the WAL writing burden on the bgwriter.
>
> Currently the bgwriter only writes out WAL buffers in two scenarios:
>
> 1. It wants to write a dirty shared buffer that has LSN beyond the
> current WAL flush marker. Just like any backend, the bgwriter must
> flush WAL as far as the LSN before writing the buffer.
>
> 2. The bgwriter is completing a checkpoint. It must flush WAL as far as
> the checkpoint record before updating pg_control.
>
> It might be interesting to add some logic to explicitly check for and
> write out any full-but-unwritten WAL buffers during the bgwriter's
> main loop.
>
> In a scenario with many small transactions, this is probably a waste of
> effort since backends will be forcing WAL write/flush any time they
> commit. (This is why I haven't pursued the idea already.) However,
> given a large transaction and adequate wal_buffer space, such a tactic
> should offload WAL writing work nicely.
>
> I have no idea whether the DBT benchmarks would benefit at all, but
> given that they are affected positively by increasing wal_buffers,
> they must have a fair percentage of not-small transactions.

Yes, I was musing on that also. I think it would help keep response time even, which seems to be the route to higher performance anyway. This is more important in the real world than in benchmarks, where a nice even stream of commits arrives to save the day...
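
To make the quoted proposal concrete, here is a minimal Python sketch of scenario 1 (WAL must reach the page's LSN before the page is written) and of the suggested extra step of writing out only completely filled WAL buffers. It is a toy model, not PostgreSQL code: `ToyWal`, `WAL_BLOCK_SIZE`, `write_up_to`, `write_dirty_page` and `write_full_wal_buffers` are all hypothetical names, the block size is an assumed value, and "write" and "flush" are merged into one step for simplicity.

```python
from dataclasses import dataclass, field

WAL_BLOCK_SIZE = 8192  # assumed size of one WAL buffer in this toy model


@dataclass
class ToyWal:
    """Toy WAL stream: positions are byte offsets that double as LSNs."""
    insert_pos: int = 0                         # bytes of WAL generated so far
    write_pos: int = 0                          # bytes already written out
    writes: list = field(default_factory=list)  # log of (start, end) writes

    def insert(self, nbytes):
        """Append a record and return its LSN (the end position)."""
        self.insert_pos += nbytes
        return self.insert_pos

    def write_up_to(self, lsn):
        """Write WAL through lsn; a no-op if that range is already written."""
        if lsn > self.write_pos:
            self.writes.append((self.write_pos, lsn))
            self.write_pos = lsn


def write_dirty_page(wal, page_lsn):
    """Scenario 1: get WAL out through the page's LSN before the page itself."""
    wal.write_up_to(page_lsn)
    return wal.write_pos >= page_lsn  # True means the page may now be written


def write_full_wal_buffers(wal):
    """Proposed extra step: write only WAL blocks that are completely full."""
    full_end = (wal.insert_pos // WAL_BLOCK_SIZE) * WAL_BLOCK_SIZE
    wal.write_up_to(full_end)
    return wal.write_pos
```

In this model a large transaction that has generated 20000 bytes of WAL would have its first two full blocks written by `write_full_wal_buffers`, leaving only the partial third block for whoever next needs it on disk.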

I guess I'd be concerned that the poor bgwriter can't do all of this work. I was thinking about a separate log writer, so we could have both bgwriter and logwriter active simultaneously on I/O. It has taken a while to get bgwriter to perform its duties efficiently, so I'd rather not give it so many that it performs them all badly.

The logwriter would be more of a helper, using LWLockConditionalAcquire to see if the WALWriteLock was kept active. Each backend would still perform its own commit write. (We could change that in the future, but that's a lot more work.)

We would only need one new GUC log_writer_delay, defaulting to 50 ms (??) - if set to zero, the default, then we don't spawn a logwriter daemon at all. (Perhaps we also need another one to say how many blocks get written each time it's active... but I'm not hugely in favour of more parameters to get wrong).

That way we could take the LWLockConditionalAcquire on WALWriteLock out of the top of XLogInsert, which was effectively doing that work.

I think this would also reduce the apparent need for high wal_buffer settings - probably could get away with a lot less than the 2048 that recent performance results would suggest.

Best Regards, Simon Riggs
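
The logwriter idea could be sketched roughly as follows. This is a Python toy under stated assumptions, not PostgreSQL code: `threading.Lock` stands in for WALWriteLock, `acquire(blocking=False)` plays the role of LWLockConditionalAcquire, and `ToyWalState`, `logwriter_pass` and `start_logwriter` are hypothetical names. The only setting modelled is the proposed `log_writer_delay`, where zero means no logwriter is started.

```python
import threading


class ToyWalState:
    """Toy shared state: a write lock plus counters of WAL blocks."""
    def __init__(self):
        self.write_lock = threading.Lock()  # stand-in for WALWriteLock
        self.pending = 0                    # full blocks not yet written
        self.written = 0                    # blocks written so far


def logwriter_pass(state):
    """One helper pass: write pending blocks only if the lock is free."""
    # Conditional acquire: if a backend is already writing, skip this pass
    # instead of queueing behind it.
    if not state.write_lock.acquire(blocking=False):
        return 0
    try:
        n = state.pending
        state.written += n
        state.pending = 0
        return n
    finally:
        state.write_lock.release()


def start_logwriter(state, log_writer_delay_ms, stop_event):
    """Start the helper thread, or return None when the delay is zero."""
    if log_writer_delay_ms == 0:
        return None  # a zero delay means no logwriter at all

    def loop():
        # Event.wait returns False on timeout, so one pass runs per delay
        # interval until stop_event is set.
        while not stop_event.wait(log_writer_delay_ms / 1000.0):
            logwriter_pass(state)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because the helper never blocks on the lock, a backend performing its own commit write is not delayed by it; the helper only picks up work when nobody else is writing.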