rebased background worker reimplementation prototype
From | Andres Freund |
---|---|
Subject | rebased background worker reimplementation prototype |
Date | |
Msg-id | 20190611032249.kfi7pgqu2ipmlqca@alap3.anarazel.de |
Responses |
Re: rebased background worker reimplementation prototype
(Tomas Vondra <tomas.vondra@2ndquadrant.com>)
|
List | pgsql-hackers |
Hi,

I've talked a few times about a bgwriter replacement prototype I'd written a few years back. That happened somewhere deep in another thread [1], and is thus not easy to find.

Tomas Vondra asked me for a link, but there was some considerable bitrot since. Attached is a rebased and slightly improved version. It's also available at [2][3].

The basic observation is that there are some fairly fundamental issues with the current bgwriter implementation:

1) The pacing logic is complicated, but doesn't work well
2) If most/all buffers have a usagecount, it cannot do anything, because it doesn't participate in the clock-sweep
3) Backends have to re-discover the now clean buffers.

The prototype is much simpler - in my opinion of course. It has a ringbuffer of buffers it thinks are clean (which might be reused concurrently though). It fills that ringbuffer by performing the clock-sweep and, if necessary, cleaning buffers that have usagecount=pincount=0. Backends can then pop buffers from that ringbuffer.

Pacing works by bgwriter trying to keep the ringbuffer full, and backends emptying the ringbuffer. If the ringbuffer is less than 1/4 full, backends wake up bgwriter using the existing latch mechanism.

The ringbuffer is a pretty simplistic lockless (but just obstruction free, not lock free) implementation, with a lot of unnecessary constraints.

I've had to improve the current instrumentation for bgwriter (i.e. pg_stat_bgwriter) considerably - the details in there imo are not even remotely good enough to actually understand the system (nor are the names understandable). That needs to be split into a separate commit, and the half dozen different implementations of the counters need to be unified.

Obviously this is very prototype-stage code. But I think it's a good starting point for going forward.

To enable it, one currently has to set the bgwriter_legacy = false GUC.
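The pacing scheme described above can be sketched roughly as follows. This is a minimal single-threaded illustration and not code from the patch: every name in it (CleanRing, ring_push, ring_pop, ring_refill, RING_CAPACITY, wakeup_requested) is hypothetical, plain integer IDs stand in for buffers, and a boolean flag stands in for setting bgwriter's latch. The real prototype is lockless and would need atomic operations on the head and tail positions, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here is taken from the patch. */
#define RING_CAPACITY 16u       /* must be a power of two */
#define INVALID_BUF   (-1)

typedef struct CleanRing
{
    int      bufs[RING_CAPACITY]; /* ids of buffers believed to be clean */
    uint32_t head;                /* next slot to pop (backends) */
    uint32_t tail;                /* next slot to push (bgwriter) */
    bool     wakeup_requested;    /* stand-in for setting bgwriter's latch */
} CleanRing;

static void
ring_init(CleanRing *r)
{
    /* the modulo indexing below relies on a power-of-two capacity */
    assert((RING_CAPACITY & (RING_CAPACITY - 1)) == 0);
    r->head = r->tail = 0;
    r->wakeup_requested = false;
}

/* Number of entries currently queued; unsigned wraparound is intended. */
static uint32_t
ring_count(const CleanRing *r)
{
    return r->tail - r->head;
}

/* bgwriter side: queue a buffer it found (or made) clean. */
static bool
ring_push(CleanRing *r, int buf)
{
    if (ring_count(r) == RING_CAPACITY)
        return false;           /* ring is full, nothing more to do */
    r->bufs[r->tail % RING_CAPACITY] = buf;
    r->tail++;
    return true;
}

/*
 * Backend side: take a candidate buffer.  The caller still has to recheck
 * the buffer, since it may have been reused concurrently.  If the ring
 * drops below 1/4 full, ask bgwriter to refill it.
 */
static int
ring_pop(CleanRing *r)
{
    int buf;

    if (ring_count(r) == 0)
    {
        r->wakeup_requested = true;
        return INVALID_BUF;     /* caller falls back to its own sweep */
    }
    buf = r->bufs[r->head % RING_CAPACITY];
    r->head++;
    if (ring_count(r) < RING_CAPACITY / 4)
        r->wakeup_requested = true;
    return buf;
}

/*
 * bgwriter side: fill the ring until it is full.  Consecutive ids starting
 * at next_buf simulate the buffers a clock sweep would hand back.
 * Returns the next unused id.
 */
static int
ring_refill(CleanRing *r, int next_buf)
{
    while (ring_push(r, next_buf))
        next_buf++;
    r->wakeup_requested = false;
    return next_buf;
}
```

In this sketch a backend would call ring_pop() when it needs a victim buffer and fall back to running the clock-sweep itself when INVALID_BUF comes back, while bgwriter would call ring_refill() whenever it is woken. The 1/4 threshold is the one named in the text; the capacity of 16 is an arbitrary choice for illustration.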
Some early benchmarks show that in IO heavy cases there's somewhere between a very mild regression (close to noise), to a pretty considerable improvement. To see a benefit one - fairly obviously - needs a workload that is bigger than shared buffers, because otherwise checkpointer is going to do all writes (and should, it can sort them perfectly!).

It's quite possible to saturate what a single bgwriter can write out (as it is before the replacement). I'm inclined to think the next solution for that is asynchronous IO, and write-combining, rather than multiple bgwriters.

Here's an example pg_stat_bgwriter from the middle of a pgbench run (after resetting it a short while before):

┌─[ RECORD 1 ]───────────────┬───────────────────────────────┐
│ checkpoints_timed          │ 1                             │
│ checkpoints_req            │ 0                             │
│ checkpoint_write_time      │ 179491                        │
│ checkpoint_sync_time       │ 266                           │
│ buffers_written_checkpoint │ 172414                        │
│ buffers_written_bgwriter   │ 475802                        │
│ buffers_written_backend    │ 7140                          │
│ buffers_written_ring       │ 0                             │
│ buffers_fsync_checkpointer │ 137                           │
│ buffers_fsync_bgwriter     │ 0                             │
│ buffers_fsync_backend      │ 0                             │
│ buffers_bgwriter_clean     │ 832616                        │
│ buffers_alloc_preclean     │ 1306572                       │
│ buffers_alloc_free         │ 0                             │
│ buffers_alloc_sweep        │ 4639                          │
│ buffers_alloc_ring         │ 767                           │
│ buffers_ticks_bgwriter     │ 4398290                       │
│ buffers_ticks_backend      │ 17098                         │
│ maxwritten_clean           │ 17                            │
│ stats_reset                │ 2019-06-10 20:17:56.087704-07 │
└────────────────────────────┴───────────────────────────────┘

Note that buffers_written_backend (as buffers_backend before) accounts for file extensions too - which bgwriter can't offload. We should replace that by a non-write (i.e. fallocate) anyway.

Greetings,

Andres Freund

[1] https://postgr.es/m/20160204155458.jrw3crmyscusdqf6%40alap3.anarazel.de
[2] https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/bgwriter-rewrite
[3] https://github.com/anarazel/postgres/tree/bgwriter-rewrite