Обсуждение: Background Writer and performances
Hello List, I’m using BenchmarkSQL to evaluate, characterize and optimize PostgreSQL in transaction processing. I work with PostgreSQL 8.1.3 on RHEL4-AS, Itanium-2 processor, 8GB RAM. The database, generated via BenchmarkSQL and used, is a 200-warehouses database and its size is about 20GB. The best performances are obtained with 9 terminals. My configuration is (I can’t do better…) : - one file system created from a LUN of 34GB, in which all the postmaster logs, WAL files and BenchmarkSQL reports are stored => this corresponds to the disk “sdr” that appears in the charts - one file system created from 4 LUNS of 33GB and a stripped logical volume, on which the data (tables and index) are stored => this corresponds to the disk “sdz”, “sdy”, “sdx”, and “sdw” that appears in the charts My question concerns the Background Writer. The Background process sleeps during “bgwriter_delay” and when it wakes, it scans the shared buffers looking for modified pages. Then it writes these modified pages to disk and evicts those pages from the shared buffers. By doing this, the Background Writer decreases the effects of a CHECKPOINT. I got performances problems with the Background Writer : if I let the default value for the parameters “bgwriter_delay”, “bgwriter_lru_percent”, “bgwriter_lru_maxpages”, “bgwriter_all_percent”, “bgwriter_all_maxpages” or if I set them to the maximum value possible, I don’t see significant differences on the disks activities, disks IO/s, disks write throughput. The tests have been done with one processor, during 10 minutes and with a checkpoint each 5 minutes. Here are my settings from the “postgresql.conf” file: - max_connections = 100 - shared_buffers = 50000 - work_mem = 1024 - max_fsm_pages= 20000 - max_fsm_relations = 1000 - fsync = on - wal_sync_method = fsync - wal_buffers = 16 - checkpoint_segments = 125 - checkpoint_timeout = 300 - effective_cache_size = 218750 - defaults_statistics_target = 1000 ********************************** I would like to send charts to show you exactly what happens on the server but, with the pictures, this e-mail is not posted on the mailing list. I can send charts to a personal e-mail adress if needed. ********************************** The tests were made by setting : - bgwriter_delay = 200 - bgwriter_lru_percent = 1.0 - bgwriter_lru_maxpages = 5 - bgwriter_all_percent = 0.333 - bgwriter_all_maxpages = 5 and then, the opposite : - bgwriter_delay = 50 - bgwriter_lru_percent = 100 - bgwriter_lru_maxpages = 1000 - bgwriter_all_percent = 100 - bgwriter_all_maxpages = 1000 By comparing the charts, I can see that the checkpoints are less expensive in term of Disk activity, IO/s and disk write throughput when the parameters are set to the maximum values but I don’t not reach to have constant disk IO/s, disk activity, disk write throughput before and after a checkpoint. I was expecting to see more activity on the disks during the bench (and not only a peak during the checkpoint) when the parameters are set to the maximum values. Is it possible ? Have you already experimented the Background Writer ? What results could I obtain by setting properly the parameters “bgwriter_xxx” ? Could somenone explain me how I can used it ? Thank you very much for your help. Regards, Alexandra DANTE
On Mon, Jul 10, 2006 at 02:56:48PM +0200, DANTE Alexandra wrote: > ********************************** > I would like to send charts to show you exactly what happens on the > server but, with the pictures, this e-mail is not posted on the mailing > list. > I can send charts to a personal e-mail adress if needed. > ********************************** The best idea is to upload them to a website. > By comparing the charts, I can see that the checkpoints are less > expensive in term of Disk activity, IO/s and disk write throughput when > the parameters are set to the maximum values but I don?t not reach to > have constant disk IO/s, disk activity, disk write throughput before and > after a checkpoint. I was expecting to see more activity on the disks > during the bench (and not only a peak during the checkpoint) when the > parameters are set to the maximum values. Is it possible ? I have very little experience with the bgwriter, but on the whole, I don't think the bgwriter will change the total number of I/Os. Rather, it changes the timing to make them more consistant and the load more even. However, reading the descriptions, there's always going to be an unavoidable spike at checkpoint time because the checkpoint actually needs to sync() the data to disk, whereas the bgwriter merely queues it to the kernel. > Have you already experimented the Background Writer ? What results could > I obtain by setting properly the parameters ?bgwriter_xxx? ? Could > somenone explain me how I can used it ? Looking arond I mostly see that people are still testing. I'm not really in a position to know however... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
On Mon, Jul 10, 2006 at 05:06:56PM +0200, DANTE Alexandra wrote: > Hello List, > > I have uploaded charts on a ftp server. > You can access to these 6 graphs by doing > ftp visibull.frec.bull.fr Or more easily, by putting this in your web-browser: ftp://visibull.frec.bull.fr/PGS_bgwriter/ I'm presuming these graphs are traffic to the disks, right? Do you have a measurement of the requests from postgres? I'm not sure how you'd get that but it'd probably help with understanding the graphs. Maybe there's a way to get the kernel to be more aggressive with pushing pages out to disk? It has a bgwriter too... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Вложения
Hello List, I have uploaded charts on a ftp server. You can access to these 6 graphs by doing ftp visibull.frec.bull.fr login : ftp password : ftp You are under the "/" directory and with "ls" command you see the directory "PGS_bgwriter". In this directory, the 3 charts joined to this e-mail and called “DiskActivity_defaultBgwriter.png”, “DiskIO_defaultBgwriter.png”, “DiskWriteThroughput_defaultBgwriter.png” are obtained by setting : - bgwriter_delay = 200 - bgwriter_lru_percent = 1.0 - bgwriter_lru_maxpages = 5 - bgwriter_all_percent = 0.333 - bgwriter_all_maxpages = 5 and the 3 charts joined to this e-mail and called “DiskActivity_maxBgwriter.png”, “DiskIO_maxBgwriter.png”, “DiskWriteThroughput_maxBgwriter.png” are obtained by setting : - bgwriter_delay = 50 - bgwriter_lru_percent = 100 - bgwriter_lru_maxpages = 1000 - bgwriter_all_percent = 100 - bgwriter_all_maxpages = 1000 I hope someone will give me explanations on the background writer. Thank you very much for your help. Regards, Alexandra DANTE Martijn van Oosterhout wrote: >On Mon, Jul 10, 2006 at 02:56:48PM +0200, DANTE Alexandra wrote: > > >>********************************** >>I would like to send charts to show you exactly what happens on the >>server but, with the pictures, this e-mail is not posted on the mailing >>list. >>I can send charts to a personal e-mail adress if needed. >>********************************** >> >> > >The best idea is to upload them to a website. > > > >>By comparing the charts, I can see that the checkpoints are less >>expensive in term of Disk activity, IO/s and disk write throughput when >>the parameters are set to the maximum values but I don?t not reach to >>have constant disk IO/s, disk activity, disk write throughput before and >>after a checkpoint. I was expecting to see more activity on the disks >>during the bench (and not only a peak during the checkpoint) when the >>parameters are set to the maximum values. Is it possible ? >> >> > >I have very little experience with the bgwriter, but on the whole, I >don't think the bgwriter will change the total number of I/Os. Rather, >it changes the timing to make them more consistant and the load more >even. > >However, reading the descriptions, there's always going to be an >unavoidable spike at checkpoint time because the checkpoint actually >needs to sync() the data to disk, whereas the bgwriter merely queues it >to the kernel. > > > >>Have you already experimented the Background Writer ? What results could >>I obtain by setting properly the parameters ?bgwriter_xxx? ? Could >>somenone explain me how I can used it ? >> >> > >Looking arond I mostly see that people are still testing. I'm not >really in a position to know however... > >Have a nice day, > >
These graphs represent traffic to the disks, and have been generated from a "home-made" tool based on "top", "vmstat" and "iostat". Only PostgreSQL accesses to them, a JVM is launched via BenchmarkSQL but does not access to the disks on which are stored the data. BenchmarkSQL stores its reports on the "sdr" disk. Concerning the logs of the postmaster, I let the defaults values, so I do not have the queries, timing, statements, ... The fsync is activated and the "wal_method_fsync" is "fsync" (by default). Is there a way to be more aggressive with pushing pages out to disk via PostgreSQL ? Regards, Alexandra Martijn van Oosterhout wrote: >On Mon, Jul 10, 2006 at 05:06:56PM +0200, DANTE Alexandra wrote: > > >>Hello List, >> >>I have uploaded charts on a ftp server. >>You can access to these 6 graphs by doing >>ftp visibull.frec.bull.fr >> >> > >Or more easily, by putting this in your web-browser: > >ftp://visibull.frec.bull.fr/PGS_bgwriter/ > >I'm presuming these graphs are traffic to the disks, right? Do you have >a measurement of the requests from postgres? I'm not sure how you'd get >that but it'd probably help with understanding the graphs. > >Maybe there's a way to get the kernel to be more aggressive with >pushing pages out to disk? It has a bgwriter too... > >Have a nice day, > >
On 7/10/2006 9:49 AM, Martijn van Oosterhout wrote: > On Mon, Jul 10, 2006 at 02:56:48PM +0200, DANTE Alexandra wrote: >> ********************************** >> I would like to send charts to show you exactly what happens on the >> server but, with the pictures, this e-mail is not posted on the mailing >> list. >> I can send charts to a personal e-mail adress if needed. >> ********************************** > > The best idea is to upload them to a website. > >> By comparing the charts, I can see that the checkpoints are less >> expensive in term of Disk activity, IO/s and disk write throughput when >> the parameters are set to the maximum values but I don?t not reach to >> have constant disk IO/s, disk activity, disk write throughput before and >> after a checkpoint. I was expecting to see more activity on the disks >> during the bench (and not only a peak during the checkpoint) when the >> parameters are set to the maximum values. Is it possible ? > > I have very little experience with the bgwriter, but on the whole, I > don't think the bgwriter will change the total number of I/Os. Rather, > it changes the timing to make them more consistant and the load more > even. The bgwriter can only "increase" the total amount of IO. What it does is to write dirty pages out before a checkpoint or another backend (due to eviction of the buffer) has to do it. This means that without the bgwriter doing so, there would be a chance that a later update to the same buffer would hit an already dirty buffer as opposed to a now clean one. The upside of this increased write activity is that it happens all the time, spread out between the checkpoints and that this doesn't allow for large buffer cache configurations to accumulate tens of thousands of dirty buffers. The latter is a typical problem with OLTP type benchmarks that are designed more closely to real world behaviour, like the TPC-C and TPC-W. In those benchmarks, hundreds or thousands of simulated users basically go through dialog steps of an application, and just like a real user they don't fill in the form in milliseconds and slam ASAP onto the submit button, they need a bit of time to "think" or "type". In that scenario, the performance drop caused by a checkpoint will let more and more "users" to finish their think/type phase and actually submit the next transaction (dialog step), causing a larger and larger number of concurrent DB requests and basically spiraling down the DB server. The default settings are not sufficient for update intense applications. I am not familiar with BenchmarkSQL, but 9 terminals with a 200 warehouse configuration doesn't sound like it is simulating real user behaviour like outlined above. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #