Thread: Vacuum wait time problem
Hi,

I'm running vacuum full analyze verbose on a table with 20 million rows and 11 indexes. In top, I'm seeing [pdflush] and the postgres: writer process each using different CPU cores, with wait time well above 90% on each of them. The vacuum has been running for several hours, and the last thing to show on screen, over an hour ago, was:

DETAIL: 8577281 index row versions were removed.
736 index pages have been deleted, 736 are currently reusable.
CPU 7.57s/52.52u sec elapsed 381.70 sec.

That's the last index.

The vacuum process itself is using less than 2% of a core. The pg version is 8.3.1 running on SuSE. Hardware is 2x dual-core Opterons, 16 GB RAM, 24 drives in RAID 50.

It would seem to me that the system is extremely IO bound, but I don't know how to find out what specifically is wrong here. Any advice greatly appreciated.

--
Roger Ging
On Fri, Feb 13, 2009 at 10:20 AM, Roger Ging <rging@musicreports.com> wrote:
> Hi,
>
> I'm running vacuum full analyze verbose on a table with 20 million rows and
> 11 indexes. In top, I'm seeing [pdflush] and the postgres: writer process
> each using different CPU cores, with wait time well above 90% on each of
> them. The vacuum has been running for several hours, and the last thing to
> show on screen, over an hour ago, was:
>
> DETAIL: 8577281 index row versions were removed.
> 736 index pages have been deleted, 736 are currently reusable.
> CPU 7.57s/52.52u sec elapsed 381.70 sec.
>
> That's the last index.
>
> The vacuum process itself is using less than 2% of a core.
> The pg version is 8.3.1 running on SuSE. Hardware is 2x dual-core Opterons,
> 16 GB RAM, 24 drives in RAID 50.
>
> It would seem to me that the system is extremely IO bound, but I don't know
> how to find out what specifically is wrong here. Any advice greatly
> appreciated.

A couple of questions. Why vacuum full as opposed to vacuum (regular)? Why 8.3.1, which has known bugs, instead of 8.3.latest? What do "vmstat 10" and "iostat -x 10" have to say about your drive arrays while this vacuum is running?
Scott,

I can only answer a couple of the questions at the moment. I had to kill the vacuum full and do a regular vacuum, so I can't get the iostat and vmstat outputs right now. This message is the reason I was trying to run vacuum full:

INFO: "license": found 257 removable, 20265895 nonremovable row versions in 1088061 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 18434951 unused item pointers.
687274 pages contain useful free space.
0 pages are entirely empty.
CPU 38.15s/37.02u sec elapsed 621.19 sec.
WARNING: relation "licensing.license" contains more than "max_fsm_pages" pages with useful free space
HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages".

A clean restore of the database to another server created a size on disk of about 244GB. This server was at over 400GB yesterday, and now, after aggressive vacuuming by hand, is down to 350GB. It had gotten so bad that the backup was not finished when I got in yesterday, almost 8 hours after it started.

The machine has been under heavy load 24/7 for a couple of months, so I have not been able to upgrade versions. I am taking it offline this weekend and will install the latest. I'll try to re-create the scenario I had going on yesterday over the weekend and get some io statistics.

Roger

Scott Marlowe wrote:
> On Fri, Feb 13, 2009 at 10:20 AM, Roger Ging <rging@musicreports.com> wrote:
>> Hi,
>>
>> I'm running vacuum full analyze verbose on a table with 20 million rows
>> and 11 indexes. In top, I'm seeing [pdflush] and the postgres: writer
>> process each using different CPU cores, with wait time well above 90% on
>> each of them. The vacuum has been running for several hours, and the last
>> thing to show on screen, over an hour ago, was:
>>
>> DETAIL: 8577281 index row versions were removed.
>> 736 index pages have been deleted, 736 are currently reusable.
>> CPU 7.57s/52.52u sec elapsed 381.70 sec.
>>
>> That's the last index.
>>
>> The vacuum process itself is using less than 2% of a core.
>> The pg version is 8.3.1 running on SuSE. Hardware is 2x dual-core
>> Opterons, 16 GB RAM, 24 drives in RAID 50.
>>
>> It would seem to me that the system is extremely IO bound, but I don't
>> know how to find out what specifically is wrong here. Any advice greatly
>> appreciated.
>
> A couple of questions. Why vacuum full as opposed to vacuum (regular)?
> Why 8.3.1, which has known bugs, instead of 8.3.latest? What do
> "vmstat 10" and "iostat -x 10" have to say about your drive arrays while
> this vacuum is running?
On Fri, Feb 13, 2009 at 2:56 PM, Roger Ging <rging@musicreports.com> wrote:
> Scott,
>
> I can only answer a couple of the questions at the moment. I had to kill
> the vacuum full and do a regular vacuum, so I can't get the iostat and
> vmstat outputs right now. This message is the reason I was trying to run
> vacuum full:
>
> INFO: "license": found 257 removable, 20265895 nonremovable row versions
> in 1088061 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 18434951 unused item pointers.
> 687274 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 38.15s/37.02u sec elapsed 621.19 sec.
> WARNING: relation "licensing.license" contains more than "max_fsm_pages"
> pages with useful free space
> HINT: Consider using VACUUM FULL on this relation or increasing the
> configuration parameter "max_fsm_pages".

Yeah, that's pretty bad. ~20 million live rows alongside ~18 million unused item pointers is a pretty badly bloated table.

Vacuum full is one way to reclaim that lost space. You can also dump and restore that one table, inside a drop/create-constraints transaction during maintenance if you can. A vacuum full is quite intrusive, so avoid it during normal working hours. Dumping and reloading the whole db may be faster than either a vacuum full or a cluster. A common trick is to do something like:

begin;
select * into ordermydata from bigbloatedtable order by some_field;
delete from bigbloatedtable;
insert into bigbloatedtable select * from ordermydata;
commit;

This will both put your table in some order, which might help, and remove the bloat.

> A clean restore of the database to another server created a size on disk
> of about 244GB. This server was at over 400GB yesterday, and now, after
> aggressive vacuuming by hand, is down to 350GB. It had gotten so bad that
> the backup was not finished when I got in yesterday, almost 8 hours after
> it started.

Sounds like either autovacuum isn't running right, or it was turned off and no decent vacuum schedule was set up by the dba. Or the FSM is too small, but judging by how bloated this table is, I'd guess vacuuming got turned off.

> The machine has been under heavy load 24/7 for a couple of months, so I
> have not been able to upgrade versions. I am taking it offline this
> weekend and will install the latest. I'll try to re-create the scenario I
> had going on yesterday over the weekend and get some io statistics.

Definitely look into the check_postgres.pl script for Nagios, or something similar, to keep track of bloated tables. Every Friday I log into my production servers and fire off a few of the queries from that script to check for bloat and make sure everything's running fine. Another useful trick is to email yourself a copy of the last 20 or 30 lines of vacuum verbose for the whole db every so often (once a week) and pay attention to your fsm usage.
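[Editor's note: the weekly "email yourself the tail of vacuum verbose" trick above can be sketched as a cron entry. The schedule, recipient address, and mail command are placeholders; vacuumdb's --all/--analyze/--verbose flags are real, and its verbose output goes to stderr, hence the redirect.]

```
# crontab fragment (sketch): every Monday at 03:00, vacuum/analyze all
# databases and mail the last 30 lines of output, which in 8.3 include
# the free space map usage summary.
0 3 * * 1  vacuumdb --all --analyze --verbose 2>&1 | tail -n 30 | mail -s "weekly vacuum report" dba@example.com
```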
On Fri, Feb 13, 2009 at 2:56 PM, Roger Ging <rging@musicreports.com> wrote:

Oh yeah, also, any chance of testing your RAID array in RAID-10 some day? RAID-5 anything tends to be pretty slow at writes, especially random ones, and RAID-10 may give you a lot more bandwidth where you need it, on the write side of the equation.
Hi Roger,

On Fri, Feb 13, 2009 at 01:56:32PM -0800, Roger Ging wrote:
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Please don't post HTML mails to mailing lists. Thanks.

> I can only answer a couple of the questions at the moment. I had to
> kill the vacuum full and do a regular vacuum, so I can't get the iostat
> and vmstat outputs right now. This message is the reason I was trying
> to run vacuum full:
>
> INFO: "license": found 257 removable, 20265895 nonremovable row
> versions in 1088061 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 18434951 unused item pointers.
> 687274 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 38.15s/37.02u sec elapsed 621.19 sec.
> WARNING: relation "licensing.license" contains more than
> "max_fsm_pages" pages with useful free space
> HINT: Consider using VACUUM FULL on this relation or increasing the
> configuration parameter "max_fsm_pages".

Just raise max_fsm_pages to a reasonable value (I've heard a rule of thumb somewhere: 65536 per GB of data), then do a regular vacuum. The space reclaimed by the vacuum might not become available to the OS, but it will be reused. The above warning tells you that the relation will probably start growing rather fast, because unused space within the table cannot be tracked.

> A clean restore of the database to another server created a size on disk
> of about 244GB. This server was at over 400GB yesterday, and now,
> after aggressive vacuuming by hand, is down to 350GB. It had gotten so
> bad that the backup was not finished when I got in yesterday, almost 8
> hours after it started.

In my experience, dump/restore is faster than VACUUM FULL. I've had a VACUUM FULL running for about 3 days some time ago; dump/restore took less than 12 hours (about a 200 GB database on disk after restore). Really, VACUUM FULL is a last resort after something has been screwed up.

HTH,
Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
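[Editor's note: the rule of thumb Tino quotes (roughly 65536 FSM pages tracked per GB of data) is easy to turn into a back-of-the-envelope calculation. This is only an illustrative sketch; the function name is made up, and 244 GB is the size of Roger's cleanly restored database.]

```python
def suggested_max_fsm_pages(db_size_gb, pages_per_gb=65536):
    """Rough max_fsm_pages estimate from the 65536-pages-per-GB rule of thumb."""
    return db_size_gb * pages_per_gb

# Roger's cleanly restored database was about 244 GB on disk:
print(suggested_max_fsm_pages(244))  # 15990784
```

Compare that against the 1088061 pages reported for licensing.license alone: a default-sized FSM has no chance of tracking the free space in a database this large.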
On Fri, Feb 13, 2009 at 03:13:28PM -0700, Scott Marlowe wrote:

[...]

> Yeah, that's pretty bad. ~20 million live rows alongside ~18 million
> unused item pointers is a pretty badly bloated table.
>
> Vacuum full is one way to reclaim that lost space. You can also dump
> and restore that one table, inside a drop/create-constraints
> transaction during maintenance if you can. A vacuum full is quite
> intrusive, so avoid it during normal working hours. Dumping and
> reloading the whole db may be faster than either a vacuum full or a
> cluster. A common trick is to do something like:
>
> begin;
> select * into ordermydata from bigbloatedtable order by some_field;
> delete from bigbloatedtable;
> insert into bigbloatedtable select * from ordermydata;
> commit;
>
> This will both put your table in some order, which might help, and
> remove the bloat.

Really? Wouldn't that add even more bloat? How does that work? (I'd expect a drop table/create table instead of the delete...)

Note: I suppose that you know a lot more about PG than I do, so I'm just curious.

Thanks,
Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."
www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
On Fri, Feb 13, 2009 at 3:34 PM, Tino Schwarze <postgresql@tisc.de> wrote:
> On Fri, Feb 13, 2009 at 03:13:28PM -0700, Scott Marlowe wrote:
>
> [...]
>
>> Dumping and reloading the whole db may be faster than either a vacuum
>> full or a cluster. A common trick is to do something like:
>>
>> begin;
>> select * into ordermydata from bigbloatedtable order by some_field;
>> delete from bigbloatedtable;
>> insert into bigbloatedtable select * from ordermydata;
>> commit;
>>
>> This will both put your table in some order, which might help, and
>> remove the bloat.
>
> Really? Wouldn't that add even more bloat? How does that work? (I'd
> expect a drop table/create table instead of the delete...)
> Note: I suppose that you know a lot more about PG than I do, so I'm just
> curious.

Whoops, I meant "truncate bigbloatedtable;". Sheesh, long week.
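[Editor's note: putting Scott's correction together with his earlier sketch, the full maintenance-window rewrite might look like the following. Table and column names are the placeholders from the thread; note that TRUNCATE takes an exclusive lock and that it is transactional in PostgreSQL, so a failure anywhere rolls the whole thing back.]

```sql
BEGIN;
-- Copy the rows out in the desired physical order.
SELECT * INTO ordermydata FROM bigbloatedtable ORDER BY some_field;
-- TRUNCATE (not DELETE) releases the old pages immediately instead of
-- leaving millions of dead tuples behind for vacuum to clean up.
TRUNCATE bigbloatedtable;
INSERT INTO bigbloatedtable SELECT * FROM ordermydata;
DROP TABLE ordermydata;
COMMIT;
```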
On Friday, 13 February 2009, Roger Ging wrote:
> I'm running vacuum full analyze verbose on a table with 20 million
> rows and 11 indexes. In top, I'm seeing [pdflush] and the postgres:
> writer process each using different CPU cores, with wait time well
> above 90% on each of them. The vacuum has been running for several
> hours

Roger, I had the same issue some time ago and wondered why it was so slow. I did "iostat -kx 5 555" and saw that I/O was also quite low.

vacuum_cost_delay = 0

That was the trick for me. It was set to 250 (ms), and a vacuum took 5 hours to run. Now it takes 5-15 minutes.

And for your bloated table, you should also REINDEX afterwards, because the indexes will likely be a mess as well.

Regards,
zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
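[Editor's note: for reference, the postgresql.conf knobs being discussed look something like this. The values are only a sketch of the settings mentioned in the thread, not a recommendation; reload the server config after changing them.]

```
# postgresql.conf -- cost-based vacuum throttling (8.3)
vacuum_cost_delay = 0        # ms; 0 disables throttling of manual VACUUM
#vacuum_cost_delay = 10      # a small non-zero value is usually plenty
vacuum_cost_limit = 200      # work credits accumulated before each sleep
autovacuum_vacuum_cost_delay = 20   # autovacuum has its own delay setting
```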
On Fri, Feb 13, 2009 at 5:02 PM, Michael Monnerie <michael.monnerie@is.it-management.at> wrote:
> On Friday, 13 February 2009, Roger Ging wrote:
>> I'm running vacuum full analyze verbose on a table with 20 million
>> rows and 11 indexes. In top, I'm seeing [pdflush] and the postgres:
>> writer process each using different CPU cores, with wait time well
>> above 90% on each of them. The vacuum has been running for several
>> hours
>
> Roger, I had the same issue some time ago and wondered why it was so
> slow. I did "iostat -kx 5 555" and saw that I/O was also quite low.
>
> vacuum_cost_delay = 0
> That was the trick for me. It was set to 250 (ms), and a vacuum took
> 5 hours to run. Now it takes 5-15 minutes.

Wow!!! 250 ms is HUGE in the scheme of vacuum cost delay. Even 10 ms is usually plenty to slow vacuum down enough to keep it out of your way, and that will already double to quadruple your vacuum times. 250 is like taking a nap every 5 feet while running a mile. :)
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Fri, Feb 13, 2009 at 5:02 PM, Michael Monnerie
> <michael.monnerie@is.it-management.at> wrote:
>> vacuum_cost_delay = 0
>> That was the trick for me. It was set to 250 (ms), and a vacuum took
>> 5 hours to run. Now it takes 5-15 minutes.

> Wow!!! 250 ms is HUGE in the scheme of vacuum cost delay. Even 10 ms
> is usually plenty to slow down vacuum enough to keep it out of your
> way and double to quadruple your vacuum times.

I wonder whether we ought to tighten the allowed range of vacuum_cost_delay. The upper limit is 1000ms at the moment; but that's clearly much higher than is useful, and it seems to encourage people to pick silly values ...

regards, tom lane
On Fri, Feb 13, 2009 at 7:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Scott Marlowe <scott.marlowe@gmail.com> writes:
>> On Fri, Feb 13, 2009 at 5:02 PM, Michael Monnerie
>> <michael.monnerie@is.it-management.at> wrote:
>>> vacuum_cost_delay = 0
>>> That was the trick for me. It was set to 250 (ms), and a vacuum took
>>> 5 hours to run. Now it takes 5-15 minutes.
>
>> Wow!!! 250 ms is HUGE in the scheme of vacuum cost delay. Even 10 ms
>> is usually plenty to slow down vacuum enough to keep it out of your
>> way and double to quadruple your vacuum times.
>
> I wonder whether we ought to tighten the allowed range of
> vacuum_cost_delay. The upper limit is 1000ms at the moment;
> but that's clearly much higher than is useful, and it seems
> to encourage people to pick silly values ...

I agree. I can't imagine using a number over 50 or so.
On Fri, Feb 13, 2009 at 7:24 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Feb 13, 2009 at 7:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Scott Marlowe <scott.marlowe@gmail.com> writes:
>>> Wow!!! 250 ms is HUGE in the scheme of vacuum cost delay. Even 10 ms
>>> is usually plenty to slow down vacuum enough to keep it out of your
>>> way and double to quadruple your vacuum times.
>>
>> I wonder whether we ought to tighten the allowed range of
>> vacuum_cost_delay. The upper limit is 1000ms at the moment;
>> but that's clearly much higher than is useful, and it seems
>> to encourage people to pick silly values ...
>
> I agree. I can't imagine using a number over 50 or so.

Although I'd probably just emit a log warning for anything over that, saying that values over 50 will result in very, very long vacuum times.
>>> Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Fri, Feb 13, 2009 at 7:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Scott Marlowe <scott.marlowe@gmail.com> writes:
>>> Wow!!! 250 ms is HUGE in the scheme of vacuum cost delay. Even
>>> 10 ms is usually plenty to slow down vacuum enough to keep it out
>>> of your way and double to quadruple your vacuum times.
>>
>> I wonder whether we ought to tighten the allowed range of
>> vacuum_cost_delay. The upper limit is 1000ms at the moment;
>> but that's clearly much higher than is useful, and it seems
>> to encourage people to pick silly values ...
>
> I agree. I can't imagine using a number over 50 or so.

I don't know what other people have found useful, but when I experimented with this in our environment, it seemed like I should just treat vacuum_cost_delay as a boolean, where 0 meant off and 10 meant on, and tune it by adjusting vacuum_cost_limit. The granularity of vacuum_cost_delay is coarse and surprisingly unpredictable.

-Kevin
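[Editor's note: Kevin's approach, expressed as a config sketch. The idea is to fix the delay at a small "on" value and do the actual throttling through the cost limit; the limit value below is purely illustrative.]

```
vacuum_cost_delay = 10      # effectively "on"; leave this alone
vacuum_cost_limit = 1000    # raise to vacuum faster, lower to throttle harder
```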
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I don't know what other people have found useful, but when I
> experimented with this in our environment, it seemed like I should
> just treat vacuum_cost_delay as a boolean, where 0 meant off and 10
> meant on, and tune it by adjusting vacuum_cost_limit. The granularity
> of vacuum_cost_delay is coarse and surprisingly unpredictable.

Making it a boolean is a bit further than I care to go ;-)

What I'd suggest at this point is changing the upper limit to 100ms (from 1000) and adding documentation suggesting that the value should be kept small, preferring to use the other vacuum_cost parameters to tune the behavior.

regards, tom lane