Discussion: autovacuum stuck on a table for 18+ hours, consuming lots of CPU time
Greetings,
I'm running PostgreSQL-9.0.4 on a Linux-x86_64 cluster with one master and
two streaming replication slaves.  Since late yesterday, the load on the
server has been noticeably higher (5.00+) than normal (generally under 1.00).
I investigated, and found that for the past ~18 hours there's one autovacuum
process that has been running without making any obvious progress:

select procpid,query_start,current_query from pg_stat_activity where current_query LIKE 'autovacuum%' ;
 procpid |          query_start          |              current_query
---------+-------------------------------+-----------------------------------------
   30188 | 2011-11-21 22:42:26.426315-08 | autovacuum: VACUUM public.nppsmoketests
(1 row)

select c.oid,c.relname,l.pid,l.mode,l.granted from pg_class c join pg_locks l on c.oid=l.relation where l.pid='30188' order by l.pid;
  oid  |              relname               |  pid  |           mode           | granted
-------+------------------------------------+-------+--------------------------+---------
 72112 | nppsmoketests                      | 30188 | ShareUpdateExclusiveLock | t
 72617 | nppsmoketests_pkey                 | 30188 | RowExclusiveLock         | t
 72619 | nppsmoketests_bug_idx              | 30188 | RowExclusiveLock         | t
 72620 | nppsmoketests_active_idx           | 30188 | RowExclusiveLock         | t
 72621 | nppsmoketests_arch_idx             | 30188 | RowExclusiveLock         | t
 72622 | nppsmoketests_branch_idx           | 30188 | RowExclusiveLock         | t
 72623 | nppsmoketests_current_status_idx   | 30188 | RowExclusiveLock         | t
 72624 | nppsmoketests_build_type_idx       | 30188 | RowExclusiveLock         | t
 72625 | nppsmoketests_gpu_idx              | 30188 | RowExclusiveLock         | t
 72626 | nppsmoketests_os_idx               | 30188 | RowExclusiveLock         | t
 72627 | nppsmoketests_owner_idx            | 30188 | RowExclusiveLock         | t
 72628 | nppsmoketests_regressioncl_idx     | 30188 | RowExclusiveLock         | t
 72629 | nppsmoketests_subtest_idx          | 30188 | RowExclusiveLock         | t
 72630 | nppsmoketests_suiteid_idx          | 30188 | RowExclusiveLock         | t
 72631 | nppsmoketests_suiteid_testname_idx | 30188 | RowExclusiveLock         | t
 72632 | nppsmoketests_testcl_idx           | 30188 | RowExclusiveLock         | t
 72633 | nppsmoketests_testname_idx         | 30188 | RowExclusiveLock         | t
 80749 | nppsmoketests_osversion_idx        | 30188 | RowExclusiveLock         | t
(18 rows)

When I strace PID 30188, I see tons of this scrolling past quickly, but I'm
not really sure what it means beyond a 'Timeout' not looking good:

select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 753901568, SEEK_SET) = 753901568
read(95, "\202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
lseek(95, 753917952, SEEK_SET) = 753917952
read(95, "\202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 768606208, SEEK_SET) = 768606208
read(95, "\204\1\0\0h!~\233\1\0\0\0\230\0\360\20\360\37\4 \0\0\0\0x\237\360\0\0\237\360\0"..., 8192) = 8192
lseek(95, 753934336, SEEK_SET) = 753934336
read(95, "\202\1\0\0 &\275\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
read(95, "\202\1\0\0\10\33\276\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
lseek(95, 753958912, SEEK_SET) = 753958912
read(95, "\202\1\0\0x\317\307\245\1\0\0\0\220\0\360\20\360\37\4 \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
lseek(95, 768614400, SEEK_SET) = 768614400

An old thread suggests that this is a stuck spinlock:
http://archives.postgresql.org/pgsql-performance/2009-05/msg00455.php

I'm using the defaults for all the *vacuum* options in postgresql.conf,
except for:
log_autovacuum_min_duration = 2500

At this point, I'm not sure what I can safely do to either terminate this
autovacuum process, or kick it into making progress again?

thanks
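[Editor's note: to set expectations for how long a vacuum like this might take, it can help to know how large the table and its many indexes actually are. A hedged sketch; these size functions exist in 9.0, and the table name is the one from this thread:]

```sql
-- Size of the heap alone, and of the heap plus all 17 indexes:
SELECT pg_size_pretty(pg_relation_size('public.nppsmoketests'))       AS heap,
       pg_size_pretty(pg_total_relation_size('public.nppsmoketests')) AS with_indexes;
```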
Lonni J Friedman <netllama@gmail.com> writes:
> When I strace PID 30188, I see tons of this scrolling past quickly,
> but I'm not really sure what it means beyond a 'Timeout' not looking
> good:
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
> lseek(95, 753901568, SEEK_SET) = 753901568
> read(95, "\202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4
> \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
> lseek(95, 753917952, SEEK_SET) = 753917952
> read(95, "\202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4
> \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)

I'm betting the selects are implementing vacuum_cost_delay, and that
the reason this is taking forever is that you have that cranked up
to an unreasonably high value.  There's no evidence of looping in
what you showed us, because the seek addresses are changing.

			regards, tom lane
On Tue, Nov 22, 2011 at 6:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> When I strace PID 30188, I see tons of this scrolling past quickly,
>> but I'm not really sure what it means beyond a 'Timeout' not looking
>> good:
>> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
>> lseek(95, 753901568, SEEK_SET) = 753901568
>> read(95, "\202\1\0\0\260\315\250\245\1\0\0\0\220\0\360\20\360\37\4
>> \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
>> lseek(95, 753917952, SEEK_SET) = 753917952
>> read(95, "\202\1\0\0 N\253\245\1\0\0\0\220\0\360\20\360\37\4
>> \0\0\0\0p\237\0\1\360\236\0\1"..., 8192) = 8192
>> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
>
> I'm betting the selects are implementing vacuum_cost_delay, and that
> the reason this is taking forever is that you have that cranked up
> to an unreasonably high value.  There's no evidence of looping in
> what you showed us, because the seek addresses are changing.

Thanks for your prompt reply.  I was pretty sure that I was using the
default, but just to confirm, I ran 'SHOW vacuum_cost_delay;' and it
returned 0.

Any other suggestions?  Is there no way to safely kill off this
autovacuum process that doesn't involve shutting down & restarting the
database?

thanks
Lonni J Friedman <netllama@gmail.com> writes:
> Thanks for your prompt reply.  I was pretty sure that I was using the
> default, but just to confirm, I just ran:
> 'SHOW vacuum_cost_delay;'

What about autovacuum_vacuum_cost_delay?  The selects seem to be
delaying for 32msec, which is not the default for anything.

> Is there no way to safely kill off this autovacuum process that
> doesn't involve shutting down & restarting the database?

Sending it a SIGINT ought to be safe enough, though I don't think that
is necessarily advisable, because the next autovacuum will probably take
just as long.  Killing this one will mainly mean you've wasted (much of)
the work it did so far.

Before getting hasty I'd suggest identifying what table (or index) it's
working on --- lsof on the process to see what FD 95 is connected to
would be the first step.  I'm thinking it's an index since the seek
addresses don't seem to be consecutive.  And it might be worth watching
the seek addresses for awhile to see if you can prove that it's looping
--- if it is, that might be an indication of a corrupt index.  If it
isn't, but is just steadily working through the index, you'd be best
advised to have patience.

			regards, tom lane
Hi,

On 23 November 2011 13:20, Lonni J Friedman <netllama@gmail.com> wrote:
> I investigated, and found that for the past ~18 hours,
> there's one autovacuum process that has been running, and not making
> any obvious progress:

snip...

> I'm using the defaults for all the *vacuum* options in
> postgresql.conf, except for:
> log_autovacuum_min_duration = 2500

The defaults are:

autovacuum_vacuum_cost_delay = 20ms
autovacuum_vacuum_cost_limit = -1   (i.e. vacuum_cost_limit is used)
vacuum_cost_limit = 200

If the table is busy -- many updates and deletes -- then autovacuum
exhausts its cost limit almost immediately and sleeps.  You can try
setting autovacuum_vacuum_cost_delay to 0 (which disables cost-based
throttling for autovacuum), but then you risk saturating your disks.
The other option is to increase
vacuum_cost_limit/autovacuum_vacuum_cost_limit.

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)
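[Editor's note: the settings Ondrej describes can be inspected, and overridden per table rather than globally. A hedged sketch; 9.0 supports autovacuum storage parameters, and the table name is the one from this thread:]

```sql
-- Current effective values:
SHOW autovacuum_vacuum_cost_delay;   -- default: 20ms
SHOW vacuum_cost_limit;              -- default: 200

-- Disable the cost-based sleep for just this busy table, leaving
-- throttling in place for every other autovacuum worker:
ALTER TABLE public.nppsmoketests
    SET (autovacuum_vacuum_cost_delay = 0);
```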
On Tue, Nov 22, 2011 at 7:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> Thanks for your prompt reply.  I was pretty sure that I was using the
>> default, but just to confirm, I just ran:
>> 'SHOW vacuum_cost_delay;'
>
> What about autovacuum_vacuum_cost_delay?  The selects seem to be
> delaying for 32msec, which is not the default for anything.

20ms is what it returns.  I grepped for '32' in postgresql.conf and,
other than the '5432' port, nothing was returned.  I'm using the official
postgres RPMs that shipped with Fedora15-x86_64, so unless something
weird got compiled in, I have no idea where that 32ms is coming from.

>> Is there no way to safely kill off this autovacuum process that
>> doesn't involve shutting down & restarting the database?
>
> Sending it a SIGINT ought to be safe enough, though I don't think that
> is necessarily advisable, because the next autovacuum will probably take
> just as long.  Killing this one will mainly mean you've wasted (much of)
> the work it did so far.

ok, then I guess I'll wait longer.

> Before getting hasty I'd suggest identifying what table (or index) it's
> working on --- lsof on the process to see what FD 95 is connected to
> would be the first step.  I'm thinking it's an index since the seek
> addresses don't seem to be consecutive.  And it might be worth watching
> the seek addresses for awhile to see if you can prove that it's looping
> --- if it is, that might be an indication of a corrupt index.  If it
> isn't, but is just steadily working through the index, you'd be best
> advised to have patience.

I suspect you're right.  I just ran strace against that PID again, and
now all the lseek & read FDs are referring to a different number (115),
so does that mean it's moved onto something new since I looked a few
hours ago?

Anyway, I think this is what you were referring to:
/proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10

How do I correlate that file to an actual database object?

thanks
Lonni J Friedman <netllama@gmail.com> writes:
> I suspect you're right.  I just ran strace against that PID again, and
> now all the lseek & read FDs are referring to a different number
> (115), so that means it's moved onto something new since I looked a few
> hours ago?

> Anyway, I think this is what you were referring to:
> /proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10

> How do I correlate that file to an actual database object?

64793 is the pg_database.oid of the database, and 72633 is the
pg_class.relfilenode value of the table/index.

			regards, tom lane
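[Editor's note: Tom's mapping comes down to two catalog lookups. A hedged sketch; the OIDs are the ones from this thread, and the second query must be run while connected to the database the first one names:]

```sql
-- Which database owns base/64793/... ?
SELECT datname FROM pg_database WHERE oid = 64793;

-- Which relation is file 72633?  (The ".10" suffix simply means the
-- 11th 1GB segment of the same relation, so a file over 10GB in.)
SELECT relname, relkind FROM pg_class WHERE relfilenode = 72633;
```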
On Tue, Nov 22, 2011 at 7:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> I suspect you're right.  I just ran strace against that PID again, and
>> now all the lseek & read FDs are referring to a different number
>> (115), so that means it's moved onto something new since I looked a few
>> hours ago?
>
>> Anyway, I think this is what you were referring to:
>> /proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10
>
>> How do I correlate that file to an actual database object?
>
> 64793 is the pg_database.oid of the database, and 72633 is the
> pg_class.relfilenode value of the table/index.

It's definitely an index.  Thanks for your help; I just need to be
patient now that I understand how to better monitor this.
On Tue, Nov 22, 2011 at 11:00 PM, Lonni J Friedman <netllama@gmail.com> wrote:
> On Tue, Nov 22, 2011 at 7:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Lonni J Friedman <netllama@gmail.com> writes:
>>> Anyway, I think this is what you were referring to:
>>> /proc/30188/fd/115 -> /var/lib/pgsql/data/base/64793/72633.10
>>
>>> How do I correlate that file to an actual database object?
>>
>> 64793 is the pg_database.oid of the database, and 72633 is the
>> pg_class.relfilenode value of the table/index.
>
> It's definitely an index.  Thanks for your help; I just need to be
> patient now that I understand how to better monitor this.

Well, it sounds like you have things set up for both a cost limit and
a cost delay, which means that if you manually vacuumed the thing it
would probably go quicker, at the cost of more I/O -- but given the CPU
overhead, that's probably a trade worth making.  Personally I'd throw
out those vacuum cost settings entirely, as they cause more trouble
than they're worth (IMNSHO), and you'll likely see this again in the
future.

Robert Treat
conjecture: xzilla.net
consulting: omniti.com
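[Editor's note: Robert's alternative, sketched. A foreground VACUUM uses vacuum_cost_delay, which defaults to 0, so it runs unthrottled; VERBOSE reports progress table and index by index, which this thread otherwise lacked:]

```sql
-- Run in a regular session; expect heavier I/O while it works.
VACUUM VERBOSE public.nppsmoketests;
```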
On Tue, Nov 22, 2011 at 10:51 PM, Robert Treat <rob@xzilla.net> wrote:
> Well, it sounds like you have things set up for both a cost limit and
> a cost delay, which means if you manually vacuumed the thing, it would
> probably go quicker, at the cost of more i/o, but given the cpu
> overhead, probably a trade worth making.  Personally I'd throw out
> those vacuum cost settings entirely as they cause more trouble than
> they're worth (IMNSHO), and you'll likely see this again in the
> future.

I'd keep an eye on 'iostat -xd 10' output when playing with vacuum
settings.