Discussion: explanation for seeks in VACUUM
"bigtable" has about 60M records, about 2M of which are dead at the time of VACUUM. Shared_buffers are about 1GB, and the machine has 4GB of memory. If I run a "SELECT COUNT(*) FROM bigtable", and I ktrace that (FreeBSD) for 10 seconds, I see only a handful of lseek calls (33), which is no surprise since I am asking for sequential I/O. I assume those lseeks are just to skip over pages that already happen to be in shared_buffers. However, If I have several indexes on that table, and I run a VACUUM, I observe a lot of seeking. In a 10 second interval, I saw about 5000 lseek calls in the ktrace to the same file descriptor (which is an index). That's about one every 2ms, so I'm sure a large portion of the file must have been in the OS buffer cache. I just don't quite understand what's causing the lseeks. My understanding is that vacuum uses maintenance_work_mem to hold the list of dead tuples. In my case that's 2M row versions, times about 6 bytes per entry (in the list of dead tuples) equals about 12MB, which is much less than 128MB maintenance_work_mem. So it doesn't appear that maintenance_work_mem is too small. Even if maintenance_work_mem was the limiting factor, wouldn't the VACUUM still be operating mostly sequentially, even if it takes multiple passes? The only seeking that it seems like VACUUM would need to do in an index file is when an index page completely empties out, but that wouldn't account for 5000 lseeks in 10 seconds, would it? Where am I going wrong? Are many of these lseeks no-ops or something? Regards, Jeff Davis
On Fri, 2007-12-14 at 11:29 -0800, Jeff Davis wrote:
> "bigtable" has about 60M records, about 2M of which are dead at the time
> of VACUUM. Shared_buffers are about 1GB, and the machine has 4GB of
> memory.

Forgot to mention: version 8.2.4

Regards,
	Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > Where am I going wrong? Are many of these lseeks no-ops or something? They're not supposed to be, but if you only tracked seeks and not reads or writes, it's hard to be sure what's going on. 8.2's VACUUM should process a btree index (this is a btree index no?) in physical order, so I'd expect lseeks only when a page is already in buffers --- at least on the read side. On the write side things might be a great deal less predictable. You're cleaning out about one tuple in 30, so the odds are that nearly every index page is getting dirtied, and they're going to need to be written sometime. regards, tom lane
On Fri, 2007-12-14 at 19:04 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > Where am I going wrong? Are many of these lseeks no-ops or something?
>
> They're not supposed to be, but if you only tracked seeks and not
> reads or writes, it's hard to be sure what's going on.

The seeks were comparable to reads and writes, which is what surprised me:

$ ktrace -p 42440; sleep 5; ktrace -C
$ kdump -f ktrace.out | grep "GIO fd 38 read" | wc -l
   11479
$ kdump -f ktrace.out | grep "GIO fd 38 wrote" | wc -l
   11480
$ kdump -f ktrace.out | grep "lseek.0x26" | wc -l
   22960

> 8.2's VACUUM should process a btree index (this is a btree index, no?)
> in physical order, so I'd expect lseeks only when a page is already in
> buffers --- at least on the read side. On the write side things might
> be a great deal less predictable. You're cleaning out about one tuple
> in 30, so the odds are that nearly every index page is getting dirtied,
> and they're going to need to be written sometime.

Actually, the table I was VACUUMing had both btree and GIN indexes, and
I'm not entirely sure which one was being processed at the time. I will
investigate more.

My intuition tells me that, if the pages are being read and dirtied
sequentially, VACUUM should be able to write mostly sequentially as
well (at least in theory). On the box that originally showed the
problem (which had more constrained memory than my reproduced case), it
seemed to be swamped by random I/O: low CPU usage and low disk
throughput (about 1-2 MB/s of writes), yet VACUUM would take forever
and the box would appear very sluggish.

Regards,
	Jeff Davis
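One access pattern that would produce exactly these counts (lseeks
equal to reads plus writes) is reads and writes alternating through the
same file descriptor at different offsets: both share a single kernel
file position, and PostgreSQL of that era issued lseek + read/write
rather than pread/pwrite, so every I/O repositions first. A toy model
of a vacuum-style loop that reads ahead while writing dirtied pages
back behind it, with a made-up write-back lag (this is an illustration
of the syscall pattern, not the server's actual code path):

#include <unistd.h>

#define BLCKSZ 8192

/* Toy model of the syscall pattern in the ktrace above: reads march
 * forward while writes trail behind through the same descriptor, so
 * each I/O repositions first and lseeks == reads + writes.
 * The write-back lag of 16 blocks is made up for illustration. */
static void interleaved_io(int fd, unsigned nblocks, char *buf)
{
    const unsigned lag = 16;

    for (unsigned blkno = 0; blkno < nblocks; blkno++)
    {
        /* read the next page in physical order */
        lseek(fd, (off_t) blkno * BLCKSZ, SEEK_SET);
        if (read(fd, buf, BLCKSZ) != BLCKSZ)
            return;

        /* ... prune dead index entries, dirtying the page ... */

        if (blkno >= lag)
        {
            /* write back an earlier dirtied page: the offset jumps
             * backward, so the next read needs its own lseek too */
            lseek(fd, (off_t) (blkno - lag) * BLCKSZ, SEEK_SET);
            (void) write(fd, buf, BLCKSZ);
        }
    }
}

The lseek syscall itself is close to free, so if the constrained box
was genuinely swamped, the likelier culprit is the non-sequential
write-back this pattern implies rather than the seek overhead, which
would fit the low-throughput, sluggish behavior described above.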