Andres,
> all visible is only set in vacuum and it determines which parts of a
> table will be scanned in a non full table vacuum. So, since we won't
> regularly start vacuum in the insert only case there will still be a
> batch of work at once. But nearly all of that work is *already*
> performed. We would just what the details of that around for a
> bit. *But* since we now would only need to vacuum the non all-visible
> part that would get noticeably cheaper as well.
Yeah, I can see that. Seems worthwhile, then.
> I think for that case we should run vacuum more regularly for insert
> only tables since we currently don't do so regularly enough, which a) increases
> the amount of work needed at once and b) prevents index only scans from
> working there.
Yes. I'm not sure how we would set this though; I think it's another
example of how autovacuum's parameters for when to vacuum etc. are too
simple-minded for the real world. Doing an all-visible scan on an
insert-only table, for example, should be based on XID age and not on %
inserted, no?
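To make the distinction concrete, here is a rough Python sketch, not a proposal for actual code. The first function follows the documented dead-tuple rule (autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples, with the default values 50 and 0.2). The second function, its visibility_max_age cutoff, and the xids_since_vacuum field are hypothetical names invented for illustration; no such setting exists.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    reltuples: float        # estimated number of rows in the table
    dead_tuples: int        # rows updated/deleted since the last vacuum
    xids_since_vacuum: int  # hypothetical: XIDs consumed since the last vacuum


def dead_tuple_trigger(t: TableStats,
                       base_threshold: int = 50,
                       scale_factor: float = 0.2) -> bool:
    """Dead-tuple rule: vacuum once dead tuples exceed threshold + scale * reltuples."""
    return t.dead_tuples > base_threshold + scale_factor * t.reltuples


def xid_age_trigger(t: TableStats,
                    visibility_max_age: int = 10_000_000) -> bool:
    """Hypothetical rule: vacuum once enough XIDs have passed, regardless of dead tuples.

    visibility_max_age is an assumed cutoff chosen only for this example.
    """
    return t.xids_since_vacuum > visibility_max_age


# An insert-only table: many rows, no dead tuples, lots of XIDs consumed.
insert_only = TableStats(reltuples=1_000_000, dead_tuples=0,
                         xids_since_vacuum=25_000_000)

print(dead_tuple_trigger(insert_only))  # False: nothing has been updated or deleted
print(xid_age_trigger(insert_only))     # True: the age cutoff has been passed
```

The point of the sketch is only that a rule keyed on dead tuples never fires for a table with no updates or deletes, while an age-based rule would fire without any dead tuples at all.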
Speaking of which, I need to get on revamping the math for autoanalyze.
Mind you, in the real-world insert-only table case, this does create
extra IO -- real insert-only tables often have a few rows (< 5%) which
are updated/deleted. Vacuum would see these and want to clean the pages
up, which would create much more substantial IO. It might still be a
good tradeoff, but we should be aware of it.
Unless we want a special VACUUM ALL VISIBLE mode. I vote no, unless we
demonstrate some really convincing case for it.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com