On T, 2005-09-27 at 17:57 -0500, Jim C. Nasby wrote:
> On Tue, Sep 27, 2005 at 02:47:46PM -0400, Jan Wieck wrote:
> > On 9/24/2005 8:17 PM, Jim C. Nasby wrote:
> >
> > >Would it be difficult to vacuum as part of a dump? The reasoning behind
> > >this is that you have to read the table to do the dump anyway,
> >
> > I think aside from what's been said so far, it would be rather difficult
> > anyway. pg_dump relies on MVCC and requires to run in one transaction to
> > see a consistent snapshot while vacuum jiggles around with transactions
> > in some rather non-standard way.
>
> Is this true even if they were in different connections?
>
> My (vague) understanding of the vacuum process is that it first vacuums
> indexes, and then vacuums the heap.
actually (lazy) vacuum does this
1) scan heap, collect ctids of rows to remove
2) clean indexes
3) clean heap
> Since we don't dump indexes, there's
> nothing for backup to do while those are vacuumed, so my idea is:
>
> pg_dump:
> foreach (table)
> spawn vacuum
> wait for vacuum to hit heap
> start copy
> wait for analyze to finish
> next;
probably the first heap scan of vacuum would go faster than dump as it
does not have to write out anything, and the second scan ( nr 3 in above
list ) would be either faster or slower, as it has to lock each page and
rearrange tuples there.
so it would be very hard to synchronize vacuum with either of them.
--
Hannu Krosing <hannu@skype.net>