Re: Autovacuum in the backend

From: Matthew T. O'Connor
Subject: Re: Autovacuum in the backend
Date:
Msg-id: 42B1AECF.9080005@zeut.net
In reply to: Re: Autovacuum in the backend  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Autovacuum in the backend  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:

>Alvaro Herrera <alvherre@surnet.cl> writes:
>>Now, I'm hearing people don't like using libpq.
>
>Yeah --- a libpq-based solution is not what I think of as integrated at
>all, because it cannot do anything that couldn't be done by the existing
>external autovacuum process.  About all you can buy there is having the
>postmaster spawn the autovacuum process, which is slightly more
>convenient to use but doesn't buy any real new functionality.

Yes, libpq has to go; I thought this was clear, but perhaps I didn't say 
it clearly enough.  Anyway, this was the stumbling block that prevented 
me from making more progress on autovacuum integration.


>>Some people say "keep it simple and have one process per cluster."  I
>>think they don't realize it's actually more complex, not the other way
>>around.
>
>A simple approach would be a persistent autovac background process for
>each database, but I don't think that's likely to be acceptable because
>of the amount of resources tied up (PGPROC slots, open files, etc).

Agreed, this seems ugly.

>One thing that might work is to have the postmaster spawn an autovac
>process every so often.  The first thing the autovac child does is pick
>up the current statistics dump file (which it can find without being
>connected to any particular database).  It looks through that to
>determine which database is most in need of work, then connects to that
>database and does some "reasonable" amount of work there, and finally
>quits.  Awhile later the postmaster spawns another autovac process that
>can connect to a different database and do work there.

I don't think you can use the dump to determine which database should be 
connected to next, since you don't really know what has happened since the 
last time you exited.  What was a priority 5 or 10 minutes ago might not 
be a priority now.
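
Just so we're talking about the same cycle, here is my reading of the 
proposal as a rough C sketch.  The stats file name, the dump reader, and 
the scoring below are placeholders I made up for illustration; they are 
not the real statistics collector API.

#include <stdio.h>

typedef struct DbStats
{
    const char *dbname;
    double      score;      /* whatever "most in need of work" means */
} DbStats;

/* Placeholder: parse the statistics dump file written by the collector. */
static int
read_stats_dump(const char *path, DbStats *stats, int max)
{
    /* Pretend the dump listed two databases; a real reader would parse the file. */
    (void) path;
    if (max < 2)
        return 0;
    stats[0] = (DbStats) { "db_one", 1000.0 };
    stats[1] = (DbStats) { "db_two", 250.0 };
    return 2;
}

/* Placeholder: connect and do a bounded amount of VACUUM/ANALYZE work. */
static void
do_some_work_in(const char *dbname)
{
    printf("would connect to \"%s\", do a reasonable amount of work, and quit\n",
           dbname);
}

int
main(void)
{
    DbStats stats[64];
    int     n = read_stats_dump("global/pgstat.stat", stats, 64);
    int     best = -1;

    /* Look through the stats and pick the database most in need of work. */
    for (int i = 0; i < n; i++)
        if (best < 0 || stats[i].score > stats[best].score)
            best = i;

    if (best >= 0)
        do_some_work_in(stats[best].dbname);

    return 0;
}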

>This design would mean that the autovac process could not have any
>long-term state of its own: any long-term state would have to be in
>either system catalogs or the statistics.  But I don't see that as
>a bad thing really --- exposing the state will be helpful from a
>debugging and administrative standpoint.

This is not a problem: my patch, which Alvaro has now taken over, 
already created a new system catalog for all autovac data, so autovac 
itself doesn't hold any persistent state of its own.
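
For anyone who hasn't looked at the patch, the per-table state I mean is 
roughly of this shape.  This is only an illustration; the actual column 
names and types in the patch may differ.

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for the backend's Oid type */

/* Roughly the kind of per-table autovacuum settings kept in the catalog. */
typedef struct AutovacSettings
{
    Oid      vacrelid;          /* table these settings apply to */
    bool     enabled;           /* is autovacuum enabled for this table? */
    int32_t  vac_base_thresh;   /* dead tuples before a VACUUM is considered */
    float    vac_scale_factor;  /* plus this fraction of reltuples */
    int32_t  anl_base_thresh;   /* changed tuples before an ANALYZE */
    float    anl_scale_factor;  /* plus this fraction of reltuples */
} AutovacSettings;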

The rough design I had in mind was:
1)  On startup, the postmaster spawns the master autovacuum process.
2)  The master autovacuum process spawns a backend to do the vacuuming 
work on a particular database.
3)  The master autovacuum process waits for that backend to exit, then 
spawns the next backend for the next database.
4)  Repeat this loop until all databases in the cluster have been 
checked, then sleep for a while and start over again (rough sketch below).
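
To make that concrete, here is a rough sketch of the master loop.  The 
function names, the fixed 60-second sleep, and the hard-coded database 
list are all made up for illustration; this is not postmaster code.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Placeholder: in reality the list would come from pg_database. */
static const char *databases[] = { "template1", "db_one", "db_two" };
#define NDATABASES (sizeof(databases) / sizeof(databases[0]))

/* Placeholder for the real per-database vacuuming work. */
static void
vacuum_one_database(const char *dbname)
{
    printf("child %d: vacuuming database \"%s\"\n", (int) getpid(), dbname);
}

int
main(void)
{
    for (;;)
    {
        /* Steps 2-3: one child per database, strictly one at a time. */
        for (size_t i = 0; i < NDATABASES; i++)
        {
            pid_t child = fork();

            if (child < 0)
            {
                perror("fork");
                exit(1);
            }
            if (child == 0)
            {
                /* child: do the work for this database, then exit */
                vacuum_one_database(databases[i]);
                _exit(0);
            }
            /* master: wait for this child before starting the next one */
            waitpid(child, NULL, 0);
        }

        /* Step 4: all databases checked; sleep for a while and start over. */
        sleep(60);
    }
}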

I'm not sure whether this is feasible, or whether this special master 
autovacuum process would be able to fork off (or request that the 
postmaster fork off) an autovacuum process for a particular database in 
the cluster.  Thoughts or comments?

Matthew


