Thread: autovacuum: multiple workers

autovacuum: multiple workers

From
Alvaro Herrera
Date:
Hi,

This is the patch to put multiple workers into autovacuum.  This patch
applies after the recheck patch I just posted.

The main change is to have an array of Worker structs in shared memory;
each worker checks the current table of all other Workers, and skips a
table that's being vacuumed by any of them.  It also rechecks the table
before vacuuming, which removes the problem of redundant vacuuming.
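
As a rough illustration (not taken from the patch), the skip check described above could look something like this in C. The names `Worker`, `workers`, `table_claimed_by_other` and `try_claim_table`, the fixed array size, and the per-database comparison are all assumptions made for the sketch; a plain global array stands in for shared memory, and the locking a real implementation would need is omitted. The recheck of whether the table still needs vacuuming is not shown.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS   3      /* hypothetical size of the Worker array */
#define INVALID_TABLE 0u     /* hypothetical "no table" marker */

typedef unsigned int TableId;    /* stand-in for a table OID */

typedef struct Worker {
    bool         in_use;         /* slot belongs to a running worker */
    unsigned int db;             /* database the worker is connected to */
    TableId      current_table;  /* table being vacuumed, or INVALID_TABLE */
} Worker;

/* Stand-in for the array kept in shared memory (no locking here). */
Worker workers[MAX_WORKERS];

/* True if some worker other than `self` is already vacuuming `table`
 * in database `db`. */
bool table_claimed_by_other(int self, unsigned int db, TableId table)
{
    for (int i = 0; i < MAX_WORKERS; i++) {
        if (i == self || !workers[i].in_use)
            continue;
        if (workers[i].db == db && workers[i].current_table == table)
            return true;
    }
    return false;
}

/* Record `table` as the current table of worker `self`, unless another
 * worker already has it. Returns false when the table must be skipped. */
bool try_claim_table(int self, TableId table)
{
    if (table_claimed_by_other(self, workers[self].db, table))
        return false;
    workers[self].current_table = table;
    return true;
}
```

A worker would call `try_claim_table()` for each candidate table and move on to the next one whenever it returns false.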

It also introduces the business of SIGUSR1 between workers and launcher.
The launcher keeps a database list in memory and schedules workers to
vacuum databases depending on that list.  The actual database selected
may differ from what was in the schedule; in that case, the list is
reconstructed.
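
A minimal sketch of that bookkeeping, under stated assumptions: `SchedEntry`, `rebuild_schedule` and `schedule_is_stale` are hypothetical names, and spreading the start times evenly over the naptime is just one possible rebuild policy, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int DbId;       /* stand-in for a database OID */

typedef struct SchedEntry {
    DbId db;                     /* database to vacuum */
    long next_start;             /* time (seconds) to launch a worker for it */
} SchedEntry;

/* Rebuild the schedule from dbs[], keeping its order and spacing the
 * start times evenly across one naptime interval (assumed policy). */
void rebuild_schedule(SchedEntry *sched, const DbId *dbs, size_t n,
                      long now, long naptime)
{
    for (size_t i = 0; i < n; i++) {
        sched[i].db = dbs[i];
        sched[i].next_start = now + (long)(i + 1) * naptime / (long)n;
    }
}

/* True if the database a worker actually picked is not the one at the
 * head of the schedule, meaning the list should be reconstructed. */
bool schedule_is_stale(const SchedEntry *sched, size_t n, DbId actual)
{
    return n == 0 || sched[0].db != actual;
}
```

The launcher would call `schedule_is_stale()` when a worker reports its chosen database, and `rebuild_schedule()` whenever that returns true.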

There are two main FIXMEs in this code:

1. have the list reconstruction and scheduling be smarter so that
databases are not ganged together in the schedule.  The only difficulty
is keeping the sort order that the databases had.

2. have a way to clean up after failed workers filling up the Worker
array and thus starving other databases from vacuuming.  I don't really
know a way to do this that works in all cases.  The only idea I have so
far is that workers that started more than autovacuum_naptime seconds
ago are considered failed to start.
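
The timeout idea in item 2 might be sketched as follows. This is only an illustration: `WorkerSlot`, its `started` flag (set once a worker has reported in) and `reclaim_failed_workers` are hypothetical, and the code does not address the cases where such a timeout would misfire.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct WorkerSlot {
    bool in_use;       /* slot was handed to a worker */
    bool started;      /* worker reported successful startup (assumed flag) */
    long launch_time;  /* time (seconds) the worker was requested */
} WorkerSlot;

/* Free every slot whose worker was requested more than `naptime` seconds
 * ago and never reported in. Returns the number of slots reclaimed. */
size_t reclaim_failed_workers(WorkerSlot *slots, size_t n,
                              long now, long naptime)
{
    size_t reclaimed = 0;

    for (size_t i = 0; i < n; i++) {
        if (slots[i].in_use && !slots[i].started &&
            now - slots[i].launch_time > naptime) {
            slots[i].in_use = false;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Here `naptime` stands for the value of autovacuum_naptime in seconds; the launcher could run this check before deciding that the Worker array is full.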


Neither of these is really minor, but I think they are solvable.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum: multiple workers

From
"Simon Riggs"
Date:
On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:

> The main change is to have an array of Worker structs in shared memory;
> each worker checks the current table of all other Workers, and skips a
> table that's being vacuumed by any of them.  It also rechecks the table
> before vacuuming, which removes the problem of redundant vacuuming.

Slightly OT: Personally, I'd like it if we added an array for all
special backends, with configurable behaviour. That way it would be
easier to have multiple copies of other backends of any flavour using
the same code, as well as adding others without cutting and pasting each
time. That part of the postmaster code has oozed sideways in the past
few years and seems in need of some love. (A former sinner repents).

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: autovacuum: multiple workers

From
Alvaro Herrera
Date:
Simon Riggs wrote:
> On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:
>
> > The main change is to have an array of Worker structs in shared memory;
> > each worker checks the current table of all other Workers, and skips a
> > table that's being vacuumed by any of them.  It also rechecks the table
> > before vacuuming, which removes the problem of redundant vacuuming.
>
> Slightly OT: Personally, I'd like it if we added an array for all
> special backends, with configurable behaviour. That way it would be
> easier to have multiple copies of other backends of any flavour using
> the same code, as well as adding others without cutting and pasting each
> time. That part of the postmaster code has oozed sideways in the past
> few years and seems in need of some love. (A former sinner repents).

I'm not really thrilled about it, each case being so different from the
others.  For the autovac workers, for example, the array in shared
memory is kept on the autovac launcher, _not_ in the postmaster.  In the
postmaster, they are kept in the regular BackendList array, so they
don't fit on that array you describe.  And as far as the other processes
are concerned, every one of them is a special case, and we don't add new
ones frequently.  In fact, the autovac work is the only thing that has
added new processes in a long time, since the Windows port was
introduced (which required the logger process) and the bgwriter.

How would you make it "configurable"?  Have a struct containing function
pointers, each function being called when some event takes place?
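
To make the question concrete, such a struct might look roughly like the sketch below. Everything in it is hypothetical (`SpecialBackendOps`, `BackendEvent`, `notify`, and the demo handlers); nothing like it exists in the postmaster code under discussion.

```c
#include <stddef.h>

typedef enum { EV_START, EV_EXIT } BackendEvent;

/* One entry per kind of special backend (hypothetical layout). */
typedef struct SpecialBackendOps {
    const char *name;                        /* label for logging */
    int         max_copies;                  /* how many may run at once */
    void      (*on_start)(int slot);         /* called after a copy starts */
    void      (*on_exit)(int slot, int status); /* called when a copy exits */
} SpecialBackendOps;

/* Call the handler registered for `ev`, if any.
 * Returns 1 if a handler ran, 0 otherwise. */
int notify(const SpecialBackendOps *ops, BackendEvent ev, int slot, int status)
{
    if (ev == EV_START && ops->on_start != NULL) {
        ops->on_start(slot);
        return 1;
    }
    if (ev == EV_EXIT && ops->on_exit != NULL) {
        ops->on_exit(slot, status);
        return 1;
    }
    return 0;
}

/* Example handlers that just count live copies. */
int demo_live = 0;
void demo_on_start(int slot) { (void) slot; demo_live++; }
void demo_on_exit(int slot, int status) { (void) slot; (void) status; demo_live--; }

const SpecialBackendOps demo_ops = { "demo worker", 2, demo_on_start, demo_on_exit };
```

The postmaster would then hold an array of these entries and call `notify()` at each event instead of special-casing every process type.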

What other auxiliary processes are you envisioning, anyway?

In any case I don't think this is something that would be good to attack
this late in the devel cycle -- we could discuss it for 8.4 though.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: autovacuum: multiple workers

From
"Simon Riggs"
Date:
On Wed, 2007-03-28 at 09:39 -0400, Alvaro Herrera wrote:
> What other auxiliary processes are you envisioning, anyway?

WAL Writer, multiple bgwriters, checkpoint process, parallel query and
sort slaves....plus all the ones I haven't dreamed of yet.

No need to agree with my short list, but we do seem to keep adding them
on a regular basis....

> In any case I don't think this is something that would be good to attack
> this late in the devel cycle -- we could discuss it for 8.4 though.

OK

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com