Discussion: autovacuum: multiple workers
Hi,

This is the patch to put multiple workers into autovacuum. This patch
applies after the recheck patch I just posted.

The main change is to have an array of Worker structs in shared memory;
each worker checks the current table of all other Workers, and skips a
table that's being vacuumed by any of them. It also rechecks the table
before vacuuming, which removes the problem of redundant vacuuming.

It also introduces the business of SIGUSR1 between workers and launcher.
The launcher keeps a database list in memory and schedules workers to
vacuum databases depending on that list. The actual database selected
may differ from what was in the schedule; in that case, the list is
reconstructed.

There are two main FIXMEs in this code:

1. have the list reconstruction and scheduling be smarter so that
   databases are not ganged together in the schedule. The only
   difficulty is keeping the sort order that the databases had.

2. have a way to clean up after failed workers filling up the Worker
   array and thus starving other databases from vacuuming. I don't
   really know a way to do this that works in all cases. The only idea
   I have so far is that workers that started more than
   autovacuum_naptime seconds ago are considered failed to start.

Neither of these is really minor, but I think they are solvable.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
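For readers following along without the patch, here is a minimal sketch of the two ideas in the message above: skipping a table that another worker is already vacuuming, and treating a worker that was launched more than autovacuum_naptime seconds ago without starting as failed. It is not the patch's code. The struct layout, the `attached` flag, the database OID comparison, and the names `WorkerSlot`, `table_claimed_by_other` and `reclaim_stale_slots` are all assumptions made for illustration, and the locking that a real shared-memory array would need is left out.

```c
#include <stdbool.h>
#include <time.h>

#define MAX_WORKERS 4      /* hypothetical fixed number of slots */
#define INVALID_OID 0u

typedef unsigned int Oid;

/* Hypothetical per-worker slot; the real Worker struct may differ. */
typedef struct WorkerSlot
{
    bool    in_use;        /* slot handed out by the launcher */
    bool    attached;      /* worker process has actually started (assumed flag) */
    Oid     dboid;         /* database the worker is connected to */
    Oid     cur_table;     /* table being vacuumed, or INVALID_OID */
    time_t  launch_time;   /* when the launcher requested this worker */
} WorkerSlot;

/*
 * Return true if any worker other than 'self' is currently vacuuming
 * table 'relid' in database 'dboid'.  A real implementation would hold
 * a lock on the shared array while scanning it.
 */
static bool
table_claimed_by_other(const WorkerSlot *slots, int nslots,
                       int self, Oid dboid, Oid relid)
{
    for (int i = 0; i < nslots; i++)
    {
        if (i == self || !slots[i].in_use)
            continue;
        if (slots[i].dboid == dboid && slots[i].cur_table == relid)
            return true;
    }
    return false;
}

/*
 * Free every slot whose worker was requested more than 'naptime_secs'
 * seconds before 'now' and has still not attached.  Returns the number
 * of slots reclaimed.
 */
static int
reclaim_stale_slots(WorkerSlot *slots, int nslots,
                    time_t now, int naptime_secs)
{
    int reclaimed = 0;

    for (int i = 0; i < nslots; i++)
    {
        if (slots[i].in_use && !slots[i].attached &&
            difftime(now, slots[i].launch_time) > (double) naptime_secs)
        {
            slots[i].in_use = false;
            slots[i].cur_table = INVALID_OID;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

A worker would call something like `table_claimed_by_other()` just before the recheck and skip the table on a true result. The timeout heuristic is only as good as the message says it is: it cannot tell a worker that failed to start from one that is merely slow to start.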
On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:

> The main change is to have an array of Worker structs in shared memory;
> each worker checks the current table of all other Workers, and skips a
> table that's being vacuumed by any of them. It also rechecks the table
> before vacuuming, which removes the problem of redundant vacuuming.

Slightly OT: Personally, I'd like it if we added an array for all
special backends, with configurable behaviour. That way it would be
easier to have multiple copies of other backends of any flavour using
the same code, as well as adding others without cutting and pasting each
time. That part of the postmaster code has oozed sideways in the past
few years and seems in need of some love. (A former sinner repents).

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Simon Riggs wrote:
> On Tue, 2007-03-27 at 17:41 -0400, Alvaro Herrera wrote:
>
> > The main change is to have an array of Worker structs in shared memory;
> > each worker checks the current table of all other Workers, and skips a
> > table that's being vacuumed by any of them. It also rechecks the table
> > before vacuuming, which removes the problem of redundant vacuuming.
>
> Slightly OT: Personally, I'd like it if we added an array for all
> special backends, with configurable behaviour. That way it would be
> easier to have multiple copies of other backends of any flavour using
> the same code, as well as adding others without cutting and pasting each
> time. That part of the postmaster code has oozed sideways in the past
> few years and seems in need of some love. (A former sinner repents).

I'm not really thrilled about it, each case being so different from the
others. For the autovac workers, for example, the array in shared
memory is kept on the autovac launcher, _not_ in the postmaster. In the
postmaster, they are kept in the regular BackendList array, so they
don't fit on that array you describe.

And as far as the other processes are concerned, every one of them is a
special case, and we don't add new ones frequently. In fact, the
autovac work is the only thing that has added new processes in a long
time, since the Windows port was introduced (which required the logger
process) and the bgwriter.

How would you make it "configurable"? Have a struct containing function
pointers, each function being called when some event takes place?

What other auxiliary processes are you envisioning, anyway?

In any case I don't think this is something that would be good to attack
this late in the devel cycle -- we could discuss it for 8.4 though.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
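To make the question in the message above concrete, this is roughly what "a struct containing function pointers, each function being called when some event takes place" could look like. It is purely a hypothetical sketch for the sake of discussion. Nothing like it exists in the postmaster, and every name here (`AuxEvent`, `AuxProcessType`, `aux_dispatch`, `count_event`) and the choice of events is invented for illustration.

```c
#include <stddef.h>

/* Hypothetical lifecycle events a special backend might react to. */
typedef enum AuxEvent
{
    AUX_EVENT_START,
    AUX_EVENT_SIGNAL,
    AUX_EVENT_EXIT,
    AUX_EVENT_COUNT        /* number of events, not an event itself */
} AuxEvent;

typedef void (*AuxHandler) (void *state);

/* One entry per kind of special backend (all fields are assumptions). */
typedef struct AuxProcessType
{
    const char *name;                       /* e.g. "autovacuum worker" */
    int         max_copies;                 /* how many may run at once */
    AuxHandler  handlers[AUX_EVENT_COUNT];  /* NULL means "ignore event" */
} AuxProcessType;

/*
 * Call the handler registered for 'ev', if any.
 * Returns 1 if a handler ran, 0 if the event was ignored.
 */
static int
aux_dispatch(const AuxProcessType *type, AuxEvent ev, void *state)
{
    int idx = (int) ev;

    if (idx < 0 || idx >= AUX_EVENT_COUNT || type->handlers[idx] == NULL)
        return 0;
    type->handlers[idx](state);
    return 1;
}

/* Example handler: bump a counter passed in through 'state'. */
static void
count_event(void *state)
{
    ++*(int *) state;
}
```

A table of such entries would give the "configurable behaviour" asked about. It does not answer the objection raised in the same message, namely that each existing process is a special case with its own bookkeeping.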
On Wed, 2007-03-28 at 09:39 -0400, Alvaro Herrera wrote:

> What other auxiliary processes are you envisioning, anyway?

WAL Writer, multiple bgwriters, checkpoint process, parallel query and
sort slaves....plus all the ones I haven't dreamed of yet. No need to
agree with my short list, but we do seem to keep adding them on a
regular basis....

> In any case I don't think this is something that would be good to attack
> this late in the devel cycle -- we could discuss it for 8.4 though.

OK

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com