Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
| From | Michael Paquier |
|---|---|
| Subject | Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) |
| Date | |
| Msg-id | Y555O1zrtHOshuRC@paquier.xyz |
| In reply to | Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue) |
| List | pgsql-bugs |
On Thu, Dec 15, 2022 at 01:56:30PM -0500, Tom Lane wrote:
> * To fix vacuumdb properly, it might be enough to get it to
> batch VACUUMs, say by naming up to 1000 tables per command
> instead of just one.  I'm not sure how that would interact
> with its parallelization logic, though.  It's not really
> solving the O(N^2) issue either, just pushing it further out.

I have been thinking about this part, and a hardcoded rule for the batches would be tricky. The list of relations returned by the scan of pg_class is ordered by relpages, so depending on the distribution of table sizes (a few large tables and many small ones, or an exponential size distribution), we may end up with more downsides than upsides in some cases, even with a linear rule based on the number of relations, or even if we distribute the relations across the slots in round-robin fashion, for example.

In order to control all that, rather than a hardcoded rule, could it be as simple as introducing an option like vacuumdb --batch=N, defaulting to 1, to let users control the number of relations grouped in a single command, with a round-robin distribution across the slots?
--
Michael
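[For readers of the archive: the proposal above can be sketched as follows. This is a minimal illustration in Python with hypothetical names, not vacuumdb's actual C code; it only shows how relations ordered by relpages could be dealt round robin across parallel slots and then grouped into multi-table VACUUM commands of size N.]

```python
# Hypothetical sketch of the proposed vacuumdb --batch=N behavior.
# Relations arrive ordered by relpages (largest first); each one is
# assigned to a slot round robin, and each slot then issues VACUUM
# commands naming up to `batch` relations at a time.

def build_vacuum_commands(relations, slots, batch=1):
    """Return, per slot, the list of VACUUM commands that slot would run."""
    # Round-robin distribution: relation i goes to slot i % slots.
    per_slot = [relations[i::slots] for i in range(slots)]
    commands = []
    for rels in per_slot:
        cmds = []
        # Group up to `batch` relations into a single VACUUM command;
        # multi-table VACUUM syntax exists since PostgreSQL 11.
        for start in range(0, len(rels), batch):
            group = rels[start:start + batch]
            cmds.append("VACUUM " + ", ".join(group) + ";")
        commands.append(cmds)
    return commands

# With batch=1 (the proposed default), this degenerates to today's
# one-table-per-command behavior; a larger batch amortizes the
# per-command overhead that makes many small tables slow to process.
print(build_vacuum_commands(["t1", "t2", "t3", "t4", "t5"], slots=2, batch=2))
```

Note that because the input is sorted by relpages, a naive contiguous split would put all the large tables in one batch; the round-robin dealing is what spreads sizes across slots.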