Re: another autovacuum scheduling thread
| From | Sami Imseih |
|---|---|
| Subject | Re: another autovacuum scheduling thread |
| Date | |
| Msg-id | CAA5RZ0sw+9rEaW9taNpRZWvuLYMjRa9iibneGfB2ftNSUHT0Ww@mail.gmail.com |
| In reply to | Re: another autovacuum scheduling thread (David Rowley <dgrowleyml@gmail.com>) |
| Replies | Re: another autovacuum scheduling thread |
| List | pgsql-hackers |
Thanks for the ideas on improving the test! I am still trying to see how useful this type of testing is, but I will share what I have done.

> I wonder if it would be more realistic to throttle the work simulation
> to a certain speed with pgbench -R rather than having it go flat out.

Good point.

> > If we logged the score, we could do the "unpatched" test with the
> > patched code, just with commenting out the
> > list_sort(tables_to_process, TableToProcessComparator); It'd then be
> > interesting to zero the log_auto*_min_duration settings and review the
> > order differences and how high the scores got. Would the average score
> > be higher or lower with patched version?

I agree. I attached a patch on top of v7 that implements a debug GUC to enable or disable sorting for testing purposes.

> I'm not yet sure how meaningful it is, but I tried adding the
> following to recheck_relation_needs_vacanalyze():
>
> elog(LOG, "Performing autovacuum of table \"%s\" with score = %f",
> get_rel_name(relid), score);

The same attached patch also implements this log.

I also spent more time working on the test script. I cleaned it up and combined it into a single script. I added a few things:

- Ability to run with or without the batch workload.
- OLTP tables are no longer the same size; they are created with different row counts using a minimum and maximum row count and a multiplier for scaling the next table.
- A background collector for pg_stat_all_tables on relevant tables, stored in relstats_monitor.log.
- Logs are saved after the run for further analysis, such as examining the scores.

Also attached is analysis for a run with 16 OLTP tables and 3 batch tables. It shows that with sorting enabled or disabled, the vacuum/analyze activity does not show any major differences. OLTP had very similar DML and autovacuum/autoanalyze activity.
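For readers who want a concrete picture of the table-sizing scheme mentioned in the list above (a minimum row count, a maximum row count, and a multiplier applied to each successive table), here is a minimal sketch in C. The function name `plan_table_rows` and the choice to clamp at the maximum are assumptions made for illustration; the attached test script may compute sizes differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill rows[0..ntables-1] with row counts that start at min_rows and are
 * scaled by "multiplier" for each following table.
 *
 * Assumption: once the scaled value exceeds max_rows, every remaining
 * table simply gets max_rows. This is a guess at the behavior, not the
 * script's confirmed logic.
 */
void
plan_table_rows(long min_rows, long max_rows, double multiplier,
                long *rows, size_t ntables)
{
    double      next = (double) min_rows;

    assert(rows != NULL);
    assert(min_rows > 0 && max_rows >= min_rows && multiplier >= 1.0);

    for (size_t i = 0; i < ntables; i++)
    {
        if (next > (double) max_rows)
        {
            /* Clamped: stop scaling so the value cannot grow without bound. */
            rows[i] = max_rows;
        }
        else
        {
            rows[i] = (long) next;
            next *= multiplier;
        }
    }
}
```

With a multiplier above 1, the small tables sit at the front of the array and the sizes level off at the maximum, which gives a mix of quick and slow autovacuum targets.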
A few points to highlight:

1/ In the sorted run, we had an equal number of autovacuums and autoanalyzes on the smaller OLTP tables, as if every eligible table needed both autovacuum and autoanalyze. The unsorted run was less consistent on the smaller tables. I observed this on several runs. I don't think it's a big deal, but it is interesting nonetheless.

2/ Batch tables in the sorted run had less autovacuum time (1,257,821 ms unsorted vs 962,794 ms sorted), but very similar autovacuum counts.

3/ OLTP tables, on the other hand, had more autovacuum time in the sorted run (3,590,964 ms unsorted vs 3,852,460 ms sorted), but I do not see much difference in autovacuum/autoanalyze counts.

Other tests I plan on running:

- Batch updates/deletes, since the current batch option only tests append-only tables.
- An OLTP-only test.

Also, I am thinking about another sorting strategy based on average autovacuum/autoanalyze time per table. The idea is to sort ascending by the greater of the two averages, so workers process quicker tables first instead of all workers potentially getting hung on the slowest tables. We can calculate the average now that v18 includes total_autovacuum_time and total_autoanalyze_time.

The way I see it, regardless of prioritization, a few large tables may still monopolize autovacuum workers. But at least this way, the quick tables get a chance to be processed first.

Would this be an idea worth testing out?

--
Sami Imseih
Amazon Web Services (AWS)
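The sorting strategy proposed in the message (ascending by the greater of the average autovacuum time and the average autoanalyze time) can be sketched as a standalone C comparator. This is only an illustration and not code from any posted patch: the `TableTimes` struct and the function names are hypothetical, the fields merely mirror the pg_stat_all_tables columns of the same names, and treating a never-processed table as having an average of zero is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-table stats; fields mirror pg_stat_all_tables columns. */
typedef struct TableTimes
{
    const char *relname;
    double      total_autovacuum_time;  /* milliseconds */
    long        autovacuum_count;
    double      total_autoanalyze_time; /* milliseconds */
    long        autoanalyze_count;
} TableTimes;

/* Average time per run; assumed to be 0 when the table was never processed. */
double
avg_time(double total_ms, long count)
{
    return (count > 0) ? total_ms / (double) count : 0.0;
}

/* Sort key: the greater of the two per-run averages. */
double
sort_key(const TableTimes *t)
{
    double      vac = avg_time(t->total_autovacuum_time, t->autovacuum_count);
    double      ana = avg_time(t->total_autoanalyze_time, t->autoanalyze_count);

    return (vac > ana) ? vac : ana;
}

/* qsort comparator: ascending by sort key, so quicker tables come first. */
int
compare_by_avg_time(const void *lhs, const void *rhs)
{
    double      l = sort_key((const TableTimes *) lhs);
    double      r = sort_key((const TableTimes *) rhs);

    return (l > r) - (l < r);
}

/* Reorder the array in place, quickest tables first. */
void
sort_quickest_first(TableTimes *tables, size_t ntables)
{
    assert(tables != NULL || ntables == 0);
    if (ntables > 1)
        qsort(tables, ntables, sizeof(TableTimes), compare_by_avg_time);
}
```

One consequence of the zero-average assumption is that tables with no autovacuum history sort to the front. Whether that is desirable for a large, newly created table is an open question that the sketch does not try to answer.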