Re: Vacuum statistics

From Alena Rybakina
Subject Re: Vacuum statistics
Date
Msg-id 18169b68-5b10-40fd-9657-be04f2bd0161@postgrespro.ru
In reply to Re: Vacuum statistics  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On 02.06.2025 19:25, Alexander Korotkov wrote:
> On Tue, May 13, 2025 at 12:49 PM Alena Rybakina
> <a.rybakina@postgrespro.ru> wrote:
>> On 12.05.2025 08:30, Amit Kapila wrote:
>>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a.rybakina@postgrespro.ru> wrote:
>>>> I did a rebase and finished the part with storing statistics separately from the relation statistics - now it is
>>>> possible to disable the collection of statistics for relations using GUCs, and
>>>> this allows us to solve the problem with the memory consumed.
>>>>
>>> I think this patch is trying to collect data similar to what we do for
>>> pg_stat_statements for SQL statements. So, can't we follow a similar
>>> idea such that these additional statistics will be collected once some
>>> external module like pg_stat_statements is enabled? That module should
>>> be responsible for accumulating and resetting the data, so we won't
>>> have this memory consumption issue.
>> The idea is good, it will require one hook for the pgstat_report_vacuum
>> function, the extvac_stats_start and extvac_stats_end functions can be
>> run if the extension is loaded, so as not to add more hooks.
> +1
> Nice idea of a hook.  Given the volume of the patch, it might be a
> good idea to keep this as an extension.

Today, I finalized the vacuum statistics separation approach and 
refactored the vacuum statistics structures (patch 4).

I also reworked the table statistics to avoid mixing index statistics in 
parallel vacuum mode (patch 2).

The new approach excludes the indexes’ buffer usage and WAL statistics 
from the table’s statistics.
For timing, the time spent on indexes is subtracted from the table’s 
total vacuum time. If vacuuming is sequential, the individual index 
vacuum times are added up and their sum is subtracted. If vacuuming is 
parallel, the index vacuum time is measured once for all indexes and 
subtracted as a whole.

static void
accumulate_idxs_vacuum_statistics(LVRelState *vacrel, ExtVacReport *extVacIdxStats)
{
    if (!pgstat_track_vacuum_statistics)
        return;

    /* Add this index pass's extended stats to the running index totals */
    vacrel->extVacReportIdx.blk_read_time += extVacIdxStats->blk_read_time;
    vacrel->extVacReportIdx.blk_write_time += extVacIdxStats->blk_write_time;
    vacrel->extVacReportIdx.total_blks_dirtied += extVacIdxStats->total_blks_dirtied;
    vacrel->extVacReportIdx.total_blks_hit += extVacIdxStats->total_blks_hit;
    vacrel->extVacReportIdx.total_blks_read += extVacIdxStats->total_blks_read;
    vacrel->extVacReportIdx.total_blks_written += extVacIdxStats->total_blks_written;
    vacrel->extVacReportIdx.wal_bytes += extVacIdxStats->wal_bytes;
    vacrel->extVacReportIdx.wal_fpi += extVacIdxStats->wal_fpi;
    vacrel->extVacReportIdx.wal_records += extVacIdxStats->wal_records;
    vacrel->extVacReportIdx.delay_time += extVacIdxStats->delay_time;

    vacrel->extVacReportIdx.total_time += extVacIdxStats->total_time;
}

if (ParallelVacuumIsActive(vacrel))
{
    LVExtStatCounters counters;
    ExtVacReport extVacReport;

    memset(&extVacReport, 0, sizeof(ExtVacReport));

    extvac_stats_start(vacrel->rel, &counters);

    /* Outsource everything to parallel variant */
    parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
                                        vacrel->num_index_scans);

    extvac_stats_end(vacrel->rel, &counters, &extVacReport);
    accumulate_idxs_vacuum_statistics(vacrel, &extVacReport);
}

Currently, the database-level statistics are computed incorrectly; I'm 
investigating the issue.


In parallel, I'm starting work on the extension.

-- 
Regards,
Alena Rybakina
Postgres Professional
