Vacuum statistics

From:
Alena Rybakina
Date:
Hello, everyone!

I think we don't have enough information to analyze how vacuum performs.

Needless to say, vacuum is the most important process for a database
system. It prevents problems like table and index bloat, as well as
emergency freezing when transaction ID wraparound approaches.
Furthermore, it keeps the visibility map up to date. On the other hand,
if autovacuum's aggressiveness settings are adjusted incorrectly, it can
consume a lot of computing resources and cause all queries in the system
to run longer.

Nowadays vacuum gathers statistical information about tables, but
statistics are important not only for the optimizer.

Because vacuum is an automated process, there are a lot of settings that
determine how aggressively it treats the objects of the database.
Besides, sometimes it is important to set a parameter correctly for a
particular table because of how dynamically that table changes.

A database administrator needs to tune autovacuum so as to balance the
useful work vacuum does in the database system on the one hand against
the overhead of its workload on the other. However, it is not enough to
judge vacuum behavior from statistics on the number of vacuum passes
over tables and from the operational data in progress_vacuum, because
the latter is available only while a vacuum operation is running and
does not provide a strategic overview over the period of interest.

To sum up, autovacuum behaves strategically: how often it runs and how
many resources it consumes depend on the workload of the database. Its
load on the database is minimal for an append-only table and maximal for
a table with high-frequency updates. Furthermore, vacuum load depends
heavily on the number and size of indexes. Because there is no
visibility map for indexes, vacuum scans each index completely, and the
worst case is when it has to do that for a bloated index on a small
table.


I suggest gathering information about vacuum resource consumption while
processing indexes and tables and storing it per table and per index
(for example, in a structure like PgStat_StatTabEntry, as is already
done for the regular statistics). This will allow us to determine how
well vacuum is configured and to evaluate the effect of its overhead on
the system at the strategic level. Vacuum already gathers this
information, but this valuable information is not stored anywhere.
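
As a rough illustration of the kind of per-relation counters I have in
mind (the structure and field names below are hypothetical, just a
sketch, not the actual patch):

typedef struct PgStat_VacuumCounters
{
    PgStat_Counter total_blks_read;     /* blocks read while vacuuming */
    PgStat_Counter total_blks_hit;      /* blocks found in shared buffers */
    PgStat_Counter total_blks_dirtied;  /* blocks dirtied by vacuum */
    PgStat_Counter tuples_removed;      /* dead tuples removed */
    PgStat_Counter pages_removed;       /* heap pages truncated away */
    PgStat_Counter wal_records;         /* WAL records generated by vacuum */
    double         total_time_ms;       /* total time spent in vacuum */
    double         delay_time_ms;       /* time spent in cost-based delays */
} PgStat_VacuumCounters;

Such a block could live next to the existing counters in
PgStat_StatTabEntry and be updated once per completed (auto)vacuum of
the relation.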

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Vacuum statistics

From:
Alena Rybakina
Date:
On 30.05.2024 10:33, Alena Rybakina wrote:
>
> I suggest gathering information about vacuum resource consumption while
> processing indexes and tables and storing it per table and per index
> (for example, in a structure like PgStat_StatTabEntry, as is already
> done for the regular statistics). This will allow us to determine how
> well vacuum is configured and to evaluate the effect of its overhead on
> the system at the strategic level. Vacuum already gathers this
> information, but this valuable information is not stored anywhere.
>
My colleagues and I have prepared a patch that can help to solve this 
problem.

We are open to feedback.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments

Re: Vacuum statistics

From:
Andrei Zubkov
Date:
Hi,

Thu, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> I suggest gathering information about vacuum resource consumption while
> processing indexes and tables and storing it per table and per index
> (for example, in a structure like PgStat_StatTabEntry, as is already
> done for the regular statistics). This will allow us to determine how
> well vacuum is configured and to evaluate the effect of its overhead on
> the system at the strategic level. Vacuum already gathers this
> information, but this valuable information is not stored anywhere.
>
This may seem a little unclear, so let me explain the point of the
proposal a bit.

Since the vacuum process is a backend, it already has workload
instrumentation. All the basic counters are available, such as the
number of blocks read, hit and written, time spent on I/O, WAL stats and
so on. Also, we can easily get some statistics specific to vacuum
activity, i.e. the number of tuples removed, the number of blocks
removed, the number of VM bits set and, of course, the most important
metric: time spent on the vacuum operation.

All those statistics should be stored by the Cumulative Statistics
System on a per-relation basis, i.e. as individual cumulative counters
for every table and every index in the database.

Such counters would give us a clear view of the vacuum workload on
individual objects of the database and a means to measure how effective
our vacuum fine-tuning has been.
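
As a rough sketch of the kind of delta accounting I mean (pgBufferUsage,
pgWalUsage and the *AccumDiff() helpers are the existing instrumentation
facilities; everything else here is hypothetical, just to show the idea):

#include "postgres.h"
#include "executor/instrument.h"
#include "utils/timestamp.h"

/* Illustrative only: capture instrumentation deltas around one vacuum. */
typedef struct VacWorkloadSnapshot
{
    BufferUsage bufusage_start;
    WalUsage    walusage_start;
    TimestampTz start_time;
} VacWorkloadSnapshot;

static void
vac_workload_start(VacWorkloadSnapshot *snap)
{
    snap->bufusage_start = pgBufferUsage;
    snap->walusage_start = pgWalUsage;
    snap->start_time = GetCurrentTimestamp();
}

static void
vac_workload_report(const VacWorkloadSnapshot *snap)
{
    BufferUsage bufusage;
    WalUsage    walusage;
    long        secs;
    int         usecs;

    /* compute what this vacuum added on top of the starting values */
    memset(&bufusage, 0, sizeof(bufusage));
    BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &snap->bufusage_start);
    memset(&walusage, 0, sizeof(walusage));
    WalUsageAccumDiff(&walusage, &pgWalUsage, &snap->walusage_start);
    TimestampDifference(snap->start_time, GetCurrentTimestamp(),
                        &secs, &usecs);

    /* a real patch would hand these deltas to the cumulative stats system */
    elog(DEBUG1, "vacuum: read %lld blks, %lld WAL records, %ld.%06d s",
         (long long) bufusage.shared_blks_read,
         (long long) walusage.wal_records, secs, usecs);
}

The point is only that all the raw material is already produced by the
backend; the patch's job is to accumulate these deltas per relation.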
--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Vacuum statistics

From:
Dilip Kumar
Date:
On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
<lena.ribackina@yandex.ru> wrote:
>
> On 30.05.2024 10:33, Alena Rybakina wrote:
> >
> > I suggest gathering information about vacuum resource consumption while
> > processing indexes and tables and storing it per table and per index
> > (for example, in a structure like PgStat_StatTabEntry, as is already
> > done for the regular statistics). This will allow us to determine how
> > well vacuum is configured and to evaluate the effect of its overhead on
> > the system at the strategic level. Vacuum already gathers this
> > information, but this valuable information is not stored anywhere.
> >
> My colleagues and I have prepared a patch that can help to solve this
> problem.
>
> We are open to feedback.

I was reading through the patch; here are some initial comments.

--
+typedef struct LVExtStatCounters
+{
+ TimestampTz time;
+ PGRUsage ru;
+ WalUsage walusage;
+ BufferUsage bufusage;
+ int64 VacuumPageMiss;
+ int64 VacuumPageHit;
+ int64 VacuumPageDirty;
+ double VacuumDelayTime;
+ PgStat_Counter blocks_fetched;
+ PgStat_Counter blocks_hit;
+} LVExtStatCounters;


I noticed that you are storing both pgBufferUsage and
VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
It seems they both exist in the system because some code, like
heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
relies on the old counters. And there is already a patch to remove
those old counters.


--
+static Datum
+pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
+{

I don't think you need this last parameter (ncolumns); we can fetch
that from the tupledesc anyway, so adding an additional parameter
just for checking doesn't look good to me.
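
Something along these lines, roughly (just a sketch of the idea, not
tested; get_call_result_type() needs funcapi.h):

/* inside pg_stats_vacuum(): derive the column count from the result type */
TupleDesc   tupdesc;

if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
    elog(ERROR, "return type must be a row type");

/* tupdesc->natts can now be used wherever ncolumns was used before */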

--
+ /* Tricky turn here: enforce pgstat to think that our database us dbid */
+
+ MyDatabaseId = dbid;

typo
/think that our database us dbid/think that our database has dbid

Also, remove the blank line between the comment and the next code
block that is related to that comment.


--
  VacuumPageDirty = 0;
+ VacuumDelayTime = 0.;

There is an extra "." after 0


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Vacuum statistics

From:
Alena Rybakina
Date:

Hi! Thank you for your interest in this topic!

On 07.06.2024 09:46, Dilip Kumar wrote:
> On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
> <lena.ribackina@yandex.ru> wrote:
> > On 30.05.2024 10:33, Alena Rybakina wrote:
> > > I suggest gathering information about vacuum resource consumption
> > > while processing indexes and tables and storing it per table and per
> > > index (for example, in a structure like PgStat_StatTabEntry, as is
> > > already done for the regular statistics). This will allow us to
> > > determine how well vacuum is configured and to evaluate the effect
> > > of its overhead on the system at the strategic level. Vacuum already
> > > gathers this information, but this valuable information is not
> > > stored anywhere.
> >
> > My colleagues and I have prepared a patch that can help to solve this
> > problem.
> >
> > We are open to feedback.
>
> I was reading through the patch; here are some initial comments.
>
> --
> +typedef struct LVExtStatCounters
> +{
> + TimestampTz time;
> + PGRUsage ru;
> + WalUsage walusage;
> + BufferUsage bufusage;
> + int64 VacuumPageMiss;
> + int64 VacuumPageHit;
> + int64 VacuumPageDirty;
> + double VacuumDelayTime;
> + PgStat_Counter blocks_fetched;
> + PgStat_Counter blocks_hit;
> +} LVExtStatCounters;
>
> I noticed that you are storing both pgBufferUsage and
> VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
> It seems they both exist in the system because some code, like
> heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
> relies on the old counters. And there is already a patch to remove
> those old counters.
I agree with you and I have fixed it.

> --
> +static Datum
> +pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
> +{
>
> I don't think you need this last parameter (ncolumns); we can fetch
> that from the tupledesc anyway, so adding an additional parameter
> just for checking doesn't look good to me.
To be honest, I'm not sure whether ncolumns should be removed at all, because the pg_stats_vacuum function is used to display three different types of statistics: for tables, indexes, and databases. We use this parameter to pass the expected number of columns (i.e. how many statistics we expect) depending on the type of statistics. For example, table vacuum statistics contain 27 columns, while index and database statistics contain 19 and 15 columns, respectively. The pg_stats_vacuum function contains an Assert that checks that the number of attributes in the tupledesc matches the expected number:

Assert(tupdesc->natts == ncolumns);


Perhaps I can convert it to a local variable and determine its value
inside the function instead, for example:

pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type)
{
    int         ncolumns = 0;

    switch (type)
    {
        case PGSTAT_EXTVAC_HEAP:
            ncolumns = EXTVACHEAPSTAT_COLUMNS;
            break;
        case PGSTAT_EXTVAC_INDEX:
            ncolumns = EXTVACINDEXSTAT_COLUMNS;
            break;
        case PGSTAT_EXTVAC_DB:
            ncolumns = EXTVACDBSTAT_COLUMNS;
            break;
    }

    ...
}

What do you think?

> --
> + /* Tricky turn here: enforce pgstat to think that our database us dbid */
> +
> + MyDatabaseId = dbid;
>
> typo
> /think that our database us dbid/think that our database has dbid
>
> Also, remove the blank line between the comment and the next code
> block that is related to that comment.


> --
>   VacuumPageDirty = 0;
> + VacuumDelayTime = 0.;
>
> There is an extra "." after 0


Thank you, I fixed it.


In addition to these changes, I fixed the problem with displaying vacuum statistics for databases: I found an error in the definition of the pg_stats_vacuum_database system view. I also rewrote the tests and converted them into a regression test. Furthermore, I split the test of the vacuum statistics output into two tests: one checks tables and databases, and the other checks indexes. The split is needed because of a problem with vacuum behavior when the table contains an index; you can find more information about this in [0] and [1].

I have attached the diff to this email.

[0] https://www.postgresql.org/message-id/d1ca3a1d-7ead-41a7-bfd0-5b66ad97b1cd%40yandex.ru

[1] https://www.postgresql.org/message-id/CAH2-Wznv94Q_Td8OS8bAN7fYLpfU6CGgjn6Xau5eJ_sDxEGeBA%40mail.gmail.com


I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain the code for collecting vacuum statistics on tables, the second on indexes, and the last on databases. I am also writing the documentation.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments

Re: Vacuum statistics

From:
Alena Rybakina
Date:

Hi!

On 11.06.2024 16:09, Alena Rybakina wrote:
On 08.06.2024 09:30, Alena Rybakina wrote:

I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain the code for collecting vacuum statistics on tables, the second on indexes, and the last on databases.

I have divided the patch into three: the first one contains the code for collecting and storing vacuum statistics for tables, the second one for indexes, and the last one for databases.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments

Re: Vacuum statistics

From:
Alena Rybakina
Date:

I have written the documentation and attached the patch.

On 08.06.2024 09:30, Alena Rybakina wrote:

I am currently working on dividing this patch into three parts to simplify the review process: one of them will contain the code for collecting vacuum statistics on tables, the second on indexes, and the last on databases. I am also writing the documentation.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments

Re: Vacuum statistics

From:
Masahiko Sawada
Date:
On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov@moonset.ru> wrote:
>
> Hi,
>
> Thu, 30/05/2024 at 10:33 -0700, Alena Rybakina wrote:
> > I suggest gathering information about vacuum resource consumption while
> > processing indexes and tables and storing it per table and per index
> > (for example, in a structure like PgStat_StatTabEntry, as is already
> > done for the regular statistics). This will allow us to determine how
> > well vacuum is configured and to evaluate the effect of its overhead on
> > the system at the strategic level. Vacuum already gathers this
> > information, but this valuable information is not stored anywhere.
> >
> This may seem a little unclear, so let me explain the point of the
> proposal a bit.
>
> Since the vacuum process is a backend, it already has workload
> instrumentation. All the basic counters are available, such as the
> number of blocks read, hit and written, time spent on I/O, WAL stats
> and so on. Also, we can easily get some statistics specific to vacuum
> activity, i.e. the number of tuples removed, the number of blocks
> removed, the number of VM bits set and, of course, the most important
> metric: time spent on the vacuum operation.

I've not reviewed the patch closely but it sounds helpful for users. I
would like to add a statistic: the high-water mark of memory usage for
dead tuple TIDs. Since the amount of memory used by TidStore is hard to
predict, I think showing the high-water mark would help users decide how
much memory to set for maintenance_work_mem.
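
Very roughly, something like this could be sampled wherever dead-item
TIDs are added (hypothetical helper, not part of the patch; only
TidStoreMemoryUsage() is an existing function):

#include "postgres.h"
#include "access/tidstore.h"

/* Sketch only: keep a high-water mark of dead-item memory usage. */
static Size dead_items_mem_peak = 0;

static void
update_dead_items_peak(TidStore *dead_items)
{
    Size        current = TidStoreMemoryUsage(dead_items);

    if (current > dead_items_mem_peak)
        dead_items_mem_peak = current;
}

The peak could then be reported together with the other per-table vacuum
counters.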

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Vacuum statistics

From:
Andrei Zubkov
Date:
Hello!

On Thu, 27/06/2024 at 10:39 +0900, Masahiko Sawada wrote:
> On Fri, May 31, 2024 at 4:19 AM Andrei Zubkov <zubkov@moonset.ru>
> wrote:
> > Since the vacuum process is a backend, it already has workload
> > instrumentation. All the basic counters are available, such as the
> > number of blocks read, hit and written, time spent on I/O, WAL stats
> > and so on. Also, we can easily get some statistics specific to
> > vacuum activity, i.e. the number of tuples removed, the number of
> > blocks removed, the number of VM bits set and, of course, the most
> > important metric: time spent on the vacuum operation.
>
> I've not reviewed the patch closely but it sounds helpful for users. I
> would like to add a statistic: the high-water mark of memory usage for
> dead tuple TIDs. Since the amount of memory used by TidStore is hard
> to predict, I think showing the high-water mark would help users
> decide how much memory to set for maintenance_work_mem.
>
Thank you for your interest in this patch. I understand your idea.
Its obvious goal is to avoid expensive multiple index passes during a
vacuum of the heap. The statistics provided in the patch contain the
index_vacuum_count counter for each table, which can be compared to
pg_stat_all_tables.vacuum_count to detect multiple index passes for a
specific relation. The previous setting of maintenance_work_mem is
known, and TidStore usage should be proportional to the dead-tuple
vacuum workload on the table, so as a first estimate we can take the
number of index passes per heap pass as a multiplier for
maintenance_work_mem.
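
A back-of-the-envelope example with made-up numbers (standalone, just to
show the arithmetic):

#include <stdio.h>

int
main(void)
{
    /* purely illustrative inputs */
    long    vacuum_count = 10;          /* heap passes (pg_stat_all_tables.vacuum_count) */
    long    index_vacuum_count = 30;    /* index passes (from the proposed statistics) */
    int     maintenance_work_mem_mb = 64;   /* current setting */

    /* index passes per heap pass, rounded up, as a first-cut multiplier */
    long    multiplier = (index_vacuum_count + vacuum_count - 1) / vacuum_count;

    printf("suggested maintenance_work_mem: %ld MB\n",
           multiplier * (long) maintenance_work_mem_mb);
    return 0;
}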

But there is a better way. Once we have detected multiple index passes,
we can lower the vacuum workload per heap pass by making vacuum a little
more aggressive for this particular relation. I mean that in such a case
increasing maintenance_work_mem is not the only option.

The suggested high-water mark statistic can't be used as a cumulative
statistic: any high-water mark, being a maximum-like statistic, is only
valid for a certain time period and thus should be reset on some kind of
schedule. Without resets it would reach 100% once under heavy load and
stay there forever.

That said, such a high-water mark seems a little unclear and complicated
for the DBA, and it seems redundant to me right now. The main value of
such a statistic, as I see it, is to avoid setting maintenance_work_mem
too high, but I can't see really dramatic consequences of that. Maybe
I've missed something.

--
Andrei Zubkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company