Обсуждение: [PATCH] Add memory usage reporting to VACUUM VERBOSE

Поиск
Список
Период
Сортировка

[PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi,

I would like to propose a patch that adds memory usage reporting to
VACUUM VERBOSE output. This helps users understand how much memory
is being used for dead tuple tracking and whether memory limits are
being hit during vacuum operations.


I have tested this patch with both serial and parallel VACUUM:
- Serial VACUUM with two maintenance_work_mem settings
- Parallel VACUUM with two maintenance_work_mem settings
- Cases with and without memory resets

Test cases are included showing:
1. Reset behavior with constrained memory (64KB)
2. No reset behavior with ample memory (64MB)
3. Both serial and parallel VACUUM scenarios

I look forward to your feedback.

Best regards,
Tatsuya Kawata
Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Masahiko Sawada
Дата:
Hi,

On Sun, Nov 16, 2025 at 8:01 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi,
>
> I would like to propose a patch that adds memory usage reporting to
> VACUUM VERBOSE output. This helps users understand how much memory
> is being used for dead tuple tracking and whether memory limits are
> being hit during vacuum operations.

+1. The memory usage is already shown in the progress view but it
would make sense to report it in the vacuum verbose log too.

>
> I have tested this patch with both serial and parallel VACUUM:
> - Serial VACUUM with two maintenance_work_mem settings
> - Parallel VACUUM with two maintenance_work_mem settings
> - Cases with and without memory resets
>
> Test cases are included showing:
> 1. Reset behavior with constrained memory (64KB)
> 2. No reset behavior with ample memory (64MB)
> 3. Both serial and parallel VACUUM scenarios
>
> I look forward to your feedback.

Here are some review comments:

+                       appendStringInfo(&buf,
+                                                       _("memory
usage: peak %.2f MB of %.2f MB allowed (%.1f%%), %d reset(s)\n"),
+                                                       (double)
vacrel->peak_dead_items_bytes / (1024.0 * 1024.0),
+                                                       (double)
vac_work_mem / 1024.0,
+                                                       100.0 *
vacrel->peak_dead_items_bytes / (vac_work_mem * 1024.0),
+
vacrel->dead_items_resets);

With the proposed patch, we report peak memory usage and reset count.
Since we limit vacuum's memory usage to maintenance_work_mem, when
performing one or more index vacuum operations, the peak memory usage
typically equals maintenance_work_mem, with a reset count of 1 or
higher. I wonder if it might be more useful to show the total memory
used by vacuum. For example, if maintenance_work_mem is 512MB and
vacuum uses 128MB during the second heap scan pass, the report would
show a total memory usage of 640MB. I think probably the reset counter
is not necessary as we already show the number of index scans.

---
@@ -3590,6 +3629,10 @@ dead_items_add(LVRelState *vacrel, BlockNumber
blkno, OffsetNumber *offsets,
        prog_val[0] = vacrel->dead_items_info->num_items;
        prog_val[1] = TidStoreMemoryUsage(vacrel->dead_items);
        pgstat_progress_update_multi_param(2, prog_index, prog_val);
+
+       /* Track peak memory usage */
+       if (prog_val[1] > vacrel->peak_dead_items_bytes)
+               vacrel->peak_dead_items_bytes = prog_val[1];
 }

The vacuum memory usage is monotonically increasing until it's reset.
So we don't need to check the memory usage for every dead_items_add()
call.

---
+       /*
+        * Capture final peak memory usage before cleanup.
+        * This is especially important for parallel vacuum where
workers may have
+        * added items that weren't tracked in the leader's
peak_dead_items_bytes.
+        */
+       if (vacrel->dead_items != NULL)
+       {
+               Size            final_bytes =
TidStoreMemoryUsage(vacrel->dead_items);
+
+               if (final_bytes > vacrel->peak_dead_items_bytes)
+                       vacrel->peak_dead_items_bytes = final_bytes;
+       }

Please note that we call dead_items_reset() at the end of
lazy_vacuum(). The dead items should already be empty at the above
point.

Regards,

[1] https://commitfest.postgresql.org/57/

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Sawada-san,

Thank you very much for your review and helpful comments on my previous patch.
I have revised the patch following your suggestions:

・Report total memory usage rather than peak memory, which is more meaningful for users.
・Removed the reset counter, since index scan counts already reflect memory resets.
・Updated the code so that memory usage is recorded at the proper points, avoiding unnecessary per-item checks.
・Removed the final peak memory check, as dead_items are already cleared at that point.

All previous test cases have been re-run and verified to work correctly with both serial and parallel VACUUM scenarios.
Please find the updated patch attached. I look forward to any further feedback.

Best regards,
Tatsuya Kawata


2025年11月18日(火) 11:29 Masahiko Sawada <sawada.mshk@gmail.com>:
Hi,

On Sun, Nov 16, 2025 at 8:01 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi,
>
> I would like to propose a patch that adds memory usage reporting to
> VACUUM VERBOSE output. This helps users understand how much memory
> is being used for dead tuple tracking and whether memory limits are
> being hit during vacuum operations.

+1. The memory usage is already shown in the progress view but it
would make sense to report it in the vacuum verbose log too.

>
> I have tested this patch with both serial and parallel VACUUM:
> - Serial VACUUM with two maintenance_work_mem settings
> - Parallel VACUUM with two maintenance_work_mem settings
> - Cases with and without memory resets
>
> Test cases are included showing:
> 1. Reset behavior with constrained memory (64KB)
> 2. No reset behavior with ample memory (64MB)
> 3. Both serial and parallel VACUUM scenarios
>
> I look forward to your feedback.

Here are some review comments:

+                       appendStringInfo(&buf,
+                                                       _("memory
usage: peak %.2f MB of %.2f MB allowed (%.1f%%), %d reset(s)\n"),
+                                                       (double)
vacrel->peak_dead_items_bytes / (1024.0 * 1024.0),
+                                                       (double)
vac_work_mem / 1024.0,
+                                                       100.0 *
vacrel->peak_dead_items_bytes / (vac_work_mem * 1024.0),
+
vacrel->dead_items_resets);

With the proposed patch, we report peak memory usage and reset count.
Since we limit vacuum's memory usage to maintenance_work_mem, when
performing one or more index vacuum operations, the peak memory usage
typically equals maintenance_work_mem, with a reset count of 1 or
higher. I wonder if it might be more useful to show the total memory
used by vacuum. For example, if maintenance_work_mem is 512MB and
vacuum uses 128MB during the second heap scan pass, the report would
show a total memory usage of 640MB. I think probably the reset counter
is not necessary as we already show the number of index scans.

---
@@ -3590,6 +3629,10 @@ dead_items_add(LVRelState *vacrel, BlockNumber
blkno, OffsetNumber *offsets,
        prog_val[0] = vacrel->dead_items_info->num_items;
        prog_val[1] = TidStoreMemoryUsage(vacrel->dead_items);
        pgstat_progress_update_multi_param(2, prog_index, prog_val);
+
+       /* Track peak memory usage */
+       if (prog_val[1] > vacrel->peak_dead_items_bytes)
+               vacrel->peak_dead_items_bytes = prog_val[1];
 }

The vacuum memory usage is monotonically increasing until it's reset.
So we don't need to check the memory usage for every dead_items_add()
call.

---
+       /*
+        * Capture final peak memory usage before cleanup.
+        * This is especially important for parallel vacuum where
workers may have
+        * added items that weren't tracked in the leader's
peak_dead_items_bytes.
+        */
+       if (vacrel->dead_items != NULL)
+       {
+               Size            final_bytes =
TidStoreMemoryUsage(vacrel->dead_items);
+
+               if (final_bytes > vacrel->peak_dead_items_bytes)
+                       vacrel->peak_dead_items_bytes = final_bytes;
+       }

Please note that we call dead_items_reset() at the end of
lazy_vacuum(). The dead items should already be empty at the above
point.

Regards,

[1] https://commitfest.postgresql.org/57/

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Masahiko Sawada
Дата:
(please avoid top-posting[1] on this mailing list)

On Tue, Nov 18, 2025 at 7:15 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi Sawada-san,
>
> Thank you very much for your review and helpful comments on my previous patch.
> I have revised the patch following your suggestions:
>
> ・Report total memory usage rather than peak memory, which is more meaningful for users.
> ・Removed the reset counter, since index scan counts already reflect memory resets.
> ・Updated the code so that memory usage is recorded at the proper points, avoiding unnecessary per-item checks.
> ・Removed the final peak memory check, as dead_items are already cleared at that point.
>
> All previous test cases have been re-run and verified to work correctly with both serial and parallel VACUUM
scenarios.
> Please find the updated patch attached. I look forward to any further feedback.

Thank you for updating the patch!

Here are some review comments for the v2 patch:

+           /* Report memory usage for dead_items tracking */
+           vac_work_mem = AmAutoVacuumWorkerProcess() &&
+                       autovacuum_work_mem != -1 ?
+                       autovacuum_work_mem : maintenance_work_mem;

We can use vacrel->dead_items_info->max_bytes instead of calculating it again.

---
+           appendStringInfo(&buf,
+                           _("total memory usage: %.2f MB of %.2f MB
allowed\n"),
+                           (double) vacrel->total_dead_items_bytes /
(1024.0 * 1024.0),
+                           (double) vac_work_mem / 1024.0
+                           );

How about the following message style?

memory usage: total 0.69 MB used across 1 index scans (max 64.00 MB at once)

---
+/*
+ * Add current memory usage to the running total before resetting.
+ * This tracks cumulative memory across all index vacuum cycles.
+ */
+   if (vacrel->dead_items != NULL)
+   {
+       Size        final_bytes = TidStoreMemoryUsage(vacrel->dead_items);
+
+       vacrel->total_dead_items_bytes += final_bytes;
+   }

The comments need to be indented. I'd recommend running
src/tools/pgindent/pgindent to format the codes before creating the
patch.

The above change doesn't cover some cases where index vacuuming is
disabled (see the first if statement in the same function). It happens
for example when the user specified INDEX_CLEANUP=off, we used the
bypassing index vacuuming optimization, or failsafe was dynamically
triggered. The proposed change covers the bypassing optimization but
doesn't for the other two cases, which seems inconsistent and needs to
be avoided. Another case we need to consider is where the table
doesn't have an index, where we don't collect dead items to the
TidStore. In this case, we always report that 0 bytes are used. Do we
want to report the memory usage also in this case?

---
@@ -3553,6 +3581,7 @@ dead_items_alloc(LVRelState *vacrel, int nworkers)
        {
            vacrel->dead_items = parallel_vacuum_get_dead_items(vacrel->pvs,

&vacrel->dead_items_info);
+
            return;
        }
    }
@@ -3598,6 +3627,7 @@ dead_items_add(LVRelState *vacrel, BlockNumber
blkno, OffsetNumber *offsets,
 static void
 dead_items_reset(LVRelState *vacrel)
 {
+
    if (ParallelVacuumIsActive(vacrel))
    {
        parallel_vacuum_reset_dead_items(vacrel->pvs);

There are unnecessary line additions.

Regards,

[1] https://wiki.postgresql.org/wiki/Mailing_Lists#Mailing_List_Culture

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Sawada-san,

Thank you very much for your detailed review and suggestions for the v2 patch.
Also, thank you for pointing out the top-posting issue. I'll follow the mailing list style from now on.

---
>+           /* Report memory usage for dead_items tracking */
>+           vac_work_mem = AmAutoVacuumWorkerProcess() &&
>+                       autovacuum_work_mem != -1 ?
>+                       autovacuum_work_mem : maintenance_work_mem;
>
>We can use vacrel->dead_items_info->max_bytes instead of calculating it again.

I changed it to retrieve the value from max_bytes.
However, I encountered a segmentation fault during parallel VACUUM, so I modified the code to store the value before resetting it.
I would appreciate any suggestions if there is a better approach.
---

>How about the following message style?
>
>memory usage: total 0.69 MB used across 1 index scans (max 64.00 MB at once)

I would like to confirm one point:
Does “1 index scans” refer to (1) the number of reset cycles during the VACUUM process, or (2) the number of indexes attached to the target relation?
In my understanding, the latter seems more useful to users, so in the v3 patch I implemented it as the number of vacuumed indexes.
If this interpretation is not correct, I will adjust it accordingly.

---
>+/*
>+ * Add current memory usage to the running total before resetting.
>+ * This tracks cumulative memory across all index vacuum cycles.
>+ */
>+   if (vacrel->dead_items != NULL)
>+   {
>+       Size        final_bytes = TidStoreMemoryUsage(vacrel->dead_items);
>+
>+       vacrel->total_dead_items_bytes += final_bytes;
>+   }
>
>The comments need to be indented. I'd recommend running
>src/tools/pgindent/pgindent to format the codes before creating the
>patch.
>
>The above change doesn't cover some cases where index vacuuming is
>disabled (see the first if statement in the same function). It happens
>for example when the user specified INDEX_CLEANUP=off, we used the
>bypassing index vacuuming optimization, or failsafe was dynamically
>triggered. The proposed change covers the bypassing optimization but
>doesn't for the other two cases, which seems inconsistent and needs to
>be avoided. Another case we need to consider is where the table
>doesn't have an index, where we don't collect dead items to the
>TidStore. In this case, we always report that 0 bytes are used. Do we
>want to report the memory usage also in this case?

As for tracking memory usage, I separated the internal accounting from what is ultimately reported to the user,
and therefore the implementation is slightly redundant, recording the memory usage at several decision points.

For the display behavior:
Even with INDEX_CLEANUP = off, or when do_index_vacuuming = false due to failsafe, memory usage may not be strictly zero.
Therefore, I believe it is still appropriate to report the memory usage in these cases.

On the other hand, when the relation has no indexes at all, no memory is allocated for dead_items, so printing a memory usage line seems unnecessary.
In the v3 patch, I adjusted the code so that no message is emitted in this specific case.

I also removed the unintended blank lines and ran pgindent as you suggested.
I appreciate your continued feedback, and I would be grateful for another review of the v3 patch.

Best regards,
Tatsuya Kawata
Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Masahiko Sawada
Дата:
On Wed, Nov 19, 2025 at 9:31 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi Sawada-san,
>
> Thank you very much for your detailed review and suggestions for the v2 patch.
> Also, thank you for pointing out the top-posting issue. I'll follow the mailing list style from now on.

+1

>
> ---
> >+           /* Report memory usage for dead_items tracking */
> >+           vac_work_mem = AmAutoVacuumWorkerProcess() &&
> >+                       autovacuum_work_mem != -1 ?
> >+                       autovacuum_work_mem : maintenance_work_mem;
> >
> >We can use vacrel->dead_items_info->max_bytes instead of calculating it again.
>
> I changed it to retrieve the value from max_bytes.
> However, I encountered a segmentation fault during parallel VACUUM, so I modified the code to store the value before
resettingit. 
> I would appreciate any suggestions if there is a better approach.

Right. But the dead_items_max_bytes is used in the same function
heap_vacuum_rel(), so we don't need to have it in LVRelStats.

> ---
>
> >How about the following message style?
> >
> >memory usage: total 0.69 MB used across 1 index scans (max 64.00 MB at once)
>
> I would like to confirm one point:
> Does “1 index scans” refer to (1) the number of reset cycles during the VACUUM process, or (2) the number of indexes
attachedto the target relation? 
> In my understanding, the latter seems more useful to users, so in the v3 patch I implemented it as the number of
vacuumedindexes. 
> If this interpretation is not correct, I will adjust it accordingly.

I meant (1) actually. I don't think the number of indexes attached to
the target relation is not relevant with the memory usage.

We could add a new counter to track how many times we had to reset the
dead items storage because it was full. The existing num_index_scans
counter only shows how many times we performed index vacuuming. This
means that when index vacuuming is disabled, we still collect dead
items but num_index_scans shows 0. Adding this new counter would help
users understand how often the dead items storage reaches capacity,
even when index vacuuming is turned off.

>
> ---
> >+/*
> >+ * Add current memory usage to the running total before resetting.
> >+ * This tracks cumulative memory across all index vacuum cycles.
> >+ */
> >+   if (vacrel->dead_items != NULL)
> >+   {
> >+       Size        final_bytes = TidStoreMemoryUsage(vacrel->dead_items);
> >+
> >+       vacrel->total_dead_items_bytes += final_bytes;
> >+   }
> >
> >The comments need to be indented. I'd recommend running
> >src/tools/pgindent/pgindent to format the codes before creating the
> >patch.
> >
> >The above change doesn't cover some cases where index vacuuming is
> >disabled (see the first if statement in the same function). It happens
> >for example when the user specified INDEX_CLEANUP=off, we used the
> >bypassing index vacuuming optimization, or failsafe was dynamically
> >triggered. The proposed change covers the bypassing optimization but
> >doesn't for the other two cases, which seems inconsistent and needs to
> >be avoided. Another case we need to consider is where the table
> >doesn't have an index, where we don't collect dead items to the
> >TidStore. In this case, we always report that 0 bytes are used. Do we
> >want to report the memory usage also in this case?
>
> As for tracking memory usage, I separated the internal accounting from what is ultimately reported to the user,
> and therefore the implementation is slightly redundant, recording the memory usage at several decision points.

How about updating total bytes in dead_items_reset()?

>
> For the display behavior:
> Even with INDEX_CLEANUP = off, or when do_index_vacuuming = false due to failsafe, memory usage may not be strictly
zero.
> Therefore, I believe it is still appropriate to report the memory usage in these cases.
>
> On the other hand, when the relation has no indexes at all, no memory is allocated for dead_items, so printing a
memoryusage line seems unnecessary. 
> In the v3 patch, I adjusted the code so that no message is emitted in this specific case.

I guess that it doesn't necessary omit such information even though we
always show 0 bytes used.Explicitly showing 0 bytes used could help
users to understand the vacuum behavior.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Chao Li
Дата:
Hi Tatsuya,

I just ran a test against v3.

> On Nov 20, 2025, at 01:30, 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
>
> I also removed the unintended blank lines and ran pgindent as you suggested.
> I appreciate your continued feedback, and I would be grateful for another review of the v3 patch.
>
> Best regards,
> Tatsuya Kawata
> <v3-0001-Add-memory-usage-reporting-to-VACUUM-VERBOSE.patch>

The test shows:
```
evantest=# VACUUM (VERBOSE) vtest;
INFO:  vacuuming "evantest.public.vtest"
INFO:  finished vacuuming "evantest.public.vtest": index scans: 1
pages: 0 removed, 8621 remain, 8621 scanned (100.00% of total), 0 eagerly scanned
tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable
removable cutoff: 775, which was 0 XIDs old when operation ended
new relfrozenxid: 773, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
visibility map: 8621 pages set all-visible, 0 pages set all-frozen (0 were all-visible)
index scan needed: 8621 pages from table (100.00% of total) had 250000 dead item identifiers removed
index "vtest_pkey": pages: 1374 in total, 0 newly deleted, 0 currently deleted, 0 reusable
avg read rate: 0.000 MB/s, avg write rate: 0.475 MB/s
buffer usage: 27263 hits, 0 reads, 3 dirtied
WAL usage: 18613 records, 3 full page images, 2593581 bytes, 24576 full page image bytes, 0 buffers full
memory usage: total 0.44 MB used across 1 index scan(s) (max 64.00 MB at once)
system usage: CPU: user: 0.04 s, system: 0.00 s, elapsed: 0.04 s
INFO:  vacuuming "evantest.pg_toast.pg_toast_16393"
INFO:  finished vacuuming "evantest.pg_toast.pg_toast_16393": index scans: 0
pages: 0 removed, 0 remain, 0 scanned (100.00% of total), 0 eagerly scanned
tuples: 0 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 775, which was 0 XIDs old when operation ended
new relfrozenxid: 775, which is 3 XIDs ahead of previous value
frozen: 0 pages from table (100.00% of total) had 0 tuples frozen
visibility map: 0 pages set all-visible, 0 pages set all-frozen (0 were all-visible)
index scan not needed: 0 pages from table (100.00% of total) had 0 dead item identifiers removed
avg read rate: 39.657 MB/s, avg write rate: 0.000 MB/s
buffer usage: 27 hits, 1 reads, 0 dirtied
WAL usage: 1 records, 0 full page images, 258 bytes, 0 full page image bytes, 0 buffers full
memory usage: total 0.00 MB used across 1 index scan(s) (max 64.00 MB at once)
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
VACUUM
```

Notice the second “memory usage” line:
```
memory usage: total 0.00 MB used across 1 index scan(s) (max 64.00 MB at once)
```

It says total usage is 0, but max is 64, which is confusing.

I think that is because we initiates as
```
+    /*
+     * Save max_bytes before cleanup, as dead_items_info may be freed in
+     * parallel mode during dead_items_cleanup().
+     */
+    vacrel->dead_items_max_bytes = vacrel->dead_items_info->max_bytes;
```

Where “max_bytes” is "/* the maximum bytes TidStore can use */“, but I think we actually want to show max memories ever
consumedduring vacuuming, right? 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Sawada-san, Chao-san,

Thank you very much for your helpful feedback and testing.

> We can use vacrel->dead_items_info->max_bytes instead of calculating it again.
I fixed it as you mentioned.


> I meant (1) actually. I don't think the number of indexes attached to
> the target relation is not relevant with the memory usage.
>
> We could add a new counter to track how many times we had to reset the
> dead items storage because it was full. The existing num_index_scans
> counter only shows how many times we performed index vacuuming. This
> means that when index vacuuming is disabled, we still collect dead
> items but num_index_scans shows 0. Adding this new counter would help
> users understand how often the dead items storage reaches capacity,
> even when index vacuuming is turned off.
Thank you for your detailed explanation; I finally understand.
I added a new counter to track the number of dead items storage resets caused by reaching capacity.


> How about updating total bytes in dead_items_reset()?
Thank you. I fixed it as you mentioned.


> Another case we need to consider is where the table doesn't have an index...
I fixed it as you mentioned.


> Where “max_bytes” is "/* the maximum bytes TidStore can use */“, but I think we
> actually want to show max memories ever consumed during vacuuming, right?
Thank you for pointing out the confusing “max” wording.
The word  "max” is not the peak memory *actually used* during VACUUM.
It represents the upper limit of memory that the TidStore is allowed to use at once
(derived from maintenance_work_mem or autovacuum_work_mem).
So the current output may look misleading, especially when total usage is 0 but the limit is non-zero.

To make this more explicit, I updated the message wording to avoid implying that
this is the maximum consumption, and instead describe it as the configured memory limit.
The updated style is:
```
memory usage: total 1.02 MB used across 15 index scan(s) (max mem space is limited 0.06 MB at once)
```
I believe this wording better reflects the actual meaning of max_bytes and avoids
the interpretation that it is a peak usage value.


Regarding pgindent, running pgindent produced a large number of unrelated changes across the file,
not only around the lines modified. Since it seemed inappropriate to include those unrelated diffs,
I picked and applied only the necessary indentation changes for this patch.

I apologize for not mentioning this in the v3 submission.
I will revisit my pgindent setup to ensure it matches the project’s expected behavior.

I have posted v4 incorporating all these changes.  
Thank you again for your guidance.

Best regards,  
Tatsuya Kawata

Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Masahiko Sawada
Дата:
On Thu, Nov 20, 2025 at 6:16 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi Sawada-san, Chao-san,
>
> Thank you very much for your helpful feedback and testing.
>
> > We can use vacrel->dead_items_info->max_bytes instead of calculating it again.
> I fixed it as you mentioned.
>
>
> > I meant (1) actually. I don't think the number of indexes attached to
> > the target relation is not relevant with the memory usage.
> >
> > We could add a new counter to track how many times we had to reset the
> > dead items storage because it was full. The existing num_index_scans
> > counter only shows how many times we performed index vacuuming. This
> > means that when index vacuuming is disabled, we still collect dead
> > items but num_index_scans shows 0. Adding this new counter would help
> > users understand how often the dead items storage reaches capacity,
> > even when index vacuuming is turned off.
> Thank you for your detailed explanation; I finally understand.
> I added a new counter to track the number of dead items storage resets caused by reaching capacity.
>
>
> > How about updating total bytes in dead_items_reset()?
> Thank you. I fixed it as you mentioned.
>
>
> > Another case we need to consider is where the table doesn't have an index...
> I fixed it as you mentioned.
>
>
> > Where “max_bytes” is "/* the maximum bytes TidStore can use */“, but I think we
> > actually want to show max memories ever consumed during vacuuming, right?
> Thank you for pointing out the confusing “max” wording.
> The word  "max” is not the peak memory *actually used* during VACUUM.
> It represents the upper limit of memory that the TidStore is allowed to use at once
> (derived from maintenance_work_mem or autovacuum_work_mem).
> So the current output may look misleading, especially when total usage is 0 but the limit is non-zero.
>
> To make this more explicit, I updated the message wording to avoid implying that
> this is the maximum consumption, and instead describe it as the configured memory limit.
> The updated style is:
> ```
> memory usage: total 1.02 MB used across 15 index scan(s) (max mem space is limited 0.06 MB at once)
> ```
> I believe this wording better reflects the actual meaning of max_bytes and avoids
> the interpretation that it is a peak usage value.

I think we should not use "index scans" since the number doesn't
actually represent the number of index scans performed. How about
something like:

memory usage: 1.02 MB in total, with dead-item storage reset 15 times
(limit was 0.06 MB)

Also when it comes to the plural, we can use errmsg_plural() instead.

>
>
> Regarding pgindent, running pgindent produced a large number of unrelated changes across the file,
> not only around the lines modified. Since it seemed inappropriate to include those unrelated diffs,
> I picked and applied only the necessary indentation changes for this patch.
>
> I apologize for not mentioning this in the v3 submission.
> I will revisit my pgindent setup to ensure it matches the project’s expected behavior.
>
> I have posted v4 incorporating all these changes.
> Thank you again for your guidance.
>

Thank you for updating the patch! Here are some minor comments for v4 patch:

+       dead_items_max_bytes = 0;
        /*
         * Get cutoffs that determine which deleted tuples are considered DEAD,
         * not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze.
Then determine

We can initialize dead_items_max_bytes when declared.

---
+       /* Track total memory usage for dead_items */
+       if (vacrel->dead_items != NULL)
+       {
+               vacrel->total_dead_items_bytes +=
TidStoreMemoryUsage(vacrel->dead_items);
+       }

Does it need to check if vacrel->dead_items is non-NULL? If it could
be NULL, the following operations like TidStoreDestroy() should fail
anyway.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Sawada-san,

Thank you for your additional feedback on v4!

>We can initialize dead_items_max_bytes when declared.
Thank you. I fixed it as you mentioned.

>Does it need to check if vacrel->dead_items is non-NULL?
Thank you. I fixed it as you mentioned.

>I think we should not use "index scans" since the number doesn't
>actually represent the number of index scans performed. How about
>something like:
>
>memory usage: 1.02 MB in total, with dead-item storage reset 15 times
>(limit was 0.06 MB)
>
>Also when it comes to the plural, we can use errmsg_plural() instead.
Thank you. I fixed the message as you mentioned.
Also, I found that ngettext() with appendStringInfo() is a standard pattern
in the PostgreSQL codebase (e.g., src/backend/catalog/pg_shdepend.c,
src/backend/catalog/dependency.c), so I use ngettext() in this code.

If it's good to use errmsg_plural(), I can restructure the code to use errmsg_plural().

Best regards,
Tatsuya Kawata

Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Masahiko Sawada
Дата:
On Fri, Nov 21, 2025 at 8:26 AM 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi Sawada-san,
>
> Thank you for your additional feedback on v4!
>
> >We can initialize dead_items_max_bytes when declared.
> Thank you. I fixed it as you mentioned.
>
> >Does it need to check if vacrel->dead_items is non-NULL?
> Thank you. I fixed it as you mentioned.
>
> >I think we should not use "index scans" since the number doesn't
> >actually represent the number of index scans performed. How about
> >something like:
> >
> >memory usage: 1.02 MB in total, with dead-item storage reset 15 times
> >(limit was 0.06 MB)
> >
> >Also when it comes to the plural, we can use errmsg_plural() instead.
> Thank you. I fixed the message as you mentioned.

Thank you for updating the patch!

> Also, I found that ngettext() with appendStringInfo() is a standard pattern
> in the PostgreSQL codebase (e.g., src/backend/catalog/pg_shdepend.c,
> src/backend/catalog/dependency.c), so I use ngettext() in this code.

You're right. We should use ngettext() instead.

The patch basically looks good to me. I've made some cosmetic changes
to the v5 patch and attached it as a separate patch. Most of the
changes are to remove redundant comments (because it's obvious from
the codes) and rephrasing the comments. Please review it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Sawada-san,

Thank you very much for reviewing v5 patch!

I have reviewed the changes you made, and everything looks good to me.
Your revisions, especially the cleanup of redundant comments and the improved
wording, make the patch clearer and more consistent with the existing codebase.

Thank you again for your detailed reviews and helpful suggestions throughout
this patch series. I appreciate your support very much.

Best regards,
Tatsuya Kawata

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:

Hi Sawada-san,

I rebased and applied the changes you suggested.
The updated v6 is attached.

Best regards,
Tatsuya Kawata

Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Chao Li
Дата:
Hi Tatsuya-san,

I just reviewed and tested v6, and got a major concern and a few small comments.

> On Nov 27, 2025, at 22:59, 河田達也 <kawatatatsuya0913@gmail.com> wrote:
>
> Hi Sawada-san,
> I rebased and applied the changes you suggested.
> The updated v6 is attached.
> Best regards,
> Tatsuya Kawata
> <v6-0001-Add-memory-usage-reporting-to-VACUUM-VERBOSE.patch>

1. Major concern: In this patch, you only increase vacrel->total_dead_items_bytes in dead_items_reset(), however,
dead_items_reset()is not always called, that way I often see usage is 0. We should also increate the counter in
heap_vacuum_rel()where you now get dead_items_max_bytes. My dirty fix is like below: 

```
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ef2c72edf5d..635e3cf8346 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -862,6 +862,8 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
         */
        dead_items_max_bytes = vacrel->dead_items_info->max_bytes;

+       vacrel->total_dead_items_bytes += TidStoreMemoryUsage(vacrel->dead_items);
+
        /*
         * Free resources managed by dead_items_alloc.  This ends parallel mode in
         * passing when necessary.
```

I verified the fix with a simple procedure:
```
drop table if exists vacuum_test;
create table vacuum_test (id int);
insert into vacuum_test select generate_series(1, 100000);
delete from vacuum_test WHERE id % 2 = 0;
vacuum (verbose) vacuum_test;
```

Without my fix, memory usage was reported as 0:
```
memory usage: 0.00 MB in total, with dead-item storage reset 0 times (limit was 64.00 MB)
```

And with my fix, memory usage is greater than 0:
```
memory usage: 0.02 MB in total, with dead-item storage reset 0 times (limit was 64.00 MB)
```

2
```
+     * Save dead items max_bytes before cleanup for reporting the memory usage
+     * as the dead_items_info is freed in parallel vacuum cases during
+     * cleanup.
+     */
+    dead_items_max_bytes = vacrel->dead_items_info->max_bytes;
```

The comment says "the dead_items_info is freed in parallel vacuum”, should we check if vacrel->dead_items_info != NULL
beforedeferencing max_bytes? 

3
```
+            appendStringInfo(&buf,
+                             ngettext("memory usage: %.2f MB in total, with dead-item storage reset %d time (limit was
%.2fMB)\n", 
+                                      "memory usage: %.2f MB in total, with dead-item storage reset %d times (limit
was%.2f MB)\n", 

```

I just feel “limit was xxx MB” is still not clear enough. How about be explicit, like:

   Memory usage: 0.2 MB in total, memory allocated across 0 dead-item storage resets in total: 64.00 MB

Or

   Memory usage: 0.2 MB in total, with dead-item storage reset %d time, total allocated: 64.00 MB


Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
河田達也
Дата:
Hi Chao-san,

Thank you very much for your testing and review!


> 1. Major concern: In this patch, you only increase vacrel->total_dead_items_bytes in dead_items_reset(), however, dead_items_reset() is not always called, that way I often see usage is 0. We should also increate the counter in heap_vacuum_rel() where you now get dead_items_max_bytes. My dirty fix is like below:

I believe the key point is how to display memory usage when index vacuum is skipped.
In the v6 implementation, memory usage was always shown as 0 when there were no indexes.
However, more accurately, memory is allocated during initialization even when index vacuum is not executed, so it seems appropriate to display memory consumption as you suggested.
To avoid double-counting memory usage, I've modified the location and conditions where this processing is added.


> 2
> +        * Save dead items max_bytes before cleanup for reporting the memory usage
> +        * as the dead_items_info is freed in parallel vacuum cases during
> +        * cleanup.
> +        */
> +       dead_items_max_bytes = vacrel->dead_items_info->max_bytes;
> The comment says "the dead_items_info is freed in parallel vacuum", should we check if vacrel->dead_items_info != NULL before deferencing max_bytes?

In the case of parallel vacuum, the reference to dead_items_info is released in the subsequent dead_items_cleanup() call, so it cannot be NULL at this point.
I've updated the comment to make it clearer which function call causes the memory to become inaccessible.


> 3
> +                       appendStringInfo(&buf,
> +                                                        ngettext("memory usage: %.2f MB in total, with dead-item storage reset %d time (limit was %.2f MB)\n",
> +                                                                         "memory usage: %.2f MB in total, with dead-item storage reset %d times (limit was %.2f MB)\n",
>
> I just feel "limit was xxx MB" is still not clear enough. How about be explicit, like: Memory usage: 0.2 MB in total, memory allocated across 0 dead-item storage resets in total: 64.00 MB Or Memory usage: 0.2 MB in total, with dead-item storage reset %d time, total allocated: 64.00 MB

I've revised it as follows. What do you think?
memory usage: 0.02 MB in total, with dead-item storage reset 0 times (memory allocated: 64.00 MB)


I applied the changes above.
The updated v7 is attached.
I look forward to your feedback.

Best regards,
Tatsuya Kawata

Вложения

Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

От
Tatsuya Kawata
Дата:
Hi all,

Just a gentle ping on this patch.
Please let me know if there are any additional comments
or points I should address in the next revision.

Regards,
Tatsuya Kawata