Re: New GUC autovacuum_max_threshold ?

Поиск
Список
Период
Сортировка
От Frédéric Yhuel
Тема Re: New GUC autovacuum_max_threshold ?
Дата
Msg-id 5b55790c-16ed-4e90-800c-678ad2655ace@dalibo.com
обсуждение исходный текст
Ответ на Re: New GUC autovacuum_max_threshold ?  (Melanie Plageman <melanieplageman@gmail.com>)
Ответы Re: New GUC autovacuum_max_threshold ?  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers

Le 24/04/2024 à 21:10, Melanie Plageman a écrit :
> On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel
> <frederic.yhuel@dalibo.com> wrote:
>>
>> Hello,
>>
>> I would like to suggest a new parameter, autovacuum_max_threshold, which
>> would set an upper limit on the number of tuples to delete/update/insert
>> prior to vacuum/analyze.
> 
> Hi Frédéric, thanks for the proposal! You are tackling a very tough
> problem. I would also find it useful to know more about what led you
> to suggest this particular solution. I am very interested in user
> stories around difficulties with what tables are autovacuumed and
> when.
>

Hi Melanie! I can certainly start compiling user stories about that.

Recently, one of my colleagues wrote an email to our DBA team saying 
something along these lines:

« Hey, here is our suggested settings for per table autovacuum 
configuration:

|  *autovacuum*        | L < 1 million | L >= 1 million | L >= 5 
millions | L >= 10 millions |
|:---------------------|--------------:|---------------:|----------------:|-----------------:|
|`vacuum_scale_factor` |  0.2 (défaut) |            0.1 |     0.05 
  |              0.0 |
|`vacuum_threshold`    |   50 (défaut) |    50 (défaut) |     50 
(défaut) |          500 000 |
|`analyze_scale_factor`|  0.1 (défaut) |   0.1 (défaut) |     0.05 
  |              0.0 |
|`analyze_threshold`   |   50 (défaut) |    50 (défaut) |     50 
(défaut) |          500 000 |

Let's update this table with values for the vacuum_insert_* parameters. »

I wasn't aware that we had this table, and although the settings made 
sense to me, I thought it was rather ugly and cumbersome for the user, 
and I started thinking about how postgres could make his life easier.

> Am I correct in thinking that one of the major goals here is for a
> very large table to be more likely to be vacuumed?
>

Absolutely.

>> The idea would be to replace the following calculation :
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples;
>>
>> with this one :
>>
>> vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples / (1
>> + vac_scale_factor * reltuples / autovacuum_max_threshold)
>>
>> (and the same for the others, vacinsthresh and anlthresh).
> 
> My first thought when reviewing the GUC and how it is used is
> wondering if its description is a bit misleading.
> 
> autovacuum_vacuum_threshold is the "minimum number of updated or
> deleted tuples needed to trigger a vacuum". That is, if this many
> tuples are modified, it *may* trigger a vacuum, but we also may skip
> vacuuming the table for other reasons or due to other factors.
> autovacuum_max_threshold's proposed definition is the upper
> limit/maximum number of tuples to insert/update/delete prior to
> vacuum/analyze. This implies that if that many tuples have been
> modified or inserted, the table will definitely be vacuumed -- which
> isn't true. Maybe that is okay, but I thought I would bring it up.
>

I'm not too sure I understand. What are the reasons it might by skipped? 
I can think of a concurrent index creation on the same table, or 
anything holding a SHARE UPDATE EXCLUSIVE lock or above. Is this the 
sort of thing you are talking about?

Perhaps a better name for the GUC would be 
autovacuum_asymptotic_limit... or something like that?

>> The attached (draft) patch further illustrates the idea.
> 
> Thanks for including a patch!
> 
>> My guess is that a similar proposal has already been submitted... and
>> rejected 🙂 If so, I'm very sorry for the useless noise.
> 
> I rooted around in the hackers archive and couldn't find any threads
> on this specific proposal. I copied some other hackers I knew of who
> have worked on this problem and thought about it in the past, in case
> they know of some existing threads or prior work on this specific
> topic.
> 

Thanks!



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Andrey M. Borodin"
Дата:
Сообщение: Re: broken reading on standby (PostgreSQL 16.2)
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Add missing ConditionVariableCancelSleep() in slot.c