Re: Checkpointer write combining

Поиск
Список
Период
Сортировка
От BharatDB
Тема Re: Checkpointer write combining
Дата
Msg-id CAAh00EQy3KsgT33SRDOndUsMveDbLGXgdCZk9AoC8tYiqHXk4w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Checkpointer write combining  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers

Hi all,

Considering the CI failures in earlier patch versions around “max batch size”, upon my observation I found the failures arise either from boundary conditions when io_combine_limit (GUC) is set larger than the compile-time MAX_IO_COMBINE_LIMIT or when pin limits return small/zero values due to which it produce out-of-range batch sizes or assertion failures in CI.

Comparing the approaches suggested in the thread, I think implementing (GUC + compile-time cap first, and then pin limits) could be more effective in avoiding CI failures and also we should consider the following logic conditions:

  1. Set io_combine_limit == 0 explicitly (fallback to 1 for forward progress).

  2. Cap early to a conservative compile_cap (MAX_IO_COMBINE_LIMIT - 1) to avoid array overflow. Otherwise if we confirm all slots are usable, change to MAX_IO_COMBINE_LIMIT.

  3. Apply per-strategy pin and global pin limits only if they are positive.

  4. Use explicit typed comparisons to avoid signed/unsigned pitfalls and add a final Assert() to capture assumptions in CI.

Implementation logic:

uint32
StrategyMaxWriteBatchSize(BufferAccessStrategy strategy)
{
    uint32        max_write_batch_size;
    uint32        compile_cap = MAX_IO_COMBINE_LIMIT - 1;    /* conservative cap */
    int           strategy_pin_limit = GetAccessStrategyPinLimit(strategy);
    uint32        max_possible_buffer_limit = GetPinLimit();
    max_write_batch_size = (io_combine_limit == 0) ? 1 : io_combine_limit;
    if (max_write_batch_size > compile_cap)
        max_write_batch_size = compile_cap;
    if (strategy_pin_limit > 0 &&
        (uint32) strategy_pin_limit < max_write_batch_size)
        max_write_batch_size = (uint32) strategy_pin_limit;
    if (max_possible_buffer_limit > 0 &&
        max_possible_buffer_limit < max_write_batch_size)
        max_write_batch_size = max_possible_buffer_limit;
    if (max_write_batch_size == 0)
        max_write_batch_size = 1;
    Assert(max_write_batch_size <= compile_cap);
    return max_write_batch_size;
}

I hope this will be helpful for proceeding further. Looking forward to more feedback.
Thanking you.
Regards,
Soumya

On Tue, Nov 4, 2025 at 5:04 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
Thanks for continuing to review! I've revised the patches to
incorporate all of your feedback except for where I mention below.

There were failures in CI due to issues with max batch size, so
attached v8 also seeks to fix those.

- Melanie

On Thu, Oct 16, 2025 at 12:25 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> 3 - 0003
> ```
> +/*
> + * Return the next buffer in the ring or InvalidBuffer if the current sweep is
> + * over.
> + */
> +Buffer
> +StrategySweepNextBuffer(BufferAccessStrategy strategy, int *sweep_cursor)
> +{
> +       if (++(*sweep_cursor) >= strategy->nbuffers)
> +               *sweep_cursor = 0;
> +
> +       return strategy->buffers[*sweep_cursor];
> +}
> ```
>
> Feels the function comment is a bit confusing, because the function code doesn’t really perform sweep, the function is just a getter. InvalidBuffer just implies the current sweep is over.
>
> Maybe rephrase to something like: “Return the next buffer in the range. If InvalidBuffer is returned, that implies the current sweep is done."

Yes, actually I think having these helpers mention the sweep is more
confusing than anything else. I've revised them to be named more
generically and updated the comments accordingly.

> 5 - 0004
> ```
> +uint32
> +StrategyMaxWriteBatchSize(BufferAccessStrategy strategy)
> +{
> +       uint32          max_possible_buffer_limit;
> +       uint32          max_write_batch_size;
> +       int                     strategy_pin_limit;
> +
> +       max_write_batch_size = io_combine_limit;
> +
> +       strategy_pin_limit = GetAccessStrategyPinLimit(strategy);
> +       max_possible_buffer_limit = GetPinLimit();
> +
> +       max_write_batch_size = Min(strategy_pin_limit, max_write_batch_size);
> +       max_write_batch_size = Min(max_possible_buffer_limit, max_write_batch_size);
> +       max_write_batch_size = Max(1, max_write_batch_size);
> +       max_write_batch_size = Min(max_write_batch_size, io_combine_limit);
> +       Assert(max_write_batch_size < MAX_IO_COMBINE_LIMIT);
> +       return max_write_batch_size;
> +}
> ```
>
> This implementation is hard to understand. I tried to simplify it:
> ```
> uint32
> StrategyMaxWriteBatchSize(BufferAccessStrategy strategy)
> {
>         int strategy_pin_limit = GetAccessStrategyPinLimit(strategy);
>         uint32 max_write_batch_size = Min(GetPinLimit(), (uint32)strategy_pin_limit);
>
>         /* Clamp to io_combine_limit and enforce minimum of 1 */
>         if (max_write_batch_size > io_combine_limit)
>                 max_write_batch_size = io_combine_limit;
>         if (max_write_batch_size == 0)
>                 max_write_batch_size = 1;
>
>         Assert(max_write_batch_size < MAX_IO_COMBINE_LIMIT);
>         return max_write_batch_size;
> }
> ```

I agree that the implementation was hard to understand. I've not quite
gone with your version but I have rewritten it like this:

uint32
StrategyMaxWriteBatchSize(BufferAccessStrategy strategy)
{
    uint32        max_write_batch_size = Min(io_combine_limit,
MAX_IO_COMBINE_LIMIT);
    int            strategy_pin_limit = GetAccessStrategyPinLimit(strategy);
    uint32        max_possible_buffer_limit = GetPinLimit();

    /* Identify the minimum of the above */
    max_write_batch_size = Min(strategy_pin_limit, max_write_batch_size);
    max_write_batch_size = Min(max_possible_buffer_limit, max_write_batch_size);

    /* Must allow at least 1 IO for forward progress */
    max_write_batch_size = Max(1, max_write_batch_size);

    return max_write_batch_size;
}

Is this better?

- Melanie

В списке pgsql-hackers по дате отправления: