Re: New strategies for freezing, advancing relfrozenxid early

From Jeff Davis
Subject Re: New strategies for freezing, advancing relfrozenxid early
Msg-id 1892cc797f86924e31f00fd3c703ca464936c65e.camel@j-davis.com
In reply to New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, 2022-08-25 at 14:21 -0700, Peter Geoghegan wrote:
> Attached patch series is a completely overhauled version of earlier
> work on freezing. Related work from the Postgres 15 cycle became
> commits 0b018fab, f3c15cbe, and 44fa8488.
>
> Recap
> =====
>
> The main high level goal of this work is to avoid painful, disruptive
> antiwraparound autovacuums (and other aggressive VACUUMs) that do way
> too much "catch up" freezing, all at once

I agree with the motivation: that keeping around a lot of deferred work
(unfrozen pages) is risky, and that administrators would want a way to
control that risk.

The solution involves more changes to the philosophy and mechanics of
vacuum than I would expect, though. For instance, VM snapshotting,
page-level-freezing, and a cost model all might make sense, but I don't
see why they are critical for solving the problem above. I think I'm
still missing something. My mental model is closer to the bgwriter and
checkpoint_completion_target.

Allow me to make a naive counter-proposal (not a real proposal, just so
I can better understand the contrast with your proposal):

  * introduce a reloption unfrozen_pages_target (default -1, meaning
infinity, which is the current behavior)
  * introduce two fields to LVRelState: n_pages_frozen and
delay_skip_count, both initialized to zero
  * when marking a page frozen: n_pages_frozen++
  * when vacuum begins:
      if (unfrozen_pages_target >= 0 &&
          current_unfrozen_page_count > unfrozen_pages_target)
      {
        vacrel->delay_skip_count = current_unfrozen_page_count -
          unfrozen_pages_target;
        /* ?also use more aggressive freezing thresholds? */
      }
  * in lazy_scan_skip(), have a final check:
      if (vacrel->n_pages_frozen < vacrel->delay_skip_count)
      {
         break;
      }
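
For readers who want to trace the bookkeeping, the counter-proposal above can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions, not PostgreSQL code: SketchRelState and the sketch_* functions are hypothetical names standing in for LVRelState, the VACUUM startup path, and the final check in lazy_scan_skip(), and the unfrozen page count is simply passed in rather than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields proposed for LVRelState. */
typedef struct SketchRelState
{
    long n_pages_frozen;    /* pages marked frozen so far by this VACUUM */
    long delay_skip_count;  /* pages to freeze before skipping is allowed */
} SketchRelState;

/*
 * Run once when VACUUM begins.  A negative unfrozen_pages_target means
 * "no target" (the current behavior), so no skip delay is set up.
 */
static void
sketch_begin_vacuum(SketchRelState *vacrel,
                    long unfrozen_pages_target,
                    long current_unfrozen_page_count)
{
    assert(vacrel != NULL);

    vacrel->n_pages_frozen = 0;
    vacrel->delay_skip_count = 0;

    if (unfrozen_pages_target >= 0 &&
        current_unfrozen_page_count > unfrozen_pages_target)
    {
        /* Freeze at least the excess over the target before skipping. */
        vacrel->delay_skip_count =
            current_unfrozen_page_count - unfrozen_pages_target;
    }
}

/* Called whenever a page is marked frozen. */
static void
sketch_page_frozen(SketchRelState *vacrel)
{
    vacrel->n_pages_frozen++;
}

/*
 * The "final check": skipping is allowed only once enough pages have
 * been frozen.  Returning false corresponds to the break in the
 * pseudocode above.
 */
static bool
sketch_may_skip(const SketchRelState *vacrel)
{
    return vacrel->n_pages_frozen >= vacrel->delay_skip_count;
}
```

With a target of 100 and 250 unfrozen pages, delay_skip_count becomes 150, so sketch_may_skip() keeps returning false until 150 pages have been frozen. With the default target of -1 it returns true from the start, which matches the current behavior.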

I know there would still be some problem cases, but to me it seems like
we solve 80% of the problem in a couple dozen lines of code.

a. Can you clarify some of the problem cases, and why it's worth
spending more code to fix them?

b. How much of your effort is groundwork for related future
improvements? If it's a substantial part, can you explain in that
larger context?

c. Can some of your patches be separated into independent discussions?
For instance, patch 1 has been discussed in other threads and seems
independently useful, and I don't see the current work as dependent on
it. Patch 4 also seems largely independent.

d. Can you help give me a sense of the scale of the problems solved by
visibility map snapshots and the cost model? Do those need to be in v1?

Regards,
    Jeff Davis



