Re: VACUUM FREEZE output more than double input

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: VACUUM FREEZE output more than double input
Дата
Msg-id 15401.1197676791@sss.pgh.pa.us
обсуждение исходный текст
Ответ на VACUUM FREEZE output more than double input  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Ответы Re: VACUUM FREEZE output more than double input
Список pgsql-performance
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Why double writes per read, plus massive writes at checkpoint?

The double writes aren't surprising: freezing has to be WAL-logged,
and the odds are that each page hasn't been touched since the last
checkpoint, so the WAL log will include a complete page image.
So in the steady state where all shared buffers are dirty, the
per-page cycle is:
    * write out a dirty buffer so it can be reclaimed
    * read in a page
    * modify it to mark tuples frozen
    * write an image of the page to WAL
    * leave the now-dirty page in shared buffers for later writing

The checkpoint spikes would come from trying to flush out all the
dirty buffers at once.

You'd expect a bit of a valley after each peak, since the vacuum
could presumably recycle some buffers without having to flush 'em
first; but I don't see one in your data.  That may just be because
the numbers are too noisy, but I kinda suspect that the vacuum is
dirtying buffers nearly as fast as the bgwriter can clean them,
leaving not a lot of daylight for a valley.

8.3 should pretty well eliminate the checkpoint spike in this scenario,
because vacuum will work in a limited number of shared buffers instead
of dirtying the whole cache.  But you'll still see 2X writes over reads.

If this is data that you could re-generate at need, it might make sense
to turn off full_page_writes during the initial data load and vacuum.

I concur with trying to FREEZE all the data while you do this, else
you'll see the same work done whenever the data happens to slip past
the auto freeze threshold.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: viewing source code
Следующее
От: Tom Lane
Дата:
Сообщение: Re: explanation for seeks in VACUUM