Re: finding changed blocks using WAL scanning

From: Robert Haas
Subject: Re: finding changed blocks using WAL scanning
Date:
Msg-id: CA+TgmobvLUuu75QQQSsAe=+beB_GBQm1faY96iyqSBPeokp9EQ@mail.gmail.com
In reply to: finding changed blocks using WAL scanning  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: finding changed blocks using WAL scanning  (Bruce Momjian <bruce@momjian.us>)
List: pgsql-hackers
On Wed, Apr 10, 2019 at 5:49 PM Robert Haas <robertmhaas@gmail.com> wrote:
> There is one thing that does worry me about the file-per-LSN-range
> approach, and that is memory consumption when trying to consume the
> information.  Suppose you have a really high velocity system.  I don't
> know exactly what the busiest systems around are doing in terms of
> data churn these days, but let's say just for kicks that we are
> dirtying 100GB/hour.  That means, roughly 12.5 million block
> references per hour.  If each block reference takes 12 bytes, that's
> maybe 150MB/hour in block reference files.  If you run a daily
> incremental backup, you've got to load all the block references for
> the last 24 hours and deduplicate them, which means you're going to
> need about 3.6GB of memory.  If you run a weekly incremental backup,
> you're going to need about 25GB of memory.  That is not ideal.  One
> can keep the memory consumption to a more reasonable level by using
> temporary files.  For instance, say you realize you're going to need
> 25GB of memory to store all the block references you have, but you
> only have 1GB of memory that you're allowed to use.  Well, just
> hash-partition the data 32 ways by dboid/tsoid/relfilenode/segno,
> writing each batch to a separate temporary file, and then process each
> of those 32 files separately.  That does add some additional I/O, but
> it's not crazily complicated and doesn't seem too terrible, at least
> to me.  Still, it's something not to like.
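
(For concreteness, a hash-partitioning pass along those lines might look something like the sketch below.  The BlockRef layout and the hash are just made up for illustration -- nothing like this exists in the tree, and the record shown here is wider than the 12-byte estimate above.)

#include <stdio.h>
#include <stdint.h>

#define NUM_PARTITIONS 32

/* Hypothetical on-disk record identifying one modified block. */
typedef struct BlockRef
{
    uint32_t    dboid;
    uint32_t    tsoid;
    uint32_t    relfilenode;
    uint32_t    segno;
    uint32_t    blkno;
} BlockRef;

/* Hash only the relation-identifying fields, so that all references to
 * the same relation segment land in the same partition. */
static unsigned int
blockref_partition(const BlockRef *ref)
{
    uint64_t    h = 0;

    h = h * 31 + ref->dboid;
    h = h * 31 + ref->tsoid;
    h = h * 31 + ref->relfilenode;
    h = h * 31 + ref->segno;
    return (unsigned int) (h % NUM_PARTITIONS);
}

/* Scatter incoming block references across NUM_PARTITIONS temporary files.
 * Each partition can then be loaded and deduplicated on its own, so peak
 * memory drops to roughly 1/NUM_PARTITIONS of the total. */
void
partition_block_refs(FILE *in, FILE *out[NUM_PARTITIONS])
{
    BlockRef    ref;

    while (fread(&ref, sizeof(ref), 1, in) == 1)
        fwrite(&ref, sizeof(ref), 1, out[blockref_partition(&ref)]);
}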

Oh, I'm being dumb.  We should just have the process that writes out
these files sort the records first.  Then when we read them back in to
use them, we can just do a merge pass like MergeAppend would do.  Then
you never need very much memory at all.
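
Roughly like the sketch below: because each file is written in sorted order, the read side only needs a k-way merge that keeps one record per input file and emits each distinct block reference once.  BlockRef and the comparator are the same illustrative assumptions as in the earlier sketch, not existing code.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>

/* Same illustrative BlockRef as in the earlier sketch. */
typedef struct BlockRef
{
    uint32_t    dboid;
    uint32_t    tsoid;
    uint32_t    relfilenode;
    uint32_t    segno;
    uint32_t    blkno;
} BlockRef;

/* Total order used both when sorting the files at write time and when
 * merging them at read time. */
static int
blockref_cmp(const BlockRef *a, const BlockRef *b)
{
#define CMP_FIELD(f) \
    do { if (a->f != b->f) return (a->f < b->f) ? -1 : 1; } while (0)
    CMP_FIELD(dboid);
    CMP_FIELD(tsoid);
    CMP_FIELD(relfilenode);
    CMP_FIELD(segno);
    CMP_FIELD(blkno);
#undef CMP_FIELD
    return 0;
}

/* Merge nfiles already-sorted block-reference files into "out", writing
 * each distinct block reference exactly once.  Only one record per input
 * file is held in memory, so memory use is O(nfiles) no matter how much
 * WAL was scanned. */
void
merge_block_refs(FILE **files, int nfiles, FILE *out)
{
    BlockRef   *cur = malloc(nfiles * sizeof(BlockRef));
    bool       *valid = malloc(nfiles * sizeof(bool));
    BlockRef    prev;
    bool        have_prev = false;

    for (int i = 0; i < nfiles; i++)
        valid[i] = (fread(&cur[i], sizeof(BlockRef), 1, files[i]) == 1);

    for (;;)
    {
        int         best = -1;

        /* Find the smallest record among the current heads of all files. */
        for (int i = 0; i < nfiles; i++)
            if (valid[i] && (best < 0 || blockref_cmp(&cur[i], &cur[best]) < 0))
                best = i;
        if (best < 0)
            break;              /* every input is exhausted */

        /* Emit it unless it duplicates the record emitted just before. */
        if (!have_prev || blockref_cmp(&cur[best], &prev) != 0)
        {
            fwrite(&cur[best], sizeof(cur[best]), 1, out);
            prev = cur[best];
            have_prev = true;
        }

        /* Refill from the file we just consumed. */
        valid[best] = (fread(&cur[best], sizeof(BlockRef), 1, files[best]) == 1);
    }

    free(cur);
    free(valid);
}

A binary heap would make each selection O(log n) rather than a linear scan over the file heads, but for a handful of per-LSN-range files the simple loop seems good enough for a sketch.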

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


