Re: POC: Cleaning up orphaned files using undo logs

From Andres Freund
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 20190816052600.k4kiv52y2eqrmkbd@alap3.anarazel.de
In reply to Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
Hi,

On 2019-08-16 09:44:25 +0530, Dilip Kumar wrote:
> On Wed, Aug 14, 2019 at 2:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Aug 14, 2019 at 12:27 PM Andres Freund <andres@anarazel.de> wrote:
> 
> > >   I think that batch reading should just copy the underlying data into a
> > >   char* buffer. Only the records that currently are being used by
> > >   higher layers should get exploded into an unpacked record. That will
> > >   reduce memory usage quite noticably (and I suspect it also drastically
> > >   reduce the overhead due to a large context with a lot of small
> > >   allocations that then get individually freed).
> >
> > Ok, I got your idea.  I will analyze it further and work on this if
> > there is no problem.
> 
> I think there is one problem: currently, while unpacking an undo
> record, if the record is compressed (i.e. some of the fields do not
> exist in the record), we read those fields from the first record on
> the page.  But if we just memcpy the undo pages into the buffers and
> delay the unpacking until it's needed, it seems we would need to know
> the page boundary, and also the offset of the first complete record
> on the page from which we can get that information (which is
> currently in the undo page header).

I don't understand why that's a problem?
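
If the batch buffer holds verbatim copies of whole undo pages, the page
headers come along with the data, so both the page boundaries and the
first-complete-record offset remain available when a compressed record
eventually gets unpacked. A rough sketch in C; the struct and function
names here are made up for illustration, not the actual patch's
definitions:

    #include "postgres.h"

    /* All names below (UndoBatchBuffer, UndoPageHeaderData,
     * uph_first_record, ...) are hypothetical. */
    #define UNDO_BATCH_PAGE_SIZE BLCKSZ

    typedef struct UndoPageHeaderData
    {
        uint16      uph_first_record;   /* offset of first complete record */
    } UndoPageHeaderData;

    typedef struct UndoBatchBuffer
    {
        char       *data;       /* npages * UNDO_BATCH_PAGE_SIZE raw bytes */
        int         npages;
    } UndoBatchBuffer;

    /* Page boundaries are implicit in the flat buffer. */
    static inline char *
    undo_batch_page(UndoBatchBuffer *batch, int pageno)
    {
        return batch->data + (Size) pageno * UNDO_BATCH_PAGE_SIZE;
    }

    /* The first-complete-record offset can be read back from the copied
     * page header whenever a compressed record needs fields from the
     * first record on its page. */
    static inline uint16
    undo_batch_first_record(UndoBatchBuffer *batch, int pageno)
    {
        UndoPageHeaderData *hdr =
            (UndoPageHeaderData *) undo_batch_page(batch, pageno);

        return hdr->uph_first_record;
    }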


> Even if we leave this issue aside for now, I am not very clear on
> what benefit you see in the approach you are describing compared to
> the way I am doing it now.
> 
> a) Is it the multiple pallocs?  If so, we could allocate the memory
> at once and flatten the undo records into it.  Earlier I was doing
> that, but we need to align each unpacked undo record so that we can
> access it directly, and based on Robert's suggestion I modified it to
> use multiple pallocs.

Part of it.

> b) Is it the memory size problem, i.e. that an unpacked undo record
> will take more memory than the packed record?

Part of it.

> c) Do you think that we will not need to unpack all the records?  I
> think eventually, at the higher level, we will have to unpack all the
> undo records (I understand that it will be one at a time).

Part of it. There's a *huge* difference between having a few hundred to
a thousand unpacked records in memory, each consisting of several
independent allocations, and having one large block containing all the
packed records in a batch, plus a few allocations for the few unpacked
records that need to exist.
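
To make the contrast concrete, here is a minimal sketch of the batch
side; undo_batch_read is a hypothetical helper, not the patch's actual
API:

    #include "postgres.h"

    typedef struct UndoRecordBatch
    {
        char       *packed;     /* one block of raw, still-packed records */
        Size        packed_size;
    } UndoRecordBatch;

    /*
     * One palloc and one memcpy for the whole batch; no per-record
     * allocations happen here.
     */
    static UndoRecordBatch *
    undo_batch_read(const char *pages, Size nbytes)
    {
        UndoRecordBatch *batch = palloc(sizeof(UndoRecordBatch));

        batch->packed = palloc(nbytes);
        memcpy(batch->packed, pages, nbytes);
        batch->packed_size = nbytes;
        return batch;
    }

Only when a higher layer actually looks at a record does it get
exploded into unpacked form, so at any point in time there is the
single packed block plus a handful of unpacked records, instead of
thousands of small allocations.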

There's also d) we don't need separate tiny memory copies while holding
buffer locks etc.
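
For (d), the pattern would look roughly like this (a sketch with a
hypothetical helper name; the locking calls are the normal bufmgr
ones):

    #include "postgres.h"
    #include "storage/bufmgr.h"

    /*
     * While the buffer content lock is held we do exactly one memcpy of
     * the whole page into the batch buffer; unpacking individual
     * records, and any small allocations that requires, happens only
     * after the lock has been released.
     */
    static void
    undo_copy_page_to_batch(Buffer buf, char *batch_dest)
    {
        LockBuffer(buf, BUFFER_LOCK_SHARE);
        memcpy(batch_dest, BufferGetPage(buf), BLCKSZ);
        LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    }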

Greetings,

Andres Freund


