Re: How to make ResourceOwnerForgetBuffer() O(1), instead of O(N^2) scale

From: Kouhei Kaigai
Subject: Re: How to make ResourceOwnerForgetBuffer() O(1), instead of O(N^2) scale
Msg-id: 9A28C8860F777E439AA12E8AEA7694F8010477BA@BPXM15GP.gisp.nec.co.jp
In reply to: Re: How to make ResourceOwnerForgetBuffer() O(1), instead of O(N^2) scale (Andres Freund <andres@2ndquadrant.com>)
List: pgsql-hackers

> On 2014-10-03 10:35:42 +0300, Heikki Linnakangas wrote:
> > On 10/03/2014 07:08 AM, Kouhei Kaigai wrote:
> > > Hello,
> > >
> > > I recently got a trouble on development of my extension that
> > > utilizes the shared buffer when it released each buffer page.
> > >
> > > This extension transfers contents of the shared buffers to GPU
> > > device using DMA feature, then kicks a device kernel code.
> >
> > Wow, that sounds crazy.
>
> Agreed. I doubt that pinning that many buffers is a sane thing to do. At
> the very least you'll heavily interfere with vacuum and such.
>
My assumption is that this extension handles OLAP-type workloads,
so there is relatively little write traffic to the database.
Sorry, I forgot to mention that.

> > > Once backend/extension calls ReadBuffer(), resowner.c tracks which
> > > buffer was referenced by the current resource owner, to ensure these
> > > buffers being released at end of the transaction.
> > > However, it seems to me implementation of resowner.c didn't assume
> > > many buffers are referenced by a particular resource owner
> > > simultaneously.
> > > It manages the buffer index using an expandable array, then looks up
> > > the target buffer by sequential walk but from the tail because
> > > recently pinned buffer tends to be released first.
> > > It made a trouble in my case. My extension pinned multiple thousands
> > > buffers, so owner->buffers[] were enlarged and takes expensive cost
> > > to walk on.
> > > In my measurement, ResourceOwnerForgetBuffer() takes 36 seconds in
> > > total during hash-joining 2M rows; even though hash-joining itself
> > > takes less than 16 seconds.
>
> > > What is the best way to solve the problem?
> >
> > How about creating a separate ResourceOwner for these buffer pins, and
> > doing a wholesale ResourceOwnerRelease() on it when you're done?
>
> Or even just unpinning them in reverse order? That should already fix the
> performance issues?
>
When multiple chunks (note: a chunk contains thousands of buffers and is
the unit of device kernel execution) are running asynchronously, the order
in which the GPU jobs complete is not predictable.
So unpinning in reverse order does not help my situation if one resource
owner tracks all the buffers.

I think Heikki's suggestion is to create a separate resource owner per chunk.
In that case, all the buffers in a particular chunk are released at the
same time, so calling ReleaseBuffer() in reverse order makes sense.
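
To illustrate why release order matters, here is a minimal stand-alone
sketch. It is a toy model, not the actual resowner.c code: ToyOwner,
toy_remember(), toy_forget() and toy_cost() are hypothetical names, and
the only behaviour assumed is what is described above (an expandable
array that is searched from the tail when a buffer is forgotten).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for one resource owner's buffer array (not PostgreSQL code). */
typedef struct ToyOwner
{
    int        *buffers;            /* pinned buffer ids, in pin order */
    size_t      nbuffers;           /* number of live entries */
    size_t      capacity;           /* allocated length of buffers[] */
    unsigned long long ncompares;   /* entries examined by toy_forget() */
} ToyOwner;

/* Append a buffer id, doubling the array when it is full. */
static int
toy_remember(ToyOwner *o, int buf)
{
    if (o->nbuffers == o->capacity)
    {
        size_t  newcap = o->capacity ? o->capacity * 2 : 16;
        int    *p = realloc(o->buffers, newcap * sizeof(int));

        if (p == NULL)
            return -1;
        o->buffers = p;
        o->capacity = newcap;
    }
    o->buffers[o->nbuffers++] = buf;
    return 0;
}

/* Search from the tail; on a match, close the gap. Returns 0 if found. */
static int
toy_forget(ToyOwner *o, int buf)
{
    for (size_t i = o->nbuffers; i-- > 0;)
    {
        o->ncompares++;
        if (o->buffers[i] == buf)
        {
            for (size_t j = i; j + 1 < o->nbuffers; j++)
                o->buffers[j] = o->buffers[j + 1];
            o->nbuffers--;
            return 0;
        }
    }
    return -1;
}

/*
 * Pin buffers 0..n-1 in one owner, then forget them all, either in
 * reverse pin order (reverse != 0) or in pin order (reverse == 0).
 * Returns the number of array entries examined while forgetting.
 */
static unsigned long long
toy_cost(int n, int reverse)
{
    ToyOwner    o = {NULL, 0, 0, 0};

    for (int b = 0; b < n; b++)
    {
        if (toy_remember(&o, b) != 0)
        {
            free(o.buffers);
            return 0;
        }
    }
    for (int k = 0; k < n; k++)
        toy_forget(&o, reverse ? n - 1 - k : k);

    assert(o.nbuffers == 0);    /* every pinned buffer was found */
    free(o.buffers);
    return o.ncompares;
}
```

In this model, toy_cost(1000, 1) examines 1000 entries in total (each
lookup hits the tail), while toy_cost(1000, 0) examines 500500. An
unpredictable completion order within a single owner falls between the
two, which is why one owner per chunk, released as a unit, looks
attractive.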

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>



