Re: [PoC] Improve dead tuple storage for lazy vacuum

From: Masahiko Sawada
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date:
Msg-id: CAD21AoBY8nxRoYx8JStNfe-sui=rS67M6JBBbad5NwaO1bgLuQ@mail.gmail.com
In reply to: Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <johncnaylorls@gmail.com>)
List: pgsql-hackers
On Thu, Mar 21, 2024 at 3:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Mar 21, 2024 at 12:40 PM John Naylor <johncnaylorls@gmail.com> wrote:
> >
> > On Thu, Mar 21, 2024 at 9:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Mar 20, 2024 at 11:19 PM John Naylor <johncnaylorls@gmail.com> wrote:
> > > > Are they (the blocks, to be precise) really out of order? The
> > > > VALUES statement is ordered, but after insertion the output does
> > > > not come back in that order. I wondered whether this is platform
> > > > independent, but CI and our dev machines haven't failed this
> > > > test, and I haven't looked into what determines the order. It's
> > > > easy enough to hide the blocks if we ever need to, as we do
> > > > elsewhere...
> > >
> > > It seems unnecessary, as such a test is already covered by
> > > test_radixtree. I've changed the query to hide the output blocks.
> >
> > Okay.
> >
> > > The buildfarm has been all-green so far.
> >
> > Great!
> >
> > > I've attached the latest vacuum improvement patch.
> > >
> > > I just remembered that the tidstore still cannot be used for
> > > parallel vacuum with the minimum maintenance_work_mem. Even when
> > > the shared tidstore is empty, it reports a memory usage of 1056768
> > > bytes, slightly above 1MB (1048576 bytes). We need something
> > > discussed on another thread[1] in order to make it work.
> >
> > For exactly this reason, we used to have a clamp on max_bytes when
> > it was internal to tidstore, so that it never reported itself full
> > when first created; I guess that clamp was thrown away when we got
> > rid of the control object in shared memory. Forcing callers to
> > clamp their own limits seems pretty unfriendly, though.
>
> Or we could add a new function to dsa.c that sets the initial and
> max segment sizes (or either one) of an existing DSA area, so that
> TidStoreCreate() can specify them at creation time. In the shared
> TidStore case, all memory required by the shared radix tree is
> allocated in the passed-in DSA area, and the memory usage is the
> total size of the segments allocated in that area, so the user has
> to prepare a DSA area exclusively for the shared tidstore. We can
> therefore expect the DSA area passed to TidStoreCreate() to be
> empty, with its segment sizes still adjustable.
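
For illustration, that dsa.c addition might look something like the
sketch below. This is only a sketch: dsa_set_segment_sizes() does not
exist today, and the init_segment_size/max_segment_size fields in
dsa_area_control are assumed for the example.

/*
 * Hypothetical dsa.c API: adjust the segment-size geometry of an
 * existing (still empty) area before it allocates any segments
 * beyond the initial one.
 */
void
dsa_set_segment_sizes(dsa_area *area, size_t init_segment_size,
                      size_t max_segment_size)
{
    Assert(init_segment_size <= max_segment_size);

    LWLockAcquire(DSA_AREA_LOCK(area), LW_EXCLUSIVE);
    area->control->init_segment_size = init_segment_size; /* assumed field */
    area->control->max_segment_size = max_segment_size;   /* assumed field */
    LWLockRelease(DSA_AREA_LOCK(area));
}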

Yet another idea is for TidStore to create its own DSA area in
TidStoreCreate(). That is, in TidStoreCreate() we create a DSA area
(using dsa_create()) and pass it to RT_CREATE(). We also need a new
API to get that DSA area. The caller (e.g. parallel vacuum) gets the
dsa_handle of the DSA area and stores it in shared memory (e.g. in
PVShared). TidStoreAttach() would then take two arguments: the
dsa_handle for the DSA area and the dsa_pointer for the shared radix
tree. This idea still requires controlling the min/max segment sizes,
since dsa_create() uses 1MB as the initial segment size, but
TidStoreCreate() would be more user-friendly.
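
The API shape I have in mind is roughly the following (the names here
are just for illustration; see the attached patch for the actual
changes):

/* Create a TidStore; in the shared case it creates its own DSA area. */
TidStore *TidStoreCreate(size_t max_bytes, int tranche_id);

/* New accessor: the DSA handle the caller publishes in shared memory. */
dsa_handle TidStoreGetDSAHandle(TidStore *ts);

/*
 * Attach to a shared TidStore, given the DSA area's handle and the
 * dsa_pointer to the shared radix tree.
 */
TidStore *TidStoreAttach(dsa_handle area_handle, dsa_pointer handle);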

I've attached a PoC patch for discussion.
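
From the parallel vacuum side, the flow would then be roughly as below
(the PVShared field names, TidStoreGetHandle(), and the tranche id are
assumed for the example, not taken from the current code):

/* Leader: create the store and publish its handles via PVShared. */
dead_items = TidStoreCreate(vac_work_mem, LWTRANCHE_SHARED_TIDSTORE);
shared->dead_items_dsa_handle = TidStoreGetDSAHandle(dead_items);
shared->dead_items_handle = TidStoreGetHandle(dead_items);

/* Worker: attach using the handles published by the leader. */
dead_items = TidStoreAttach(shared->dead_items_dsa_handle,
                            shared->dead_items_handle);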

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
