Re: Changing shared_buffers without restart

From: Dmitry Dolgov
Subject: Re: Changing shared_buffers without restart
Msg-id: ep3kodga4ifasm7siiqprks4zzof2ofgorzhmgemuvr6ipua5m@vmrhpphjow67
In reply to: Re: Changing shared_buffers without restart  (Dmitry Dolgov <9erthalion6@gmail.com>)
List: pgsql-hackers
> On Sun, Jul 06, 2025 at 03:21:08PM +0200, Dmitry Dolgov wrote:
> > On Sun, Jul 06, 2025 at 03:01:34PM +0200, Dmitry Dolgov wrote:
> > * This way any backend between the ProcSignalBarriers will be able to
> >   proceed with whatever it's doing, and there is a need to make sure it
> >   will not access buffers that will soon disappear. A suggestion so far
> >   was to get all backends to agree not to allocate any new buffers in the
> >   to-be-truncated range, but accessing already existing buffers that
> >   will soon go away is a problem as well. As far as I can tell there is
> >   no rock solid method to make sure a backend doesn't have a reference
> >   to such a buffer somewhere (this was discussed earlier in the
> >   thread), meaning that either a backend has to wait or buffers have to
> >   be checked every time on access.
>
> And sure enough, after I wrote this I realized there should be no
> such references after the buffer eviction and prohibiting new buffer
> allocation. I still need to check it though, because not only buffers,
> but other shared memory structures (whose number depends on NBuffers)
> will be truncated. But if they are also handled by the eviction,
> then maybe everything is just fine.

Pondering more about this topic, I've realized there was one more
problematic case mentioned by Robert early in the thread, which is
relatively easy to construct:

* When increasing shared buffers from NBuffers_small to NBuffers_large
  it's possible that one backend has already applied NBuffers_large,
  then allocated a buffer B from (NBuffers_small, NBuffers_large] and put
  it into the buffer lookup table.

* In the meantime another backend still has NBuffers_small, but got
  buffer B from the lookup table.
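
The scenario above can be sketched as a single-process toy model. This is
not PostgreSQL code: every name below (ToyLookupTable, ToyBackend,
toy_publish and so on) is hypothetical, buffer ids are 0-based here, and
the real buffer lookup table is a shared hash table rather than a flat
array. The sketch only shows how a backend with a stale NBuffers can be
handed a buffer id outside its own view.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not PostgreSQL code: all names here are hypothetical. */
#define NBUFFERS_SMALL 4
#define NBUFFERS_LARGE 8
#define TOY_NTAGS      16
#define NO_BUFFER      (-1)

/* Stand-in for the shared buffer lookup table: one slot per page tag. */
typedef struct ToyLookupTable {
    int buf_for_tag[TOY_NTAGS];
} ToyLookupTable;

/* Each backend carries its own, possibly stale, idea of NBuffers. */
typedef struct ToyBackend {
    int local_nbuffers;
} ToyBackend;

static void toy_table_init(ToyLookupTable *t)
{
    for (int i = 0; i < TOY_NTAGS; i++)
        t->buf_for_tag[i] = NO_BUFFER;
}

/* A backend publishes buf_id for a tag; the id is valid in its own view. */
static void toy_publish(ToyLookupTable *t, const ToyBackend *b,
                        int tag, int buf_id)
{
    assert(tag >= 0 && tag < TOY_NTAGS);
    assert(buf_id >= 0 && buf_id < b->local_nbuffers);
    t->buf_for_tag[tag] = buf_id;
}

/* True when the id found lies outside the looking-up backend's view. */
static bool toy_lookup_is_out_of_range(const ToyLookupTable *t,
                                       const ToyBackend *b, int tag)
{
    int buf_id = t->buf_for_tag[tag];

    return buf_id != NO_BUFFER && buf_id >= b->local_nbuffers;
}
```

With one backend at NBUFFERS_LARGE publishing buffer 6 and another still
at NBUFFERS_SMALL looking up the same tag, toy_lookup_is_out_of_range()
reports the mismatch for the stale backend only.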

Currently this is addressed by having all backends wait for each other,
but I guess it could just as well be managed via the freelist, so
that only "available" buffers are inserted into the lookup table.
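
A rough illustration of the freelist idea, again as a toy model and not
the patch itself: the ToyPool structure and the toy_pool_* functions are
hypothetical, the "freelist" is reduced to a counter, and there is no
locking or atomics, which a real shared-memory implementation would need.
The point is only that a buffer whose memory exists is not handed out,
and so cannot reach the lookup table, until it is marked available.

```c
#include <assert.h>

/* Toy model, not PostgreSQL code: all names here are hypothetical. */
#define TOY_MAX_BUFFERS 8
#define TOY_NO_BUFFER   (-1)

typedef struct ToyPool {
    int allocated;  /* buffers whose memory exists */
    int available;  /* buffers that may be handed out, <= allocated */
    int next_free;  /* simplistic freelist: next never-used buffer id */
} ToyPool;

static void toy_pool_init(ToyPool *p, int nbuffers)
{
    assert(nbuffers >= 0 && nbuffers <= TOY_MAX_BUFFERS);
    p->allocated = nbuffers;
    p->available = nbuffers;
    p->next_free = 0;
}

/* Step 1: allocate memory for more buffers without exposing them. */
static void toy_pool_grow(ToyPool *p, int new_nbuffers)
{
    assert(new_nbuffers >= p->allocated && new_nbuffers <= TOY_MAX_BUFFERS);
    p->allocated = new_nbuffers;
}

/* Step 2: expose everything allocated so far. */
static void toy_pool_mark_available(ToyPool *p)
{
    p->available = p->allocated;
}

/* Hand out a free buffer id, or TOY_NO_BUFFER if none is available. */
static int toy_pool_get_free(ToyPool *p)
{
    if (p->next_free >= p->available)
        return TOY_NO_BUFFER;
    return p->next_free++;
}
```

Between toy_pool_grow() and toy_pool_mark_available() the new range is
invisible to toy_pool_get_free(), so no caller can obtain an id from it.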

It's probably the only such case, but I can't say that for sure (it's
hard to tell, maybe there are trickier cases with the latest async I/O).
If you folks have other examples that may break, let me know. The
idea behind making everyone wait was to be certain that no similar
but unknown scenarios could damage the resize procedure.

As for other structures, BufferBlocks, BufferDescriptors and
BufferIOCVArray are all indexed by buffer, so making sure shared memory
resizing works for buffers should automatically mean the same for the
rest. But CkptBufferIds is a different case, as it collects buffers to
sync and processes them at a later point in time -- it has to be explicitly
handled when shrinking shared memory, I guess.
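
One possible shape of that explicit handling, sketched under assumptions:
ToyCkptItem and toy_ckpt_prune are hypothetical names, the item is reduced
to a buffer id (0-based), and simply dropping entries is only safe if the
buffers in the truncated range have already been flushed and evicted.
Whether the patch should prune, wait for the checkpointer, or do something
else is not decided here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not PostgreSQL code: all names here are hypothetical. */
typedef struct ToyCkptItem {
    int buf_id;     /* buffer collected earlier for a later sync */
} ToyCkptItem;

/*
 * Drop collected entries that point past the new, smaller buffer count.
 * Compacts the array in place, preserves order, returns the new length.
 */
static size_t toy_ckpt_prune(ToyCkptItem *items, size_t n, int new_nbuffers)
{
    size_t kept = 0;

    assert(new_nbuffers >= 0);
    for (size_t i = 0; i < n; i++)
    {
        if (items[i].buf_id < new_nbuffers)
            items[kept++] = items[i];
    }
    return kept;
}
```

For a collected list of ids {1, 6, 3, 7, 0} and a shrink to 4 buffers,
the entries for 6 and 7 are removed and the remaining ones keep their
order.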

Long story short, in the next version of the patch I'll try to
experiment with a simplified design: a simple function to trigger
resizing, launching a coordinator worker, with backends not waiting for
each other and buffers first allocated and then marked as "available to
use".


