Re: Changing shared_buffers without restart

From: Dmitry Dolgov
Subject: Re: Changing shared_buffers without restart
Date:
Msg-id: yesbek3hjzlsu4zjp62cyzrtzqutrwvssytsmjjwgpmsciouxq@jgv7kustu3eb
In reply to: Re: Changing shared_buffers without restart  (Tomas Vondra <tomas@vondra.me>)
List: pgsql-hackers
> On Fri, Jul 04, 2025 at 05:23:29PM +0200, Tomas Vondra wrote:
> >> 2) pending GUC changes
> >>
> >> Perhaps this should be a separate utility command, or maybe even just
> >> a new ALTER SYSTEM variant? Or even just a function, similar to what
> >> the "online checksums" patch did, possibly combined with a bgworder
> >> (but probably not needed, there are no db-specific tasks to do).
> >
> > This is one topic we still actively discuss, but haven't had much
> > feedback otherwise. The pros and cons seem to be clear:
> >
> > * Utilizing the existing GUC mechanism would allow treating
> >   shared_buffers as any other configuration, meaning that potential
> >   users of this feature don't have to do anything new to use it -- they
> >   still can use whatever method they prefer to apply new configuration
> >>   (pg_reload_conf, pg_ctl reload, maybe even sending SIGHUP directly).
> >
> >   I'm also wondering if it's only shared_buffers, or some other options
> >   could use similar approach.
> >
>
> I don't know. What are the "potential users" of this feature? I don't
> recall any, but there may be some. How do we know the new pending flag
> will work for them too?

It could be potentially useful for any GUC that controls a resource
shared between backends and requires a restart. To make such a GUC
changeable online, every backend has to perform some action, and the
backends have to coordinate to make sure things stay consistent --
exactly the use case we're trying to address; shared_buffers just
happens to be one such resource. While I agree that the currently
implemented interface is wrong (e.g. it doesn't prevent pending GUCs
from being stored in PG_AUTOCONF_FILENAME, which has to happen only
when the new value is actually applied), it still makes sense to me to
allow a more flexible lifecycle for certain GUCs.
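
To make the pattern more concrete, here is a toy, self-contained
sketch (pthreads, not PostgreSQL code, all names made up) of the
two-phase idea behind a pending GUC: every participant first performs
its local part of the change, and only once all of them are done does
the new value become effective:

    /*
     * Toy illustration (not PostgreSQL code) of the "pending GUC" idea:
     * a new value is first published as pending, every participant
     * applies it locally, and only after all of them have confirmed
     * does it become the effective value.  In the real patch the
     * participants are backends coordinating via shared memory, not
     * pthreads.
     */
    #include <pthread.h>
    #include <stdio.h>

    #define NPARTICIPANTS 4

    static int pending_value = 2048;   /* newly requested setting */
    static int effective_value = 1024; /* currently applied setting */
    static pthread_barrier_t barrier;

    static void *participant(void *arg)
    {
        long id = (long) arg;

        /* phase 1: every participant does its local part of the change */
        printf("participant %ld prepares for value %d\n", id, pending_value);
        pthread_barrier_wait(&barrier);

        /* phase 2: only once everyone is ready the value becomes effective */
        if (id == 0)
            effective_value = pending_value;
        pthread_barrier_wait(&barrier);

        printf("participant %ld sees effective value %d\n", id, effective_value);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NPARTICIPANTS];

        pthread_barrier_init(&barrier, NULL, NPARTICIPANTS);
        for (long i = 0; i < NPARTICIPANTS; i++)
            pthread_create(&threads[i], NULL, participant, (void *) i);
        for (int i = 0; i < NPARTICIPANTS; i++)
            pthread_join(threads[i], NULL);
        pthread_barrier_destroy(&barrier);
        return 0;
    }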

An example I can think of is shared_preload_libraries. If we ever want
to do a hot reload of libraries, it would follow the procedure above:
every backend has to do something like dlclose / dlopen and make sure
that the other backends end up with the same version of the library.
Another, maybe less far-fetched, example is max_worker_processes, which
AFAICT is mostly used to control the number of slots in shared memory
(although it's also stored in the control file, which makes things more
complicated).
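
For the shared_preload_libraries case, the per-backend action would be
roughly what this standalone sketch does for a single process (the
library path is hypothetical; the hard part, getting every backend to
do it consistently, is exactly what the coordination above would be
for):

    /*
     * Hedged sketch of a library "hot reload" within one process:
     * drop the old handle and re-open the file, picking up whatever
     * is on disk now.  The path is hypothetical.
     */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        const char *path = "/tmp/extension.so";   /* hypothetical library */
        void       *handle;

        handle = dlopen(path, RTLD_NOW);
        if (handle == NULL)
        {
            fprintf(stderr, "initial load failed: %s\n", dlerror());
            return 1;
        }

        /* ... use the library ... */

        /* reload: close the old mapping, then open the new file contents */
        if (dlclose(handle) != 0)
        {
            fprintf(stderr, "dlclose failed: %s\n", dlerror());
            return 1;
        }
        handle = dlopen(path, RTLD_NOW);
        if (handle == NULL)
        {
            fprintf(stderr, "reload failed: %s\n", dlerror());
            return 1;
        }

        printf("library reloaded\n");
        dlclose(handle);
        return 0;
    }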

> > * Having a separate utility command is a mighty simplification, which
> >   helps avoiding problems you've described above.
> >
> > So far we've got two against one in favour of simple utility command, so
> > we can as well go with that.
> >
>
> Not sure voting is a good way to make design decisions ...

I'm somewhat torn between those two options myself. The more I think
about this topic, the more I'm convinced that pending GUCs make sense,
but also the more work I see needed to implement them. Maybe a good
middle ground is to go with a simple utility command, as Ashutosh was
suggesting, and keep the pending GUC infrastructure on top of that as
an optional patch.

> >> 3) max_available_memory
> >>
> >> I think the GUC should specify the maximum shared_buffers we want to
> >> allow, and then we'd work out the total to pre-allocate? Considering
> >> we're only allowing to resize shared_buffers, that should be pretty
> >> trivial. Yes, it might happen that the "total limit" happens to exceed
> >> the available memory or something, but we already have the problem
> >> with shared_buffers. Seems fine if we explain this in the docs, and
> >> perhaps print the calculated memory limit on start.
> >
> > Somehow I'm not following what you suggest here. You mean having the
> > maximum shared_buffers specified, but not as a separate GUC?
>
> My suggestion was to have a guc max_shared_buffers. Based on that you
> can easily calculate the size of all other segments dependent on
> NBuffers, and reserve memory for that.

Got it, ok.
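
Just to spell out how I read it, something along these lines, where
everything derived from the maximum NBuffers is computed up front and
reserved (the per-buffer overheads below are illustrative guesses, not
the real PostgreSQL sizes):

    /*
     * Back-of-the-envelope sketch of the max_shared_buffers idea:
     * given the largest NBuffers we want to allow, the size of every
     * NBuffers-dependent segment can be computed and reserved up
     * front.  The per-buffer overheads are made up for illustration.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define BLCKSZ 8192

    int main(void)
    {
        uint64_t max_shared_buffers = 131072;   /* 1 GB of 8 kB pages */

        uint64_t buffer_blocks = max_shared_buffers * BLCKSZ;
        uint64_t buffer_descriptors = max_shared_buffers * 64;  /* assumed */
        uint64_t buffer_iocv = max_shared_buffers * 16;         /* assumed */
        uint64_t checkpoint_slots = max_shared_buffers * 20;    /* assumed */

        uint64_t reservation = buffer_blocks + buffer_descriptors +
                               buffer_iocv + checkpoint_slots;

        printf("reserve %.1f MB for max_shared_buffers = %llu\n",
               reservation / (1024.0 * 1024.0),
               (unsigned long long) max_shared_buffers);
        return 0;
    }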

> >> Moreover, all of the above is for mappings sized based on NBuffers. But
> >> if we allocate 10% for MAIN_SHMEM_SEGMENT, won't that be a problem the
> >> moment someone increases max_connections, max_locks_per_transaction
> >> and possibly some other stuff?
> >
> > Can you elaborate, what do you mean by that? Increasing max_connections,
> > etc. leading to increased memory consumption in the MAIN_SHMEM_SEGMENT,
> > but the ratio is for memory reservation only.
> >
>
> Stuff like PGPROC, fast-path locks etc. are allocated as part of
> MAIN_SHMEM_SEGMENT, right? Yet the ratio assigns 10% of the maximum
> space for that. If I significantly increase GUCs like max_connections or
> max_locks_per_transaction, how do you know it didn't exceed the 10%?

Still don't see the problem. The 10% we're talking about is the
reserved space, so it affects only the shared memory resizing operation
and nothing else. The memory actually allocated is less than or equal
to the reserved size, but it is allocated and managed in exactly the
same way as without the patch, including size calculations. If some
GUCs are increased and drive real memory usage up, that will be handled
as before. Are we on the same page about this?
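
In case it helps, here is a minimal standalone sketch of the
reserve-then-commit distinction I mean (the generic mmap/mprotect
technique, not the patch's actual segment layout or ratios): the
reservation costs no real memory, only the committed prefix does:

    /*
     * Reserve a large range of address space with PROT_NONE (no real
     * memory is committed), then make only the currently needed prefix
     * usable.  Sizes are arbitrary examples.
     */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t reserved_size = 1UL << 30;   /* 1 GB of reserved address space */
        size_t used_size = 256UL << 20;     /* 256 MB actually in use */

        /* reserve: no access rights, so nothing is really allocated */
        char *base = mmap(NULL, reserved_size, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
        {
            perror("mmap");
            return 1;
        }

        /* commit only the part that is needed right now */
        if (mprotect(base, used_size, PROT_READ | PROT_WRITE) != 0)
        {
            perror("mprotect");
            return 1;
        }
        memset(base, 0, used_size);

        printf("reserved %zu MB, using %zu MB\n",
               reserved_size >> 20, used_size >> 20);
        munmap(base, reserved_size);
        return 0;
    }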

> >> 11) Actually, what's the difference between the contents of Mappings
> >>     and Segments? Isn't that the same thing, indexed in the same way?
> >>     Or could it be unified? Or are they conceptually different thing?
> >
> > Unless I'm mixing something badly, the content is the same. The relation
> > is a segment as a structure "contains" a mapping.
> >
> Then, why do we need to track it in two places? Doesn't it just increase
> the likelihood that someone misses updating one of them?

To clarify, by "contents" I mean the shared memory content (the actual
data) behind both the "segment" and the "mapping"; maybe you had
something else in mind.

On the surface of it those are two different data structures with
mostly different, but related, fields: a shared memory segment contains
what is needed for working with the memory (header, base, end, lock),
while the mapping holds more lower-level details, e.g. reserved space,
fd, IPC key. The only common fields are size and address; maybe I can
factor those out to avoid the repetition.
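
Roughly like this, with made-up field and type names (the real
structures in the patch differ in detail):

    #include <stddef.h>

    typedef struct ShmemRegion      /* the common part: where and how big */
    {
        void   *address;
        size_t  size;
    } ShmemRegion;

    typedef struct ShmemMapping     /* lower-level details of the mapping */
    {
        ShmemRegion region;
        size_t      reserved_size;  /* address space reserved beyond size */
        int         fd;             /* backing file descriptor, if any */
        int         ipc_key;        /* SysV key, if that API is used */
    } ShmemMapping;

    typedef struct ShmemSegment     /* what the rest of the code works with */
    {
        ShmemRegion   region;
        void         *header;
        void         *base;
        void         *end;
        void         *lock;         /* placeholder for the segment's lock */
        ShmemMapping *mapping;      /* the mapping backing this segment */
    } ShmemSegment;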

> >> 2) In fact, what happens if the user tries to resize to a value that is
> >>    too large for one of the segments? How would the system know before
> >>    starting the resize (and failing)?
> >
> > This type of situation is handled (doing hard stop) in the latest
> > version, because all the necessary information is present in the mapping
> > structure.
> >
> I don't know, but crashing the instance (I assume that's what you mean
> by hard stop) does not seem like something we want to do. AFAIK the GUC
> hook should be able to determine if the value is too large, and reject
> it at that point. Not proceed and crash everything.

I see, you're pointing out that it would be good to have more validation
at the GUC level, right?
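
If so, I'd imagine something like the following standalone sketch of
the check itself (the limit and the function name are hypothetical):
reject the value up front instead of starting the resize and failing
midway.

    /*
     * Sketch of the kind of validation that could live in a GUC hook:
     * refuse a shared_buffers value that would not fit into the
     * reserved space.  Names and the limit are made up.
     */
    #include <stdbool.h>
    #include <stdio.h>

    static const long max_shared_buffers = 131072;  /* assumed reserved maximum */

    static bool check_new_shared_buffers(long newval, const char **reason)
    {
        if (newval > max_shared_buffers)
        {
            *reason = "value exceeds the reserved maximum";
            return false;
        }
        return true;
    }

    int main(void)
    {
        const char *reason = NULL;

        if (!check_new_shared_buffers(262144, &reason))
            printf("rejected: %s\n", reason);
        else
            printf("accepted\n");
        return 0;
    }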


