Обсуждение: Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

Поиск
Список
Период
Сортировка

Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Robert Haas
Дата:
On Wed, Nov 12, 2025 at 3:31 PM Nathan Bossart <nathan@postgresql.org> wrote:
> Teach DSM registry to ERROR if attaching to an uninitialized entry.
>
> If DSM entry initialization fails, backends could try to use an
> uninitialized DSM segment, DSA, or dshash table (since the entry is
> still added to the registry).  To fix, keep track of whether
> initialization completed, and ERROR if a backend tries to attach to
> an uninitialized entry.  We could instead retry initialization as
> needed, but that seemed complicated, error prone, and unlikely to
> help most cases.  Furthermore, such problems probably indicate a
> coding error.

Having read the thread that led to this commit, I agree that this is
an improvement, but it still doesn't seem like a great situation to
me. Maybe I'm misunderstanding, but it seems like once the
initialization has failed, there's no way to retry: you can't retry
the initialization on the existing segment, and you can't delete it
and create a new segment. If so, that means your only option for
restoring proper function after an initialization failure is to bounce
the entire server.

Now, perhaps the chances of an initialization failure in practice are
quite low. Maybe a typical initialization function won't do anything
that could fail. But it just seems weird to me to design something
that thinks errors are likely enough that it's worth having some
amount of mechanism to deal with them, but unlikely enough that it's
not worth making that mechanism more robust.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Thu, Nov 20, 2025 at 10:44:31AM -0500, Robert Haas wrote:
> Now, perhaps the chances of an initialization failure in practice are
> quite low. Maybe a typical initialization function won't do anything
> that could fail. But it just seems weird to me to design something
> that thinks errors are likely enough that it's worth having some
> amount of mechanism to deal with them, but unlikely enough that it's
> not worth making that mechanism more robust.

Well, I will attempt to convince you that is indeed the case, although I'm
not sure I will succeed...

ISTM that besides pure coding errors, the most likely reasons for an ERROR
during DSM segment initialization are things like out-of-memory, too many
LWLock tranches registered, etc.  In those cases, do we want every single
backend that tries to attach to the segment to retry and most likely fail
again?  And what if the initialization callback completed halfway before
failing?  Do we expect every extension author to carefully craft their
callbacks to handle that?

On the other hand, if we don't handle ERRORs and someone does run into an
OOM or whatnot, do we want other backends to attach and use the unitialized
segment (leading to things like seg-faults)?  Tracking whether an entry was
initialized seems like cheap insurance against these problems.

Finally, the only report from the field about any of this involves SQLsmith
allocating tons of LWLock tranches via test_dsa, which has been fixed
independently.  So I have no evidence to suggest that we need to make it
more robust.  In fact, without this one arguably contrived report, I
probably wouldn't have done anything at all.

I'll grant you that this is a judgment call.  I tried to find a good
middle ground, but I would be unsurprised to learn that other committers
would have done things differently.

-- 
nathan



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Robert Haas
Дата:
On Thu, Nov 20, 2025 at 11:12 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> ISTM that besides pure coding errors, the most likely reasons for an ERROR
> during DSM segment initialization are things like out-of-memory, too many
> LWLock tranches registered, etc.  In those cases, do we want every single
> backend that tries to attach to the segment to retry and most likely fail
> again?  And what if the initialization callback completed halfway before
> failing?  Do we expect every extension author to carefully craft their
> callbacks to handle that?

Well, I don't want to pretend like I know all the answers here. I do
think it's pretty clear that making it the initialization callback's
job to deal with a partially-failed initialization would be crazy.
What I'd be inclined to do after a failure is tear down the entire
segment. Even if there's no mechanism to retry, you're saving the
memory that the segment would have consumed. And maybe there is a
mechanism to retry. If these initialization callbacks are cheap
enough, maybe having every backend try is not really a big deal --
maybe it's good, insofar as an interactive session would be more
likely to see an error indicating the real cause of failure than a
fake error that basically tells you at some point in the past
something went wrong. If repeated failures are too painful, another
idea could be to just mark the entry as being in a failed state (maybe
with a timestamp) and let the extension decide when it wants to press
the reset button. Or maybe that's all too complicated for too little
utility. I'm not sure. My perception is that this is setting a
significantly lower standard for error tolerance than what we
typically seek to achieve, but sometimes my perceptions are wrong, and
sometimes better options are hard to come by.

> On the other hand, if we don't handle ERRORs and someone does run into an
> OOM or whatnot, do we want other backends to attach and use the unitialized
> segment (leading to things like seg-faults)?  Tracking whether an entry was
> initialized seems like cheap insurance against these problems.

That's fair.

> Finally, the only report from the field about any of this involves SQLsmith
> allocating tons of LWLock tranches via test_dsa, which has been fixed
> independently.  So I have no evidence to suggest that we need to make it
> more robust.  In fact, without this one arguably contrived report, I
> probably wouldn't have done anything at all.

Yeah, I'm not clear who is using this or how, so that makes it hard to judge.

> I'll grant you that this is a judgment call.  I tried to find a good
> middle ground, but I would be unsurprised to learn that other committers
> would have done things differently.

Such is life!

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Thu, Nov 20, 2025 at 01:18:56PM -0500, Robert Haas wrote:
> What I'd be inclined to do after a failure is tear down the entire
> segment. Even if there's no mechanism to retry, you're saving the
> memory that the segment would have consumed.

Unpinning/detaching the segment/DSA/dshash table and deleting the DSM
registry entry in a PG_CATCH block scares me a little, but it might be
doable.

> Or maybe that's all too complicated for too little utility. I'm not sure.

That's ultimately where I landed.

> My perception is that this is setting a significantly lower standard for
> error tolerance than what we typically seek to achieve, but sometimes my
> perceptions are wrong, and sometimes better options are hard to come by.

Another thing that might be subconsciously guiding my decisions here is the
existing behavior when a shmem request/startup hook ERRORs (server startup
fails).  I'd expect DSM registry users to be doing similar things in their
initialization callbacks, and AFAIK this behavior hasn't been a source of
complaints.

-- 
nathan



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Robert Haas
Дата:
On Thu, Nov 20, 2025 at 6:09 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Unpinning/detaching the segment/DSA/dshash table and deleting the DSM
> registry entry in a PG_CATCH block scares me a little, but it might be
> doable.

It seems a bit weird to be doing explicit unpinning in a PG_CATCH
block. Ideally you'd want to postpone the pinning until initialization
has succeeded, so that if you fail before that, transaction cleanup
takes care of it automatically. Alternatively, destroying what existed
before could be deferred until later, when an as-yet-unfailed
transaction stumbles across the tombstone.

> Another thing that might be subconsciously guiding my decisions here is the
> existing behavior when a shmem request/startup hook ERRORs (server startup
> fails).  I'd expect DSM registry users to be doing similar things in their
> initialization callbacks, and AFAIK this behavior hasn't been a source of
> complaints.

What bugs me here is the fact that you have a perfectly good working
server that you can't "salvage". If the server had failed at startup,
that would have sucked, but you would have been forced into correcting
the problem (or giving up on the extension) and restarting, so once
the server gets up and running, it's always in a good state. With this
mechanism, you can get a running server that's stuck in this
failed-initialization state, and there's no way for the DBA to do
anything except by bouncing the entire server. That seems like it
could be really frustrating. Now, if it's the case that resetting this
mechanism wouldn't fundamentally fix anything because the
reinitialization attempt would inevitably fail for the same reason,
then I suppose it's not so bad, but I'm not sure that would always be
true. I just feel like one-way state transitions from good->bad are
undesirable. One way of viewing log levels like ERROR, FATAL, and
PANIC is that they force you to do a reset of the transaction,
session, or entire server to get back to a good state, but here you
just get stuck in the bad one.

Am I worrying too much? Possibly! But as I said to David on another
thread this morning, it's better to worry on pgsql-hackers before any
problem happens than to start worrying after something bad happens in
a customer situation.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Sat, Nov 22, 2025 at 10:25:48AM -0500, Robert Haas wrote:
> On Thu, Nov 20, 2025 at 6:09 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Unpinning/detaching the segment/DSA/dshash table and deleting the DSM
>> registry entry in a PG_CATCH block scares me a little, but it might be
>> doable.
> 
> It seems a bit weird to be doing explicit unpinning in a PG_CATCH
> block. Ideally you'd want to postpone the pinning until initialization
> has succeeded, so that if you fail before that, transaction cleanup
> takes care of it automatically. Alternatively, destroying what existed
> before could be deferred until later, when an as-yet-unfailed
> transaction stumbles across the tombstone.

Oh, yeah, good idea.

> Am I worrying too much? Possibly! But as I said to David on another
> thread this morning, it's better to worry on pgsql-hackers before any
> problem happens than to start worrying after something bad happens in
> a customer situation.

I'll give what you suggested a try.  It seems reasonable enough.

-- 
nathan



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Sat, Nov 22, 2025 at 01:41:21PM -0600, Nathan Bossart wrote:
> I'll give what you suggested a try.  It seems reasonable enough.

Here is a mostly-untested first try.

-- 
nathan

Вложения

Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Mon, Nov 24, 2025 at 12:36:25PM -0600, Nathan Bossart wrote:
> On Sat, Nov 22, 2025 at 01:41:21PM -0600, Nathan Bossart wrote:
>> I'll give what you suggested a try.  It seems reasonable enough.
> 
> Here is a mostly-untested first try.

I think 0002 was more complicated than necessary.  Here's a second try.

-- 
nathan

Вложения

Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Robert Haas
Дата:
On Mon, Nov 24, 2025 at 2:03 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I think 0002 was more complicated than necessary.  Here's a second try.

I haven't done a full review of this and I'm not sure whether you want
me to spend more time on it, but something I notice about this version
is that the PG_CATCH() blocks only contain dshash_delete_entry()
calls. That seems good, because that means that all the other cleanup
is being handled by transaction abort (assuming that the patch isn't
buggy, which I haven't attempted to verify). However, it does make me
wonder if we could also find a way to postpone adding the dshash
entries so that we don't need to do anything in PG_CATCH() blocks at
all. I'm guessing that the reason why that doesn't easily work is
because you're relying on those locks to prevent multiple backends
from doing the same initialization?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pgsql: Teach DSM registry to ERROR if attaching to an uninitialized ent

От
Nathan Bossart
Дата:
On Mon, Nov 24, 2025 at 04:02:18PM -0500, Robert Haas wrote:
> I haven't done a full review of this and I'm not sure whether you want
> me to spend more time on it...

I'd appreciate an eyeball check if you have the time.

> I'm guessing that the reason why that doesn't easily work is
> because you're relying on those locks to prevent multiple backends
> from doing the same initialization?

For GetNamedDSMSegment(), I bet we could avoid needing a PG_CATCH by taking
the DSMRegistryLock exclusively when accessing the registry.  But for
GetNamedDSA() and GetNamedDSHash(), we want to keep the half-initialized
entry so that we don't leak LWLock tranche IDs.  I initially thought that
might be okay, but if every backend is retrying, you could quickly run out
of tranche IDs.

-- 
nathan