Re: md.c vs elog.c vs smgrreleaseall() in barrier

Поиск

Список

Период

Сортировка

От	Andres Freund
Тема	Re: md.c vs elog.c vs smgrreleaseall() in barrier
Дата	18 марта 02:52:02
Msg-id	kpopzahilmjtcpeefrsbrtzvwddzbhadsfn3cvdelnpi24gbgq@5ty2xbzoyua3 обсуждение исходный текст
Ответ на	Re: md.c vs elog.c vs smgrreleaseall() in barrier (Thomas Munro <thomas.munro@gmail.com>)
Ответы	Re: md.c vs elog.c vs smgrreleaseall() in barrier Re: md.c vs elog.c vs smgrreleaseall() in barrier
Список	pgsql-hackers

Дерево обсуждения

Hi,

On 2025-03-14 12:57:51 +1300, Thomas Munro wrote:
> On Fri, Mar 14, 2025 at 12:23 PM Andres Freund <andres@anarazel.de> wrote:
> 
> > > 3.  Considering errfinish()'s stated goal, a sort of backstop to help
> > > you cancel looping message-spewing code only, I wonder if we should
> > > just restrict the kinds of interrupts it processes, so that it only
> > > calls CFI() if we're definitely throwing ERROR or FATAL?
> >
> > I'm not even sure it'd be safe to ERROR out in all the relevant places...
> >
> > E.g.
> >                 /* implementation-specific initialization */
> >                 smgrsw[reln->smgr_which].smgr_open(reln);
> >
> >                 /* it is not pinned yet */
> >                 reln->pincount = 0;
> >                 dlist_push_tail(&unpinned_relns, &reln->node);
> >
> > doesn't this mean that ->pincount is uninitialized in case smgr_open() errors
> > out and that we'd loose track of the reln because it wasn't yet added to
> > unpinned_rels?
> 
> Ugh, right.

Patch for that attached.


> > > > If we go for that, I can see an argument for doing that in smgr.c instead of
> > > > md.c, afaict any plausible smgr implementation would run into this, as the
> > > > smgrcloseall() is triggered from the smgr level.
> > >
> > > Seems like maybe not a great idea to assume that future smgrs will be
> > > OK with being non-interruptible, if it can be avoided?
> >
> > I think you'd need a fairly large surgery of smgr.c to make that viable - I
> > rather doubt that smgr.c itself is actually safe against interrupts.
> >
> > I can see some arguable uses of smgr handling interrupts, say an smgr
> > implementation based on a network backed store, but you'd need rather massive
> > changes to actually be sure that smgr.c is called while accepting interrupts -
> > e.g. today sgmrwritev() will largely be called with interrupts held. Plenty
> > reads too.  I doubt that undoing a few HOLD_INTERRUPTS() is a meaningful part
> > of the necessary work.
> 
> Right, exactly the case I was thinking of: if some hypothetical future
> network smgr wants to be able to process *some* kinds of things
> carefully while waiting for the network.  I don't think we want to
> constrain ourselves to NFS-style "pretend it's local and make it
> non-interruptible" without any escape hatches, but you're quite right
> that that's probably a whole research project of its own and we just
> haven't refined the interrupt system enough for that yet, so yeah I
> see how you arrived at that conclusion and it makes sense.

Here's a proposed patch for this. It turns out that the bug might already be
reachable, even without defining FDDEBUG. There's a debug ereport() in
register_dirty_segment() - but it's hard to reach in practice.

I don't really know whether that means we ought to backpatch or not - which
makes me want to be conservative and not backpatch.

Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: md.c vs elog.c vs smgrreleaseall() in barrier

Вложения