Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Дата
Msg-id 20140728231823.GA10014@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: [RFC] Should smgrtruncate() avoid sending sinval message for temp relations  (Andres Freund <andres@2ndquadrant.com>)
Список pgsql-hackers
On 2014-07-26 20:20:05 +0200, Andres Freund wrote:
> On 2014-07-26 13:58:38 -0400, Tom Lane wrote:
>
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > That'd require either renegging on SA_RESTART or
> > > using WaitLatchOrSocket() and nonblocking send/recv.
> >
> > Yeah, I was wondering about using WaitLatchOrSocket for client I/O too.
> > We already have a hook that lets us do the actual recv even when using
> > OpenSSL, and in principle that function could do interrupt-service-like
> > functions if it got kicked off the recv().
>
> I've started playing with this. Looks clearly worthwile.
>
> I think if we do it right we pretty much can get rid of the whole
> prepare_for_client_read() machinery and handle everything via
> ProcessInterrupts(). EnableCatchupInterrupt() et al don't really fill me
> with joy.
>
> I'm not yet entirely sure where the interrupt processing should happen,
> but I guess that'll fall out of the work at some point. The important
> bit imo is to *not* not do anything but return with BIO_set_retry_*()
> from my_sock_read/write(). That then allows us to implement stuff like
> the idle transaction timeout with much fewer problems.
>
> I probably won't finish doing this before leaving on holidays, so nobody
> should hesitate to look themselves if interested. If not, I plan to pick
> this up again.  I think it's a prerequisite to getting rid of the FATAL
> for recovery conflict interrupts which I really would like to do.

I tried to get something reviewable before leaving, but alas, there's
far too many edgecases to consider. A good part of it has to do with my
decision to always operate the underlying socket in nonblocking mode and
do all the writing using latches. I think that's the right decision,
because it allows reliable interruptions everywhere and is easier than
duplicated codepaths. But I realize that's not guaranteed to be well
liked.

a) Latches aren't ready early enough. We need to read/write to the
socket long before MyProc is initialized. To have sane behaviour during
early startup we need to be canceleable there was well.

Right now I'm just busy looping in that case, but that's obviously not
acceptable. My best idea is to have another latch that we can use during
early startup. Say *MyProcLatch. Initially that points to a process
local latch and once initialized it points to MyProc->procLatch.

b) Latches don't support WL_WRITEABLE without WL_READABLE. I've simply
changed the code in both latch implementations to poll for errors in
both cases and set both readable/writeable if an error occurs. That
seems easy enough. Are there any bigger caveats for changing this?

c) There's a couple more or less undocumented
pgwin32_waitforsinglesocket() calls in be-secure.c. Afaics they're
likely partially already not required and definitely not required after
socket handling is baded on latches.

d) prepare_for_client_read(), client_read_ended() are now really quite a
misnomer. Because interrupts interrupt send/recv appropriately and
because we do *not* want to process them inside my_sock_read|write (so
we don't recursively enter openssl to send the FATAL to the client)
they're not used from within ssl anymore. But on a higher level, just
like:
    prepare_for_client_read();
    CHECK_FOR_INTERRUPTS();
    client_read_ended();
Not sure if that's the right abstraction for just a:
    EnableNotifyInterrupt();
    EnableCatchupInterrupt();
    CHECK_FOR_INTERRUPTS();
    DisableNotifyInterrupt();
    DisableCatchupInterrupt();
where the Enable* just set a variable so CHECK_FOR_INTERRUPTS can
process the events if occuring.

Doing it that way allows to throw away *large* chunks of code from
sinval.c and async.c because they don't have to care about doing
dangerous stuff from signal handlers anymore.

But the above needs a new name.


Even though it ends up being a bit more work than I anticipated, the
result still seems like a significant improvement over the current
code.

Will get back to this once I'm back (~10 days).  Unless somebody else
wants to pick this up. I've at attached my *heavily WIP* patch.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: SKIP LOCKED DATA (work in progress)
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: B-Tree support function number 3 (strxfrm() optimization)