Re: EINTR in ftruncate()

From: Andres Freund
Subject: Re: EINTR in ftruncate()
Date:
Msg-id: 20220701221722.os4ktbe5pnciqguv@alap3.anarazel.de
In response to: Re: EINTR in ftruncate()  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
Hi,

On 2022-07-02 09:52:33 +1200, Thomas Munro wrote:
> On Sat, Jul 2, 2022 at 9:06 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2022-07-01 13:29:44 -0700, Andres Freund wrote:
> > Chris, do you have any additional details about the machine that led to this
> > change? OS version, whether it might have been swapping, etc?
> >
> > I wonder if what happened is that posix_fallocate() used glibc's fallback
> > implementation because the kernel was old enough to not support fallocate()
> > for tmpfs.  Looks like support for fallocate() for tmpfs was added in 3.5
> > ([1]). So e.g. a rhel 6 wouldn't have had that.
> 
> With a quick test program on my Linux 5.10 kernel I see that an
> SA_RESTART signal handler definitely causes posix_fallocate() to
> return EINTR (can post trivial program).
> 
> A drive-by look at the current/modern kernel source supports this:
> shmem_fallocate returns -EINTR directly (not -ERESTARTSYS, which seems
> to be the Linux-y way to say you want EINTR or restart as
> appropriate?), and it also undoes all partial progress too (not too
> surprising), which would explain why a perfectly timed machine gun
> stream of signals from our recovery conflict system can make an
> fallocate retry loop never terminate, for large enough sizes.

Yea :(

And even if we fix recovery to not douse other processes in signals quite
that badly, there are plenty of other sources of signals that can arrive at a
steady clip. So I think we need to do something to defuse this another way.

Ideas:

1) do the fallocate in smaller chunks, thereby making it much more likely to
   complete between two signal deliveries
2) block signals while calling posix_fallocate(). That won't work for
   everything (e.g. rapid SIGSTOP/SIGCONT), but that's not something we'd send
   ourselves, so whatever.
3) 1+2
4) ?
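
For illustration, option 3 (chunking plus signal blocking) could look roughly
like the C sketch below. It is not a proposed patch: the function name
fallocate_chunked_blocked() and the 1 MB FALLOCATE_CHUNK are hypothetical, and
the code assumes a single-threaded process, since sigprocmask() is unspecified
in multi-threaded programs.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical chunk size, chosen only for this sketch. */
#define FALLOCATE_CHUNK ((off_t) 1024 * 1024)

/*
 * Reserve space for the range [0, size) of fd, one chunk at a time, with all
 * blockable signals masked around each posix_fallocate() call.
 *
 * Returns 0 on success or an errno value, like posix_fallocate() itself.
 */
int
fallocate_chunked_blocked(int fd, off_t size)
{
    sigset_t    all_signals;
    sigset_t    saved_mask;
    off_t       offset = 0;
    int         rc = 0;

    sigfillset(&all_signals);

    while (offset < size)
    {
        off_t       len = size - offset;

        if (len > FALLOCATE_CHUNK)
            len = FALLOCATE_CHUNK;

        /* Idea 2: keep signals away for the duration of one short call. */
        sigprocmask(SIG_BLOCK, &all_signals, &saved_mask);
        rc = posix_fallocate(fd, offset, len);
        /* Restoring the mask lets pending signals be delivered here. */
        sigprocmask(SIG_SETMASK, &saved_mask, NULL);

        if (rc == EINTR)
            continue;           /* retry only this chunk (idea 1) */
        if (rc != 0)
            break;              /* real error, e.g. ENOSPC */

        offset += len;
    }

    return rc;
}
```

Restoring the mask between chunks means a pending signal is delayed by at most
one chunk's worth of work rather than by the whole allocation. SIGKILL and
SIGSTOP cannot be blocked, so the EINTR retry is kept to cover the rapid
SIGSTOP/SIGCONT case mentioned in idea 2; because each retry covers only one
chunk, an interruption discards at most that chunk's progress.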

Greetings,

Andres Freund


