On Tue, Sep 18, 2018 at 1:15 AM Chris Travers <chris.travers@adjust.com> wrote:
> On Mon, Sep 17, 2018 at 2:59 PM Oleksii Kliukin <alexk@hintbits.com> wrote:
>> With the patch applied, the posix_fallocate loop terminated right away (because
>> of QueryCancelPending flag set to true) and the backend went through the
>> cleanup, showing an ERROR of cancelling due to the conflict with recovery.
>> Without the patch, it looped indefinitely in the dsm_impl_posix_resize, while
>> the startup process was looping forever, trying to send SIGUSR1.

Thanks for testing!

>> One thing I’m wondering is whether we could do the same by just blocking SIGUSR1
>> for the duration of posix_fallocate?
>
> If we were to do that, I would say we should mask all signals we can mask during the call.
>
> I don't have a problem going down that road instead if people think it is better.

We discussed that when adding posix_fallocate() and decided that
retrying is better:

https://www.postgresql.org/message-id/20170628230458.n5ehizmvhoerr5yq%40alap3.anarazel.de

Here is a patch that I propose to commit and back-patch to 9.4. I
just wrote a suitable commit message, edited the comments lightly and
fixed some whitespace.
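
For anyone skimming the thread, the retry approach discussed above can be
sketched roughly as follows. This is a simplified illustration and not the
patch itself: the flag interrupt_pending and the function resize_with_retry()
are hypothetical stand-ins for the backend's pending-interrupt flags (such as
QueryCancelPending) and the resize routine. It assumes a platform that
provides posix_fallocate().

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for the backend's pending-interrupt flags.
 * A signal handler would set this to request cancellation.
 */
static volatile sig_atomic_t interrupt_pending = 0;

/*
 * Allocate "size" bytes for fd, retrying if a signal interrupts the call.
 *
 * posix_fallocate() returns the error number directly instead of setting
 * errno.  We retry on EINTR, but stop retrying once an interrupt is
 * pending, so that the caller can go on to process the interrupt rather
 * than being re-signalled in a loop forever.
 *
 * Returns 0 on success or an errno value on failure.
 */
static int
resize_with_retry(int fd, off_t size)
{
    int rc;

    do
    {
        rc = posix_fallocate(fd, 0, size);
    } while (rc == EINTR && !interrupt_pending);

    return rc;
}
```

The point of the extra condition is that a caller seeing EINTR with an
interrupt pending can clean up and report the error, instead of restarting a
call that the signal sender keeps interrupting.
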
--
Thomas Munro
http://www.enterprisedb.com