Re: buildfarm instance bichir stuck

From Thomas Munro
Subject Re: buildfarm instance bichir stuck
Date
Msg-id CA+hUKGK2eubAdK3TjrbRMf=htWegcL-09EPNh5xJrp6ZGSgPTw@mail.gmail.com
In response to Re: buildfarm instance bichir stuck  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Apr 9, 2021 at 6:11 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Apr 7, 2021 at 7:31 PM Robins Tharakan <tharakan@gmail.com> wrote:
> > Correct. This is easily reproducible on this test-instance, so let me know if you want me to test a patch.
>
> From your description it sounds like signals are not arriving at all,
> rather than some more complicated race.  Let's go back to basics...

I was looking into the portability of SIGURG and OOB socket data for
something totally different (hallway track discussion from PGCon,
could we use that for query cancel, like FTP does, instead of opening
another socket?), and lo and behold, someone has figured out a
workaround for this latch problem:

https://github.com/microsoft/WSL/issues/8619

I don't really want to add code to scrape uname() output to detect
different kernels at runtime as shown there, but it doesn't seem to
make a difference on Linux if we just always do what was suggested.  I
didn't look too hard into whether that is the right place to put the
call, or really understand *why* it works, and since I am not a
Windows user and we don't have a WSL1 CI, I can't confirm that it
works or explore whether there is some other ordering of operations
that would be better but still work, but if that does the trick then
maybe we should just do something like the attached.
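
For illustration only, the kind of runtime kernel detection mentioned above
might look roughly like the sketch below. The function name and the
release-string heuristic are assumptions made for this example (WSL1 is
assumed to report a release ending in "-Microsoft"); they are not taken from
the linked issue or from any patch.

```python
import os


def looks_like_wsl1(release: str) -> bool:
    """Guess whether a kernel release string comes from WSL1.

    Assumption (illustrative heuristic, not an official rule): WSL1 reports
    a release such as "4.4.0-19041-Microsoft", while WSL2 runs a real Linux
    kernel whose release contains lowercase "microsoft-standard".
    """
    return release.endswith("-Microsoft")


if __name__ == "__main__":
    # os.uname() wraps the uname() system call (Unix-like systems only).
    release = os.uname().release
    print(release, "-> looks like WSL1?", looks_like_wsl1(release))
```

String matching of this sort is fragile, which is one reason to prefer doing
the same thing unconditionally when it makes no difference on real Linux.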

Thoughts?

Attachments
