Thread: intermittent test failure on Windows

intermittent test failure on Windows

From
Andrew Dunstan
Date:
Bowerbird (Visual Studio 2017 / Windows 10 Pro) just had a failure on
the pg_ctl test:
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-10-21%2011%3A50%3A21>


There was a similar failure 17 days ago.


I surmise that what's happening here is that the test is trying to read
current_logfiles while the server is writing it, so there's a race
condition.


Perhaps what we need to do is have slurp_file sleep a bit and try again
on Windows if it gets EPERM, or else we need to have the pg_ctl test
wait a bit before calling slurp_file. But we have seen occasional
similar failures on other tests in Windows so a more systemic approach
might be better.
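
To make the first option concrete, here is a minimal sketch of the
sleep-and-retry idea. It is written in Python purely for illustration and
is not the slurp_file from the test suite. The function name, the attempt
count and the delay are all hypothetical, and treating EACCES the same as
EPERM is an assumption about how a sharing violation might surface.

```python
import errno
import time


def slurp_file_with_retry(path, attempts=5, delay=0.1):
    """Return the whole contents of path, retrying on permission errors.

    Hypothetical helper: attempts and delay are illustrative defaults,
    not values taken from any real test framework.
    """
    last_error = None
    for _ in range(attempts):
        try:
            with open(path, "r") as f:
                return f.read()
        except OSError as e:
            # Only retry errors that look like a transient access
            # conflict; anything else (e.g. a missing file) is re-raised.
            if e.errno not in (errno.EPERM, errno.EACCES):
                raise
            last_error = e
            time.sleep(delay)
    # Every attempt was denied: surface the last error to the caller.
    raise last_error
```

A bounded retry keeps a truly broken test from hanging forever, but it only
narrows the race window on the reader's side; it does nothing for a writer
that is itself blocked by the reader.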


Thoughts?


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: intermittent test failure on Windows

From
Tom Lane
Date:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> Bowerbird (Visual Studio 2017 / Windows 10 pro) just had a failure on
> the pg_ctl test :
> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-10-21%2011%3A50%3A21>

> I surmise that what's happening here is that the test is trying to read
> current_logfiles while the server is writing it, so there's a race
> condition.

Hmm ... the server tries to replace current_logfiles atomically
with rename(), so this says that rename isn't atomic on Windows,
which we knew already.  Previous discussion (cf. commit d611175e5)
implies that an even worse failure condition is possible: the server
might fail to rename current_logfiles.tmp into place, just because
somebody is trying to read current_logfiles.  Ugh.
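
The write-to-temp-then-rename pattern described above can be sketched as
follows. This is a Python illustration under stated assumptions, not the
server's C code: the function name is hypothetical, and only the ".tmp"
suffix mirrors the current_logfiles.tmp name mentioned above. On POSIX
systems the final rename replaces the target atomically; on Windows the
same call may raise PermissionError if another process has the target
open, which is the failure mode discussed here.

```python
import os


def replace_file_atomically(path, contents):
    """Write contents to path via a temporary file and a rename.

    Hypothetical helper for illustration only.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(contents)
        # Push the data to disk before the rename makes it visible.
        f.flush()
        os.fsync(f.fileno())
    # os.replace overwrites an existing target on both POSIX and Windows,
    # but on Windows it can fail while a reader holds the target open.
    os.replace(tmp_path, path)
```

A reader on a POSIX system sees either the old or the new file, never a
partial one; the sketch makes no such promise for Windows.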

I found a thread about trying to make a really bulletproof rename()
for Windows:

https://www.postgresql.org/message-id/flat/CAPpHfds7dyuGZt%2BPF2GL9qSSVV0OZnjNwqiCPjN7mirDw882tA%40mail.gmail.com

but it looks like we gave up in disgust.

            regards, tom lane



Re: intermittent test failure on Windows

From
Andrew Dunstan
Date:
On 10/21/19 2:58 PM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> Bowerbird (Visual Studio 2017 / Windows 10 pro) just had a failure on
>> the pg_ctl test :
>> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-10-21%2011%3A50%3A21>
>> I surmise that what's happening here is that the test is trying to read
>> current_logfiles while the server is writing it, so there's a race
>> condition.
> Hmm ... the server tries to replace current_logfiles atomically
> with rename(), so this says that rename isn't atomic on Windows,
> which we knew already.  Previous discussion (cf. commit d611175e5)
> implies that an even worse failure condition is possible: the server
> might fail to rename current_logfiles.tmp into place, just because
> somebody is trying to read current_logfiles.  Ugh.
>
> I found a thread about trying to make a really bulletproof rename()
> for Windows:
>
> https://www.postgresql.org/message-id/flat/CAPpHfds7dyuGZt%2BPF2GL9qSSVV0OZnjNwqiCPjN7mirDw882tA%40mail.gmail.com
>
> but it looks like we gave up in disgust.


Yeah. Looks like Alexander revived the discussion with a patch back in
August, though, and it's in the next commitfest.
<https://commitfest.postgresql.org/25/2230/>


cheers


andrew

