Re: Sometimes the output to the stdout in Windows disappears

From: Alexander Lakhin
Subject: Re: Sometimes the output to the stdout in Windows disappears
Msg-id: ee02eaa2-03f7-74ea-bbdf-3196e506bae3@gmail.com
In reply to: Re: Sometimes the output to the stdout in Windows disappears  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Sometimes the output to the stdout in Windows disappears  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Hello hackers,

13.09.2020 21:37, Tom Lane wrote:
> I happened to try googling for other similar reports, and I found
> a very interesting recent thread here:
>
> https://github.com/nodejs/node/issues/33166
>
> It might not have the same underlying cause, of course, but it sure
> sounds familiar.  If Node.js are really seeing the same effect,
> that would point to an underlying Windows bug rather than anything
> Postgres is doing wrong.
>
> It doesn't look like the Node.js crew got any closer to
> understanding the issue than we have, unfortunately.  They made
> their problem mostly go away by reverting a seemingly-unrelated
> patch.  But I can't help thinking that it's a timing-related bug,
> and that patch was just unlucky enough to change the timing of
> their tests so that they saw the failure frequently.
I've managed to make a simple reproducer. Please look at the patch attached.
There are two things crucial for reproducing the bug:
    ioctlsocket(sock, FIONBIO, &ioctlsocket_ret); // from pgwin32_socket()
and
    WSACleanup();
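
For readers without the attachment, here is a minimal sketch of how the two calls above might sit together in one program. It is not the attached patch: the function name connect_sketch, the printed string, the closesocket() call, and the order of the steps are all assumptions made for illustration. On Windows it needs to be linked with ws2_32; on other platforms the stub simply returns -1.

```c
/* Sketch only, NOT the attached reproducer. */
#include <stdio.h>

#ifdef _WIN32
#include <winsock2.h>

/* Hypothetical helper: returns 0 on success, 1 if a Winsock call fails. */
int
connect_sketch(void)
{
    WSADATA     wsa;
    SOCKET      sock;
    u_long      nonblocking = 1;    /* nonzero enables non-blocking mode */

    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sock == INVALID_SOCKET)
    {
        WSACleanup();
        return 1;
    }

    /* first crucial call, as done in pgwin32_socket() */
    if (ioctlsocket(sock, FIONBIO, &nonblocking) != 0)
    {
        closesocket(sock);
        WSACleanup();
        return 1;
    }

    /* placeholder output that a test script could check for on stdout */
    printf("test\n");
    fflush(stdout);

    /* assumption: the socket is closed explicitly before cleanup */
    closesocket(sock);

    /* second crucial call */
    WSACleanup();
    return 0;
}
#else
/* Winsock is Windows-only; elsewhere the sketch does nothing. */
int
connect_sketch(void)
{
    return -1;
}
#endif
```

A test script would run a program built around such a function many times and check on each run that the expected line actually arrived on stdout.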

I still can't understand what influences the effect. With this reproducer I
get:
vcregress taptest src\test\modules\connect
...
t/000_connect.pl .. # test
#
t/000_connect.pl .. 13346/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 16714/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 26216/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 30077/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 36505/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 43647/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 53070/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 54402/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 55685/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 83193/100000
#   Failed test at t/000_connect.pl line 24.
t/000_connect.pl .. 99992/100000 # Looks like you failed 10 tests of 100000.
t/000_connect.pl .. Dubious, test returned 10 (wstat 2560, 0xa00)
Failed 10/100000 subtests

But in our test farm the pgbench test (from the installcheck-world
suite, which we run using msys) fails on roughly every third run.
Perhaps it depends on I/O load. It seems that searching files or scanning
the disk in parallel increases the probability of the glitch.
I see no solution for this on the Postgres side for now, but this
information about Windows quirks could be useful in case someone
else stumbles upon it too.

Best regards,
Alexander
