Re: stats.sql might fail due to shared buffers also used by parallel tests
От | Alexander Lakhin |
---|---|
Тема | Re: stats.sql might fail due to shared buffers also used by parallel tests |
Дата | |
Msg-id | e9d78260-d401-406b-9e3a-9f939f27359b@gmail.com обсуждение исходный текст |
Ответ на | Re: stats.sql might fail due to shared buffers also used by parallel tests (Shlok Kyal <shlok.kyal.oss@gmail.com>) |
Список | pgsql-hackers |
Hello Shlok, 03.07.2025 09:54, Shlok Kyal wrote: > I have also encountered a similar buildfarm failure [1]. > > | 1/1 + subscription 142 ms FAIL > 1/1 postgresql:regress / regress/regress ERROR 284.85s exit status 1 > > diff --strip-trailing-cr -U3 > c:/build-farm-local/buildroot/HEAD/pgsql/src/test/regress/expected/subscription.out > c:/build-farm-local/buildroot/HEAD/pgsql.build/testrun/regress/regress/results/subscription.out > --- c:/build-farm-local/buildroot/HEAD/pgsql/src/test/regress/expected/subscription.out > 2025-06-28 20:13:02 +0900 > +++ c:/build-farm-local/buildroot/HEAD/pgsql.build/testrun/regress/regress/results/subscription.out > 2025-06-28 20:35:21 +0900 > @@ -70,7 +70,7 @@ > SELECT :'prev_stats_reset' < stats_reset FROM > pg_stat_subscription_stats WHERE subname = 'regress_testsub'; > ?column? > ---------- > - t > + f > (1 row) Thank you for your attention to this! Yes, I also noticed that failure, but could not reproduce it yet. > So according to me, I suspect the following causes > 1) The time difference between 'prev_stats_reset' and current > 'stats_reset' value is less than 1 microseconds. > 'stats_reset' is of type 'timestamp with time zone' and the content of > it is like: '2025-06-30 21:01:07.925253+05:30'. So if the time > difference between 'prev_stats_reset' and current 'stats_reset' is > less than 1 microseconds. The query 'SELECT :'prev_stats_reset' < > stats_reset FROM pg_stat_subscription_stats WHERE subname = > 'regress_testsub'' might return 'false' instead of 'true'. > But I was not able to reproduce such a scenario after multiple > testing. Even in high end machines, it takes at least a few > microseconds. Also there are multiple cases where we did similar > timestamp comparison in 'stats.sql' as well. And, I didn't find any > other failure related to such case. So, I feel this is not possible. Did you try that on Windows (hamerkop is a Windows animal)? IIUC, GetCurrentTimestamp() -> gettimeofday() implemented on Windows via GetSystemTimePreciseAsFileTime(), and it has 100ns resolution, according to [1]. I tried it on a Windows VM and saw the following numbers: SELECT :'prev_stats_reset' < stats_reset, :'prev_stats_reset', stats_reset FROM pg_stat_subscription_stats WHERE subname = 'regress_testsub'; ?column? | ?column? | stats_reset ----------+-------------------------------------+------------------------------------- t | Fri Jul 04 11:46:10.224323 2025 PDT | Fri Jul 04 11:46:10.224975 2025 PDT (1 row) that is, the difference is about 600 ns — if I could make the VM 6x faster or run the test on a bare metal, maybe I'd get the test failed. Although the overall duration of the subscription doesn't differ much: My VM: # parallel group (2 tests): subscription publication ok 164 + publication 806 ms ok 165 + subscription 210 ms vs hamerkop: # parallel group (2 tests): subscription publication ok 164 + publication 2383 ms not ok 165 + subscription 142 ms So probably the failure is caused by something else. > 2) pg_stat_reset_subscription_stats(oid) function did not reset the stats. > We have a shared hash 'pgStatLocal.shared_hash'. If the entry > reference (for the subscription) is not found while executing > 'pg_stat_reset_subscription_stats(oid)'. It may not be able to reset > the stats. Maybe somehow this shared hash is getting dropped.. > Also, it could be failing due to the same reason as Alexander has I don't think 2) is relevant here, because shared buffers shouldn't affect subscription's statistics. [1] https://stackoverflow.com/questions/4444090/measuring-time-with-a-resolution-of-microseconds-in-c Best regards. Alexander
В списке pgsql-hackers по дате отправления: