Thread: Intermittent failure in InstallCheck-C "stat" test

Intermittent failure in InstallCheck-C "stat" test

From
Thomas Munro
Date:
Hi,

Just now, and also once 5-and-a-bit days ago, flaviventris failed like
this, as did filefish 41 days ago[1] (there may be more, I just
checked a random sample of InstallCheck-C failures accessible via the
web interface):

  WHERE relname like 'trunc_stats_test%' order by relname;
       relname      | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup
 -------------------+-----------+-----------+-----------+------------+------------
- trunc_stats_test  |         3 |         0 |         0 |          0 |          0
- trunc_stats_test1 |         4 |         2 |         1 |          1 |          0
- trunc_stats_test2 |         1 |         0 |         0 |          1 |          0
- trunc_stats_test3 |         4 |         0 |         0 |          2 |          2
- trunc_stats_test4 |         2 |         0 |         0 |          0 |          2
+ trunc_stats_test  |         0 |         0 |         0 |          0 |          0
+ trunc_stats_test1 |         0 |         0 |         0 |          0 |          0
+ trunc_stats_test2 |         0 |         0 |         0 |          0 |          0
+ trunc_stats_test3 |         0 |         0 |         0 |          0 |          0
+ trunc_stats_test4 |         0 |         0 |         0 |          0 |          0
 (5 rows)

 SELECT st.seq_scan >= pr.seq_scan + 1,
@@ -180,7 +180,7 @@
  WHERE st.relname='tenk2' AND cl.relname='tenk2';
  ?column? | ?column? | ?column? | ?column?
 ----------+----------+----------+----------
- t        | t        | t        | t
+ f        | f        | f        | f
 (1 row)

 SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
@@ -189,7 +189,7 @@
  WHERE st.relname='tenk2' AND cl.relname='tenk2';
  ?column? | ?column?
 ----------+----------
- t        | t
+ t        | f
 (1 row)

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=filefish&dt=2019-02-23%2009%3A53%3A11

-- 
Thomas Munro
https://enterprisedb.com



Re: Intermittent failure in InstallCheck-C "stat" test

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Just now, and also once 5-and-a-bit days ago, flaviventris failed like
> this, as did filefish 41 days ago[1] (there may be more, I just
> checked a random sample of InstallCheck-C failures accessible via the
> web interface):

This sort of thing has pretty much always happened.  I believe it is
just down to the designed-in unreliability of the current stats collection
mechanism.  We might be able to get rid of it if we go over to
shared-memory stats, but I've yet to look at that patch :-(.  In the
meantime I don't see any reason to think that anything's worse here
than it has been for many years.

            regards, tom lane



Re: Intermittent failure in InstallCheck-C "stat" test

From
Andres Freund
Date:
Hi,

On 2019-04-05 18:19:17 -0400, Tom Lane wrote:
> We might be able to get rid of it if we go over to shared-memory
> stats, but I've yet to look at that patch :-(.

I did a few review cycles on it, and while I believe the concept is
sound, I think it needs a good bit more time to mature. Not
realistically doable for v12.

Greetings,

Andres Freund



Re: Intermittent failure in InstallCheck-C "stat" test

From
Thomas Munro
Date:
On Sat, Apr 6, 2019 at 11:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Just now, and also once 5-and-a-bit days ago, flaviventris failed like
> > this, as did filefish 41 days ago[1] (there may be more, I just
> > checked a random sample of InstallCheck-C failures accessible via the
> > web interface):
>
> This sort of thing has pretty much always happened.  I believe it is
> just down to the designed-in unreliability of the current stats collection
> mechanism.  We might be able to get rid of it if we go over to
> shared-memory stats, but I've yet to look at that patch :-(.  In the
> meantime I don't see any reason to think that anything's worse here
> than it has been for many years.

Does it imply that the kernel dropped a UDP packet to localhost?

-- 
Thomas Munro
https://enterprisedb.com



Re: Intermittent failure in InstallCheck-C "stat" test

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Sat, Apr 6, 2019 at 11:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This sort of thing has pretty much always happened.  I believe it is
>> just down to the designed-in unreliability of the current stats collection
>> mechanism.  We might be able to get rid of it if we go over to
>> shared-memory stats, but I've yet to look at that patch :-(.  In the
>> meantime I don't see any reason to think that anything's worse here
>> than it has been for many years.

> Does it imply that the kernel dropped a UDP packet to localhost?

That's a possible explanation, anyway.  The problem shows up seldom enough
that it's hard to say that conclusively.  So *maybe* there's a bug here
we could actually fix, but again, without any way to repro it, it's hard
to say much.

            regards, tom lane
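

Editorial illustration (not part of the original thread): the UDP hypothesis
raised above can be explored outside PostgreSQL with a small experiment. The
sketch below is only an assumption-laden toy, not the stats collector's code:
it sends datagrams to a loopback socket that nobody reads, then drains the
socket and counts what the kernel actually queued. The function name
count_delivered, the buffer size, and the payload size are all hypothetical
choices made for this example. Whether any datagrams are lost, and how many,
depends on the platform and its socket buffer limits.

```python
import socket
from typing import Tuple


def count_delivered(n_datagrams: int = 2000,
                    payload_size: int = 1024,
                    rcvbuf_bytes: int = 4096) -> Tuple[int, int]:
    """Send datagrams to an unread loopback socket, then drain it.

    Returns (accepted_by_send, actually_received). All parameters are
    illustrative assumptions, not values used by PostgreSQL.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Request a small receive buffer; the kernel may round this up.
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        addr = receiver.getsockname()

        sender.setblocking(False)
        payload = b"x" * payload_size
        accepted = 0
        for _ in range(n_datagrams):
            try:
                sender.sendto(payload, addr)
                accepted += 1
            except OSError:
                # Some platforms report a full buffer to the sender instead
                # of silently discarding the datagram.
                pass

        # Drain whatever the kernel queued for the receiver.
        receiver.setblocking(False)
        received = 0
        while True:
            try:
                receiver.recv(payload_size)
                received += 1
            except BlockingIOError:
                break
        return accepted, received
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    accepted, received = count_delivered()
    print(f"send accepted {accepted} datagrams, receiver got {received}")
```

If the receiver count comes out lower than the number of accepted sends, the
sender got no error for the missing datagrams, which is the kind of silent
loss the question about localhost UDP is asking about. A clean run here says
nothing about the buildfarm failures either way.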