Thread: The buildfarm is in a pretty bad way, folks

The buildfarm is in a pretty bad way, folks

From
Tom Lane
Date:
It sure looks like there's been a frantic push to commit stuff that
maybe wasn't quite fully baked.  I'm not terribly on board with that,
because it's likely to be hard to disentangle who broke what.
But in particular, it's clear that partition_prune and
isolation/checksum_cancel are showing big problems.

            regards, tom lane


Re: The buildfarm is in a pretty bad way, folks

From
Andres Freund
Date:
Hi,

On 2018-04-06 16:59:11 -0400, Tom Lane wrote:
> It sure looks like there's been a frantic push to commit stuff that
> maybe wasn't quite fully baked.  I'm not terribly on board with that,
> because it's likely to be hard to disentangle who broke what.
> But in particular, it's clear that partition_prune and
> isolation/checksum_cancel are showing big problems.

While I'm obviously also unhappy about the frantic push to push semi
baked stuff, I'm not sure the two issues you point to above are that
good examples of carelessness. At least the latter seems mostly a pretty
normal portability thing around orderedness?

Greetings,

Andres Freund


Re: The buildfarm is in a pretty bad way, folks

From
Magnus Hagander
Date:


On Fri, Apr 6, 2018 at 10:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It sure looks like there's been a frantic push to commit stuff that
> maybe wasn't quite fully baked.  I'm not terribly on board with that,
> because it's likely to be hard to disentangle who broke what.
> But in particular, it's clear that partition_prune and
> isolation/checksum_cancel are showing big problems.

Daniel is working on investigating the isolationtester thing. See a mail on one of the threads where initial indications were the "atomics with no real atomics" (or whatever you'd call it) were to blame. We could redo that thing without atomics to get rid of that (and possibly should), but it would be good to figure out if it's actually broken first, so that part can get fixed if it is. 

--

Re: The buildfarm is in a pretty bad way, folks

From
Andres Freund
Date:
On 2018-04-06 23:12:19 +0200, Magnus Hagander wrote:
> Daniel is working on investigating the isolationtester thing. See a mail on
> one of the threads where initial indications were the "atomics with no real
> atomics" (or whatever you'd call it) were to blame. We could redo that
> thing without atomics to get rid of that (and possibly should), but it
> would be good to figure out if it's actually broken first, so that part can
> get fixed if it is.

Is that an explanation for
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2018-04-06%2019%3A18%3A11
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lousyjack&dt=2018-04-06%2016%3A03%3A01
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2018-04-06%2015%3A46%3A16
? Those all don't seem to fall under that? Having proper atomics?

Greetings,

Andres Freund


Re: The buildfarm is in a pretty bad way, folks

From
Magnus Hagander
Date:


On Fri, Apr 6, 2018 at 11:19 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2018-04-06 23:12:19 +0200, Magnus Hagander wrote:
>> Daniel is working on investigating the isolationtester thing. See a mail on
>> one of the threads where initial indications were the "atomics with no real
>> atomics" (or whatever you'd call it) were to blame. We could redo that
>> thing without atomics to get rid of that (and possibly should), but it
>> would be good to figure out if it's actually broken first, so that part can
>> get fixed if it is.
>
> Is that an explanation for
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2018-04-06%2019%3A18%3A11
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lousyjack&dt=2018-04-06%2016%3A03%3A01
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2018-04-06%2015%3A46%3A16
> ? Those all don't seem to fall under that? Having proper atomics?

No, sorry, bad wording. Those were the initial indications, but they're not the *only* ones. There is possibly/probably more than one thing.

--

Re: The buildfarm is in a pretty bad way, folks

From
Alvaro Herrera
Date:
Tom Lane wrote:
> It sure looks like there's been a frantic push to commit stuff that
> maybe wasn't quite fully baked.  I'm not terribly on board with that,
> because it's likely to be hard to disentangle who broke what.
> But in particular, it's clear that partition_prune and
> isolation/checksum_cancel are showing big problems.

The partition_prune failure is clearly a minor portability issue which
I'll investigate after I pick up the kids.  From where I sit, if we let
that patch bake any more, it will burn in the oven.

Partition prune also broke the sepgsql test -- I think because one
partition is no longer scanned.  Seems a reasonable thing to me, just
need to update the expected file.  But I'll look closer.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: The buildfarm is in a pretty bad way, folks

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-04-06 16:59:11 -0400, Tom Lane wrote:
>> But in particular, it's clear that partition_prune and
>> isolation/checksum_cancel are showing big problems.

> While I'm obviously also unhappy about the frantic push to push semi
> baked stuff, I'm not sure the two issues you point to above are that
> good examples of carelessness. At least the latter seems mostly a pretty
> normal portability thing around orderedness?

I'm just venting, perhaps, but if there's a good reason for that
to have been left broken for ~24 hours, I don't know what it is.
It's getting in the way of testing other recent commits.

(I'm also not real happy about the amount of time the checksum-xxx
tests consume.)

            regards, tom lane


Re: The buildfarm is in a pretty bad way, folks

From
Magnus Hagander
Date:
On Fri, Apr 6, 2018 at 11:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2018-04-06 16:59:11 -0400, Tom Lane wrote:
>>> But in particular, it's clear that partition_prune and
>>> isolation/checksum_cancel are showing big problems.
>
>> While I'm obviously also unhappy about the frantic push to push semi
>> baked stuff, I'm not sure the two issues you point to above are that
>> good examples of carelessness. At least the latter seems mostly a pretty
>> normal portability thing around orderedness?
>
> I'm just venting, perhaps, but if there's a good reason for that
> to have been left broken for ~24 hours, I don't know what it is.
> It's getting in the way of testing other recent commits.
>
> (I'm also not real happy about the amount of time the checksum-xxx
> tests consume.)

The isolation tester ones, or the regular ones? The regular ones finish in << 30 seconds here, so I'm just wondering whether that actually counts as too time-consuming for this type of test.

--

Re: The buildfarm is in a pretty bad way, folks

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Fri, Apr 6, 2018 at 11:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (I'm also not real happy about the amount of time the checksum-xxx
>> tests consume.)

> The isolation tester ones, or the regular ones? Because the regular ones
> finish in << 30 seconds here, just wondering if that actually counts as too
> time consuming in this type of test?

The isolationtester ones.  Looking at longfin, which while not a speed
demon isn't real slow either, the isolation-check step was taking 2:05
two days ago and now it's at 2:48.   That's a pretty big incremental
jump for one feature.

            regards, tom lane