Thread: Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

From
Robert Haas
Date:
On Wed, Dec 6, 2017 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What appears to be happening is that a database-wide ANALYZE takes more
> than a minute under CLOBBER_CACHE_ALWAYS, causing isolationtester.c's
> hardwired one-minute timeout to trigger.
>
> While you could imagine doing something to get around that, I do not
> believe that this test is worth memorializing in perpetuity to begin
> with.  I'd recommend just taking it out again.

Mumble.  I don't really mind that, but I'll bet $0.05 that this will
get broken at some point and we won't notice right away without the
isolation test.

Is it really our policy that no isolation test can take more than a
minute on the slowest buildfarm critter?  If somebody decides to start
running CLOBBER_CACHE_ALWAYS on an even-slower critter, will we just
nuke isolation tests from orbit until the tests pass there?  I have
difficulty seeing that as a sound approach.

Another thought is that it might not be necessary to have a
database-wide ANALYZE to trigger this.  I managed to reproduce it
locally by doing VACUUM a, b while alternately locking a and b, so
that I let the name lookups complete, but then blocked trying to
vacuum a, and then at that point dropped b, then released the VACUUM.
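
The reproduction described above can be sketched as a two-session sequence. This is a simplified illustration under stated assumptions, not the exact commands from the message: the table names a and b come from the text, but the column definitions, the SHARE lock mode, and the use of a single lock on a (rather than alternately locking a and b) are assumptions. The multi-table form of VACUUM requires PostgreSQL 11 or later.

```sql
-- Setup: table names are from the message; the columns are illustrative.
CREATE TABLE a (id int);
CREATE TABLE b (id int);

-- Session 1: hold a lock on "a" that conflicts with the lock VACUUM takes.
BEGIN;
LOCK TABLE a IN SHARE MODE;   -- assumed lock mode

-- Session 2: the name lookups for "a" and "b" complete, then the command
-- blocks waiting for the lock on "a". (VACUUM cannot run inside a
-- transaction block, so this session is in autocommit mode.)
VACUUM a, b;

-- Session 1: drop "b" while session 2 is still waiting, then release the lock.
DROP TABLE b;
COMMIT;

-- Session 2: VACUUM resumes, processes "a", and is expected to skip "b"
-- with a log message instead of failing.
```

Because only two small tables are involved, a sequence like this avoids the cost of a database-wide pass.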

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Dec 6, 2017 at 12:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What appears to be happening is that a database-wide ANALYZE takes more
>> than a minute under CLOBBER_CACHE_ALWAYS, causing isolationtester.c's
>> hardwired one-minute timeout to trigger.

> Is it really our policy that no isolation test can take more than a
> minute on the slowest buildfarm critter?

Well, I think it's a minute per query not per whole test script.  But in
any case, if it's taking a longer time than any other isolation test on
the CLOBBER_CACHE_ALWAYS critters, then it's also taking a proportionately
longer time than any other test on every other platform, and is therefore
costing every developer precious time today and indefinitely far into the
future.  I continue to say that this test ain't worth it.

It's possible that we could compromise on dropping the steps that test
whole-database VACUUM/ANALYZE; the incremental gain from testing those
scenarios is certainly even less worth its cost than the basic cases.

            regards, tom lane


Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

From
Robert Haas
Date:
On Wed, Dec 6, 2017 at 4:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, I think it's a minute per query not per whole test script.  But in
> any case, if it's taking a longer time than any other isolation test on
> the CLOBBER_CACHE_ALWAYS critters, then it's also taking a proportionately
> longer time than any other test on every other platform, and is therefore
> costing every developer precious time today and indefinitely far into the
> future.  I continue to say that this test ain't worth it.

Sure.  But, you also continue to not respond to my arguments about why
it IS worth it.  I don't want to spend a lot of time fighting about
this, but it looks to me like your preferences here are purely
arbitrary.  Yesterday, you added - without discussion - a test that I
had "obviously" left out by accident.  Today, you want a test removed
that I added on purpose but which you assert has insufficient value.
So, sometimes you think it's worth adding tests that make the test
suite longer, and other times you think it isn't.  That's fair enough
-- everyone comes down in different places on this at different times
-- but the only actual reason you've offered is that the script
contains a command that runs for over a minute on very slow machines
that have been artificially slowed down 100x.  That's a silly reason:
it means that on real machines we're talking less than a second of
runtime even without modifying the test case, and if we do modify the
test, it can probably be made much less.

Please give me a little time to see if I can speed up this test enough
to fix this problem.  If that doesn't work out, then we can rip this
out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

From
"Bossart, Nathan"
Date:
On 12/6/17, 8:25 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> Please give me a little time to see if I can speed up this test enough
> to fix this problem.  If that doesn't work out, then we can rip this
> out.

Just in case it got missed earlier, here’s a patch that speeds it
up enough to pass with CLOBBER_CACHE_ALWAYS enabled.  Instead of
doing database-wide operations, it just uses a partitioned table.
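
The partitioned-table idea can be sketched roughly as follows. This is an illustration only, not the contents of the attached patch: the table names (parted, part1, part2), the column, and the SHARE lock mode are all hypothetical. The point is that running VACUUM or ANALYZE on the parent expands to a small, fixed set of partitions, so the command still processes several relations without touching every table in the database.

```sql
-- Hypothetical schema; names are not taken from the attached patch.
CREATE TABLE parted (v int) PARTITION BY LIST (v);
CREATE TABLE part1 PARTITION OF parted FOR VALUES IN (1);
CREATE TABLE part2 PARTITION OF parted FOR VALUES IN (2);

-- Session 1: block processing of the first partition.
BEGIN;
LOCK TABLE part1 IN SHARE MODE;   -- assumed lock mode

-- Session 2: expands to the partitions, then waits on part1.
VACUUM parted;                    -- ANALYZE parted would be the analogous case

-- Session 1: drop the other partition while session 2 waits, then commit.
DROP TABLE part2;
COMMIT;

-- Session 2: resumes, processes part1, and is expected to skip part2
-- because it no longer exists.
```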

Nathan


Attachments

Re: pgsql: When VACUUM or ANALYZE skips a concurrently dropped table, log i

From
Robert Haas
Date:
On Wed, Dec 6, 2017 at 9:42 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/6/17, 8:25 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>> Please give me a little time to see if I can speed up this test enough
>> to fix this problem.  If that doesn't work out, then we can rip this
>> out.
>
> Just in case it got missed earlier, here’s a patch that speeds it
> up enough to pass with CLOBBER_CACHE_ALWAYS enabled.  Instead of
> doing database-wide operations, it just uses a partitioned table.

Yeah, that looks like a reasonable approach to try.  Committed, thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company