Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

Melanie Plageman <melanieplageman@gmail.com> writes:
> When I run it on my machine with some added logging, the space taken
> by dead items is about 330 kB more than maintenance_work_mem (which is
> set to 1 MB). I could roughly double the excess by increasing the
> number of inserted tuples from 400000 to 600000. I'll do this.

So, after about two days in the buildfarm, we have failure reports
from this test on gull, mamba, mereswine, and copperhead.  mamba
is mine, and I was able to reproduce the failure in a manual run.
The problem seems to be that the test simply takes too long and
we hit the default 180-second timeout on one step or another.
I was able to make it pass by dint of

$ export PG_TEST_TIMEOUT_DEFAULT=1800

However, the test then took 908 seconds:

$ time make installcheck PROVE_TESTS=t/043_vacuum_horizon_floor.pl
...
# +++ tap install-check in src/test/recovery +++
t/043_vacuum_horizon_floor.pl .. ok
All tests successful.
Files=1, Tests=3, 908 wallclock secs ( 0.17 usr  0.01 sys + 21.42 cusr 35.03 csys = 56.63 CPU)
Result: PASS
      909.26 real        22.10 user        35.21 sys

This is even slower than the 027_stream_regress.pl test, which
currently takes around 847 seconds on that machine.

mamba, gull, and mereswine are 32-bit machines, which, aside from
being old and slow, suffer an immediate 2x size-of-test penalty:

>> # The TIDStore vacuum uses to store dead items is optimized for its target
>> # system. On a 32-bit system, our example requires twice as many pages with
>> # the same number of dead items per page to fill the TIDStore and trigger a
>> # second round of index vacuuming.
>> my $is_64bit = $node_primary->safe_psql($test_db,
>>     qq[SELECT typbyval FROM pg_type WHERE typname = 'int8';]);
>>
>> my $nrows = $is_64bit eq 't' ? 400000 : 800000;

copperhead is 64-bit but is nonetheless even slower than the
other three, so the fact that it's also timing out isn't
that surprising.

I do not think the answer to this is to nag the respective animal
owners to raise PG_TEST_TIMEOUT_DEFAULT.  IMV this test is simply
not worth the cycles it takes, at least not for these machines.
I'm not sure whether to propose reverting it entirely or just
disabling it on 32-bit hardware.  I don't think we'd lose anything
meaningful in test coverage if we did the latter; but that won't be
enough to make copperhead happy.  I am also suspicious that we'll
get bad news from other very slow animals such as dikkop.
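
To make the "disable it on 32-bit hardware" option concrete, here is a
rough sketch of the decision, keyed on the same int8 typbyval probe the
test already uses.  This is illustration only, not part of any patch:
the function name is hypothetical, and it assumes the probe result
('t' or 'f') has already been fetched, for example with psql.

```shell
# Sketch only: decide whether to run the slow test, given the result of
#   SELECT typbyval FROM pg_type WHERE typname = 'int8';
# which the test itself treats as its "is this a 64-bit system" probe.
# horizon_test_action is a hypothetical name, not something in the tree.
horizon_test_action() {
    case "$1" in
        t) echo run ;;    # int8 passed by value: test treats this as 64-bit
        f) echo skip ;;   # int8 passed by reference: treated as 32-bit
        *) echo "unexpected typbyval value: $1" >&2; return 1 ;;
    esac
}

# Possible usage against a running server (needs psql and a database):
#   horizon_test_action "$(psql -XAtc "SELECT typbyval FROM pg_type WHERE typname = 'int8'")"
```

Inside the TAP script the equivalent would presumably be a Test::More
"plan skip_all" issued before the expensive setup.  Note that a check
like this leaves copperhead, which is 64-bit, still running the test.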

I wonder if there is a less expensive way to trigger the test
situation than brute-forcing things with a large index.
Maybe the injection point infrastructure could help?

            regards, tom lane


