Re: Test to dump and restore objects left behind by regression
From | Ashutosh Bapat
---|---
Subject | Re: Test to dump and restore objects left behind by regression
Date |
Msg-id | CAExHW5vw_KaZrjWSNJx-QHF12D4KCmV=AAii3Zh3RHmY43gesw@mail.gmail.com
In reply to | Re: Test to dump and restore objects left behind by regression (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses | Re: Test to dump and restore objects left behind by regression, Re: Test to dump and restore objects left behind by regression
List | pgsql-hackers
On Fri, Mar 21, 2025 at 11:38 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2025-Mar-21, Ashutosh Bapat wrote:
>
> > I used the same parallelism in pg_restore and pg_dump too. And your
> > numbers seem to be similar to mine; slightly less than 20% slowdown.
> > But is that slowdown acceptable? From the earlier discussions, it
> > seems the answer is No. Haven't heard otherwise.
>
> I don't think we need to see this slowdown in relative terms, the way we
> would discuss a change in the executor. This is not a change that
> would affect user-level stuff in any way. We need to see it in absolute
> terms: in machines similar to mine, the pg_upgrade test would go from
> taking 23s to taking 27s. This is 4s slower, but this isn't an increase
> in total test runtime, because decently run test suites run multiple
> tests in parallel. This is the same that Peter said in [1]. The total
> test runtime change might not be *that* large. I'll take a few numbers
> and report back.

Using -j2 in pg_dump and -j3 in pg_restore does not improve the timing much on my laptop. I have used -j2 for both pg_dump and pg_restore instead of -j3 so as to avoid using more cores when tests are run in parallel.

To reduce the run time further, I tried -1/--single-transaction, but that's not allowed with --create. I also tried --transaction-size=1000, but that doesn't affect the run time of the test. Next I thought of using standard output and input instead of files, but that doesn't help either, since (1) the directory format cannot use them and it's the only format allowing parallelism, and (2) it's slower than using files with --no-sync. I didn't find any other way to reduce the test time. (A rough sketch of the dump/restore invocations being discussed appears after this message.)

Please note that the dumps taken for comparison cannot use -j since they are required to be in "plain" format so that the text-manipulation-based comparison works on them.

One concern I have with the directory format is that the dump is not human-readable. This might make investigating a bug identified by the test a bit more complex. But I guess in such a case the investigator can either use the dumps taken for comparison or change the code to use the plain format for investigation. So it's a price we pay for making the test faster.

Here's the next patchset:
0001 - the same 0001 patch as in the previous set; it includes the test with all formats and also the PG_TEST_EXTRA option.
0002 - removes PG_TEST_EXTRA and tests only one format, --directory, with -j2 and default compression. It should be merged into 0001 before committing. This is a separate patch for now in case we decide to go back to 0001.
0003 - same as 0002 in the previous patch set. It excludes statistics from the comparison, since otherwise the test will fail because of the bug reported at [1]. Ideally we shouldn't commit this patch, so as to test statistics dump and restore, but in case we need the test to pass till the bug is fixed, we should merge this patch into 0001 before committing.

[1] https://www.postgresql.org/message-id/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
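[Editor's sketch, for readers following the flag discussion above: a minimal TAP-style illustration of the dump/restore step being debated, written with the in-tree PostgreSQL::Test helpers. It only shows the shape of the command lines (directory format, -j2, --no-sync, and pg_restore --create); the node names, the "regression" database created here, and the dump path are illustrative assumptions, not the actual patch, which operates on the cluster left behind by the regression run.]

use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

# Illustrative source and destination clusters; the real test reuses the
# cluster left behind by the regression suite instead of creating one.
my $src_node = PostgreSQL::Test::Cluster->new('source');
$src_node->init;
$src_node->start;
$src_node->safe_psql('postgres', 'CREATE DATABASE regression');

my $dst_node = PostgreSQL::Test::Cluster->new('destination');
$dst_node->init;
$dst_node->start;

# Directory-format dumps must target a directory that does not exist yet.
my $dumpdir = PostgreSQL::Test::Utils::tempdir() . '/regression_dump';

# Parallel dump in directory format; --no-sync skips fsyncing the dump
# files and -j2 keeps the extra core usage modest.
command_ok(
	[
		'pg_dump', '--format=directory', '-j2', '--no-sync',
		'--file', $dumpdir,
		'-d', $src_node->connstr('regression'),
	],
	'parallel directory-format dump');

# Parallel restore; --create recreates the dumped database on the
# destination, which is why -1/--single-transaction cannot be combined
# with it.
command_ok(
	[
		'pg_restore', '-j2', '--create',
		'-d', $dst_node->connstr('postgres'),
		$dumpdir,
	],
	'parallel restore from the directory-format dump');

done_testing();

[Whether -j2 is the right trade-off once the whole suite runs in parallel is exactly the question discussed in this thread; the sketch just makes the invocations concrete.]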
Attachments