Re: Progress reporting for pg_verify_checksums

Поиск
Список
Период
Сортировка
От Michael Banck
Тема Re: Progress reporting for pg_verify_checksums
Дата
Msg-id 1553976461.4884.66.camel@credativ.de
обсуждение исходный текст
Ответ на Re: Progress reporting for pg_verify_checksums  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Progress reporting for pg_verify_checksums  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
Hi,

so we are basically back at the original patch as written by Bernd :)

> +/*
> + * Report current progress status. Parts borrowed from
> + * PostgreSQLs' src/bin/pg_basebackup.c
> + */
> +static void
> +progress_report(bool force)
> +{
> +     int                     percent;
> +     char            total_size_str[32];
> +     char            current_size_str[32];
> +     pg_time_t       now;
> +
> +     if (!showprogress)
> +             return;

The way you've now changed this is that there's a function call into
progress_report() for every block that's being read, even if there is no
progress reporting requested. That looks like a pretty severe
performance problem so I suggest to at least stick with checking
showprogress before calling progress_report() and not the other way
round.

I guess there is still some noticeable performance penalty to pay when
reporting progress, even more so now that you reverted to only doing it
once per second - if we report progress, we presumably call
report_progress() thousands of times and return early 99,9% of the time.

So my vote is in favour of only calling progress_report() every once in
a while - I saw quite a speedup (or removal of slowdown) due to this in
my tests, this was not just some unwarranted microoptimization.

> +     fprintf(stderr, "\r");

I think the isatty() check that was in our versions of the patch is
useful here, did you check how this looks when piping the output to a
file?

> @@ -195,12 +254,23 @@ scan_file(const char *fn, BlockNumber segmentno)
>                                       _("%s: checksums enabled in file
\"%s\"\n"), progname, fn);
>       }
>  
> +     /* Make sure progress is reported at least once per file */
> +     progress_report(false);
> +
>       close(f);
>  }

This hunk is from the performance optimization of calling
progress_report only every 1024 blocks and could be removed if we stick
with calling it for every block anyway (but see above).

> -static void
> -scan_directory(const char *basedir, const char *subdir)
> +/*
> + * Scan the given directory for items which can be checksummed and
> + * operate on each one of them.  If "sizeonly" is true, the size of
> + * all the items which have checksums is computed and returned back

"computed" is maybe a bit strong a word here, how about "added up"?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Enable data checksums by default
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [HACKERS] PATCH: multivariate histograms and MCV lists