Re: [PROPOSAL] VACUUM Progress Checker.

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: [PROPOSAL] VACUUM Progress Checker.
Дата
Msg-id 20160215.202125.83820140.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
Hello,

At Mon, 8 Feb 2016 11:37:17 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56B7FF5D.7030108@lab.ntt.co.jp>
> 
> Hi Vinayak,
> 
> Thanks for updating the patch, a couple of comments:
> 
> On 2016/02/05 17:15, pokurev@pm.nttdata.co.jp wrote:
> > Hello,
> > 
> > Please find attached updated patch.
> >> The point of having pgstat_report_progress_update_counter() is so that 
> >> you can efficiently update a single counter without having to update 
> >> everything, when only one counter has changed.  But here you are 
> >> calling this function a whole bunch of times in a row, which 
> >> completely misses the point - if you are updating all the counters, 
> >> it's more efficient to use an interface that does them all at once 
> >> instead of one at a time.
> > 
> > The pgstat_report_progress_update_counter() is called at appropriate places in the attached patch.
> 
> +     char    progress_message[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH];
> 
> [ ... ]
> 
> +     snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase1);
> +     pgstat_report_progress_update_message(0, progress_message);
> 
> [ ... ]
> 
> +                     snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> +                     pgstat_report_progress_update_message(0, progress_message);
> 
> Instead of passing the array of char *'s, why not just pass a single char
> *, because that's what it's doing - updating a single message. So,
> something like:
> 
> + char progress_message[PROGRESS_MESSAGE_LENGTH];
> 
> [ ... ]
> 
> + snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase1);
> + pgstat_report_progress_update_message(0, progress_message);
> 
> [ ... ]
> 
> + snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> + pgstat_report_progress_update_message(0, progress_message);
> 
> And also:
> 
> +/*-----------
> + * pgstat_report_progress_update_message()-
> + *
> + *Called to update phase of VACUUM progress
> + *-----------
> + */
> +void
> +pgstat_report_progress_update_message(int index, char *msg)
> +{
> 
> [ ... ]
> 
> +     pgstat_increment_changecount_before(beentry);
> +     strncpy((char *)beentry->st_progress_message[index], msg,
> PROGRESS_MESSAGE_LENGTH);
> +     pgstat_increment_changecount_after(beentry);


As I might have written upthread, transferring the whole string
as a progress message is useless at least in this scenario. Since
they are a set of fixed messages, each of them can be represented
by an identifier, an integer number. I don't see a reason for
sending the whole of a string beyond a backend.

Next, the function pg_stat_get_command_progress() has a somewhat
generic name, but it seems to reuturn the data only for the
backends with beentry->st_command = COMMAND_LAZY_VACUUM and has
the column names specific for vucuum like process. If the
function is intended to be generic, it might be better to return
a set of integer[] for given type. Otherwise it should have a
name represents its objective.

CREATE FUNCTIONpg_stat_get_command_progress(IN cmdtype integer)RETURNS SETOF integer[] as $$....

SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x        x
---------------------_
{1233, 16233, 1, ....}
{3244, 16236, 2, ....}
....

CREATE VIEW pg_stat_vacuum_progress AS SELECT S.s[1] as pid,        S.s[2] as relid,        CASE S.s[3]           WHEN
1THEN 'Scanning Heap'          WHEN 2 THEN 'Vacuuming Index and Heap'          ELSE 'Unknown phase'        END,  ....
FROMpg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
 

# The name of the function could be other than *_command_progress.

Any thoughts or opinions?


> One more comment:
> 
> @@ -1120,14 +1157,23 @@ lazy_scan_heap(Relation onerel, LVRelStats
> *vacrelstats,
>               /* Log cleanup info before we touch indexes */
>               vacuum_log_cleanup_info(onerel, vacrelstats);
> 
> +             snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> +             pgstat_report_progress_update_message(0, progress_message);
>               /* Remove index entries */
>               for (i = 0; i < nindexes; i++)
> +             {
>                       lazy_vacuum_index(Irel[i],
>                                                         &indstats[i],
>                                                         vacrelstats);
> +                     scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> +                     /* Update the scanned index pages and number of index scan */
> +                     pgstat_report_progress_update_counter(3, scanned_index_pages);
> +                     pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans
> + 1);
> +             }
>               /* Remove tuples from heap */
>               lazy_vacuum_heap(onerel, vacrelstats);
>               vacrelstats->num_index_scans++;
> +             scanned_index_pages = 0;
> 
> I guess num_index_scans could better be reported after all the indexes are
> done, that is, after the for loop ends.

Precise reporting would be valuable if vacuuming indexes takes a
long time. It seems to me to be fine as it is since updating of
stat counters wouldn't add any significant overhead.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: innocuous: pgbench does FD_ISSET on invalid socket
Следующее
От: Fujii Masao
Дата:
Сообщение: Re: Incorrect formula for SysV IPC parameters