Re: progress report for ANALYZE

Поиск
Список
Период
Сортировка
От Tatsuro Yamada
Тема Re: progress report for ANALYZE
Дата
Msg-id 717424c6-0e8c-5995-d96b-a8bcfbbb0336@nttcom.co.jp_1
обсуждение исходный текст
Ответ на Re: progress report for ANALYZE  (Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp>)
Ответы Re: progress report for ANALYZE  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
Hi Amit-san,


>>> I wonder two things below. What do you think?
>>>
>>> 1)
>>> For now, I'm not sure it should be set current_child_table_relid to zero
>>> when the current phase is changed from "acquiring inherited sample rows" to
>>> "computing stats". See <Test result> bellow.
>>
>> In the upthread discussion [1], Robert asked to *not* do such things,
>> that is, resetting some values due to phase change.  I'm not sure his
>> point applies to this case too though.
> 
> Yeah, I understood.
> I'll check target relid of "computing stats" to re-read a code of
> analyze command later. :)


Finally, I understood after investigation of the code. :)
Call stack is the following, and analyze_rel() calls "N + 1" times
for partitioned table and each partitions.

analyze_rel start
  do_analyze_rel inh==true start
   onerel: hoge2
    acq_inh_sample_rows start
     childrel: hoge2_10000
     childrel: hoge2_20000
     childrel: hoge2_30000
     childrel: hoge2_default
    acq_inh_sample_rows end
    compute_stats start
    compute_stats end
    compute_index_stats start
    compute_index_stats end
    finalizing start
    finalizing end
  do_analyze_rel inh==true end
analyze_rel end
...


Also, I checked my test result. ("//" is my comments)


# select oid,relname,relkind from pg_class where relname like 'hoge2%';
   oid  |    relname    | relkind
-------+---------------+---------
  36081 | hoge2         | p
  36084 | hoge2_10000   | r
  36087 | hoge2_20000   | r
  36090 | hoge2_30000   | r
  36093 | hoge2_default | r
(6 rows)

# select relid,
          current_child_table_relid,
          phase,
          sample_blks_total,
          sample_blks_scanned,
          ext_stats_total,
          ext_stats_computed,
          child_tables_total,
          child_tables_done
   from pg_stat_progress_analyze; \watch 0.00001

== for partitioned table hoge2 ==
//hoge2_10000
36081|36084|acquiring inherited sample rows|45|20|0|0|4|0
36081|36084|acquiring inherited sample rows|45|42|0|0|4|0
36081|36084|acquiring inherited sample rows|45|45|0|0|4|0
36081|36084|acquiring inherited sample rows|45|45|0|0|4|0

//hoge2_20000
36081|36087|acquiring inherited sample rows|45|3|0|0|4|1
36081|36087|acquiring inherited sample rows|45|31|0|0|4|1
36081|36087|acquiring inherited sample rows|45|45|0|0|4|1
36081|36087|acquiring inherited sample rows|45|45|0|0|4|1

//hoge2_30000
36081|36090|acquiring inherited sample rows|45|12|0|0|4|2
36081|36090|acquiring inherited sample rows|45|35|0|0|4|2
36081|36090|acquiring inherited sample rows|45|45|0|0|4|2
36081|36090|acquiring inherited sample rows|45|45|0|0|4|2

//hoge2_default
36081|36093|acquiring inherited sample rows|45|18|0|0|4|3
36081|36093|acquiring inherited sample rows|45|38|0|0|4|3
36081|36093|acquiring inherited sample rows|45|45|0|0|4|3
36081|36093|acquiring inherited sample rows|45|45|0|0|4|3

//Below "computing stats" is for the partitioned table hoge,
//therefore the second column from the left side would be
//better to set Zero to easy to understand.
//I guessd that user think which relid is the target of
//"computing stats"?!
//Of course, other option is to write it on document.

36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|computing stats     |45|45|0|0|4|4
36081|36093|finalizing analyze  |45|45|0|0|4|4

== for each partitions such as hoge2_10000 ... hoge2_default ==

//hoge2_10000
36084|0|acquiring sample rows   |45|25|0|0|0|0
36084|0|computing stats         |45|45|0|0|0|0
36084|0|computing extended stats|45|45|0|0|0|0
36084|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_20000
36087|0|acquiring sample rows   |45|14|0|0|0|0
36087|0|computing stats         |45|45|0|0|0|0
36087|0|computing extended stats|45|45|0|0|0|0
36087|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_30000
36090|0|acquiring sample rows   |45|12|0|0|0|0
36090|0|acquiring sample rows   |45|44|0|0|0|0
36090|0|computing extended stats|45|45|0|0|0|0
36090|0|finalizing analyze      |45|45|0|0|0|0

//hoge2_default
36093|0|acquiring sample rows   |45|10|0|0|0|0
36093|0|acquiring sample rows   |45|43|0|0|0|0
36093|0|computing extended stats|45|45|0|0|0|0
36093|0|finalizing analyze      |45|45|0|0|0|0



>>> 2)
>>> There are many "finalizing analyze" phases based on relids in the case
>>> of partitioning tables. Would it better to fix the document? or it
>>> would be better to reduce it to one?
>>>
>>> <Document>
>>> ---------------------------------------------------------
>>>        <entry><literal>finalizing analyze</literal></entry>
>>>        <entry>
>>>          The command is updating pg_class. When this phase is completed,
>>>          <command>ANALYZE</command> will end.
>>> ---------------------------------------------------------
>>
>> When a partitioned table is analyzed, its partitions are analyzed too.
>> So, the ANALYZE command effectively runs N + 1 times if there are N
>> partitions -- first analyze partitioned table to collect "inherited"
>> statistics by collecting row samples using
>> acquire_inherited_sample_rows(), then each partition to collect its
>> own statistics.  Note that this recursive application to ANALYZE to
>> partitions (child tables) only occurs for partitioned tables, not for
>> legacy inheritance.
> 
> Thanks for your explanation.
> I understand Analyzing Partitioned table a little.


It would be better to modify the document of "finalizing analyze" phase.

   # Before modify
    The command is updating pg_class. When this phase is completed,
    <command>ANALYZE</command> will end.

   # Modified
    The command is updating pg_class. When this phase is completed,
    <command>ANALYZE</command> will end. In the case of partitioned table,
    it might be shown on each partitions.

What do you think that? I'm going to fix it, if you agreed. :)

Thanks,
Tatsuro Yamada






В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Removal of support for OpenSSL 0.9.8 and 1.0.0
Следующее
От: Noah Misch
Дата:
Сообщение: Windows UTF-8, non-ICU collation trouble