Re: [HACKERS] CLUSTER command progress monitor

From: Tatsuro Yamada
Subject: Re: [HACKERS] CLUSTER command progress monitor
Msg-id: e73dc8f7-1bb8-efb2-0c20-45f1b7d995d9@lab.ntt.co.jp
In reply to: Re: [HACKERS] CLUSTER command progress monitor  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [HACKERS] CLUSTER command progress monitor
           Re: [HACKERS] CLUSTER command progress monitor
List: pgsql-hackers

Hi Robert!

On 2019/03/05 11:35, Robert Haas wrote:
> On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
> <yamada.tatsuro@lab.ntt.co.jp> wrote:
>> === Current design ===
>>
>> CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
>> Depending on which one is chosen, the command will proceed in the
>> following sequence of phases:
>>
>>     * Scan method: Seq Scan
>>       0. initializing                 (*2)
>>       1. seq scanning heap            (*1)
>>       3. sorting tuples               (*2)
>>       4. writing new heap             (*1)
>>       5. swapping relation files      (*2)
>>       6. rebuilding index             (*2)
>>       7. performing final cleanup     (*2)
>>
>>     * Scan method: Index Scan
>>       0. initializing                 (*2)
>>       2. index scanning heap          (*1)
>>       5. swapping relation files      (*2)
>>       6. rebuilding index             (*2)
>>       7. performing final cleanup     (*2)
>>
>> VACUUM FULL command will proceed in the following sequence of phases:
>>
>>       1. seq scanning heap            (*1)
>>       5. swapping relation files      (*2)
>>       6. rebuilding index             (*2)
>>       7. performing final cleanup     (*2)
>>
>> (*1): increasing the value in heap_tuples_scanned column
>> (*2): only shows the phase in the phase column
> 
> All of that sounds good.
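
To make the phase ordering above concrete, here is a minimal sketch in C. The phase numbers and the (*1)/(*2) markers come from the design quoted above; every identifier (ClusterPhase, phase_sequence, and so on) is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Phase numbers as listed in the quoted design; the names are hypothetical. */
typedef enum ClusterPhase
{
    PHASE_INITIALIZING = 0,
    PHASE_SEQ_SCAN_HEAP = 1,
    PHASE_INDEX_SCAN_HEAP = 2,
    PHASE_SORT_TUPLES = 3,
    PHASE_WRITE_NEW_HEAP = 4,
    PHASE_SWAP_REL_FILES = 5,
    PHASE_REBUILD_INDEX = 6,
    PHASE_FINAL_CLEANUP = 7
} ClusterPhase;

/* The three cases described in the design (hypothetical names). */
typedef enum ClusterCommand
{
    CMD_CLUSTER_SEQ_SCAN,
    CMD_CLUSTER_INDEX_SCAN,
    CMD_VACUUM_FULL
} ClusterCommand;

static const ClusterPhase seq_scan_phases[] = {
    PHASE_INITIALIZING, PHASE_SEQ_SCAN_HEAP, PHASE_SORT_TUPLES,
    PHASE_WRITE_NEW_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
    PHASE_FINAL_CLEANUP
};

static const ClusterPhase index_scan_phases[] = {
    PHASE_INITIALIZING, PHASE_INDEX_SCAN_HEAP, PHASE_SWAP_REL_FILES,
    PHASE_REBUILD_INDEX, PHASE_FINAL_CLEANUP
};

static const ClusterPhase vacuum_full_phases[] = {
    PHASE_SEQ_SCAN_HEAP, PHASE_SWAP_REL_FILES, PHASE_REBUILD_INDEX,
    PHASE_FINAL_CLEANUP
};

/* Return the phase sequence for a command and store its length in *n. */
const ClusterPhase *
phase_sequence(ClusterCommand cmd, size_t *n)
{
    assert(n != NULL);
    switch (cmd)
    {
        case CMD_CLUSTER_SEQ_SCAN:
            *n = sizeof(seq_scan_phases) / sizeof(seq_scan_phases[0]);
            return seq_scan_phases;
        case CMD_CLUSTER_INDEX_SCAN:
            *n = sizeof(index_scan_phases) / sizeof(index_scan_phases[0]);
            return index_scan_phases;
        case CMD_VACUUM_FULL:
            *n = sizeof(vacuum_full_phases) / sizeof(vacuum_full_phases[0]);
            return vacuum_full_phases;
    }
    *n = 0;
    return NULL;
}

/*
 * Phases marked (*1) advance heap_tuples_scanned; phases marked (*2)
 * only change the phase column.  Returns 1 for (*1) phases, else 0.
 */
int
phase_counts_tuples(ClusterPhase phase)
{
    return phase == PHASE_SEQ_SCAN_HEAP ||
           phase == PHASE_INDEX_SCAN_HEAP ||
           phase == PHASE_WRITE_NEW_HEAP;
}
```

A monitoring client that knows these sequences can tell which phases remain for a given command, and in which of them heap_tuples_scanned is expected to move.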
> 
>> The view provides the information of CLUSTER command progress details as follows
>> # \d pg_stat_progress_cluster
>>          View "pg_catalog.pg_stat_progress_cluster"
>>         Column        |  Type   | Collation | Nullable | Default
>> ----------------------+---------+-----------+----------+---------
>>  pid                  | integer |           |          |
>>  datid                | oid     |           |          |
>>  datname              | name    |           |          |
>>  relid                | oid     |           |          |
>>  command              | text    |           |          |
>>  phase                | text    |           |          |
>>  cluster_index_relid  | bigint  |           |          |
>>  heap_tuples_scanned  | bigint  |           |          |
>>  heap_tuples_vacuumed | bigint  |           |          |
> 
> Still not sure if we need heap_tuples_vacuumed.  We could try to
> report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
> we're using a Seq Scan.

I have no strong opinion about keeping heap_tuples_vacuumed, so I'll remove it
in the next patch.

Regarding heap_blks_scanned and heap_blks_total, I suppose we can get those
from initscan(). I'll investigate further.

cluster.c
   copy_heap_data()
     heap_beginscan()
       heap_beginscan_internal()
         initscan()
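
As a rough illustration of block-based reporting, here is a self-contained sketch. It assumes the total block count is known when the scan is set up, as suggested above. The slot constants, mock_progress_update_param() and simulate_seq_scan() are hypothetical stand-ins; only the (index, value) argument order mirrors pgstat_progress_update_param().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slot numbers; a real patch would define its own constants. */
#define PROGRESS_HEAP_BLKS_TOTAL    0
#define PROGRESS_HEAP_BLKS_SCANNED  1
#define PROGRESS_NUM_PARAMS         2

/* Stand-in for the backend's progress counters (not the real storage). */
static int64_t progress_params[PROGRESS_NUM_PARAMS];

/* Mock with the same argument order as pgstat_progress_update_param(). */
void
mock_progress_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < PROGRESS_NUM_PARAMS);
    progress_params[index] = val;
}

/* Simulate a sequential scan over nblocks heap blocks, reporting as it goes. */
void
simulate_seq_scan(uint32_t nblocks)
{
    uint32_t    blkno;

    /* The total is reported once, when the scan starts. */
    mock_progress_update_param(PROGRESS_HEAP_BLKS_TOTAL, (int64_t) nblocks);
    mock_progress_update_param(PROGRESS_HEAP_BLKS_SCANNED, 0);

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        /* ... tuples on block blkno would be copied to the new heap here ... */
        mock_progress_update_param(PROGRESS_HEAP_BLKS_SCANNED,
                                   (int64_t) blkno + 1);
    }
}

/* Percentage complete, as a monitoring query could derive it. */
double
scan_percent_done(void)
{
    int64_t     total = progress_params[PROGRESS_HEAP_BLKS_TOTAL];

    if (total == 0)
        return 100.0;           /* an empty relation has nothing left to scan */
    return 100.0 * (double) progress_params[PROGRESS_HEAP_BLKS_SCANNED] /
           (double) total;
}
```

Unlike a tuple count, the two block counters give a fixed denominator, so a percentage can be derived while the Seq Scan is still running.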



>> === Discussion points ===
>>
>>    - Progress counter for "3. sorting tuples" phase
>>       - Should we add pgstat_progress_update_param() in tuplesort.c like a
>>         "trace_sort"?
>>         Thanks to Peter Geoghegan for the useful advice!
> 
> How would we avoid an abstraction violation?

Hmm... What do you mean by an abstraction violation?
If it is difficult to solve, I'd rather not add a progress counter for the
"sorting tuples" phase.
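
One possible reading of the concern (my assumption, not something stated in this thread) is that a general-purpose sort module should not know about CLUSTER-specific progress slots. The sketch below shows one way such coupling could be avoided: the caller passes a callback into the sort code. All names here (ToySortState, SortProgressCallback, cluster_sort_progress) are hypothetical and are not tuplesort.c's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: the sorter reports a count, the caller interprets it. */
typedef void (*SortProgressCallback) (int64_t tuples_sorted, void *arg);

/* Toy stand-in for a generic sort module; it knows nothing about CLUSTER. */
typedef struct ToySortState
{
    SortProgressCallback progress_cb;   /* may be NULL: no reporting */
    void       *progress_arg;           /* opaque pointer owned by the caller */
    int64_t     tuples_sorted;
} ToySortState;

void
toy_sort_init(ToySortState *state, SortProgressCallback cb, void *arg)
{
    assert(state != NULL);
    state->progress_cb = cb;
    state->progress_arg = arg;
    state->tuples_sorted = 0;
}

/* Called once per tuple handed to the sorter. */
void
toy_sort_put_tuple(ToySortState *state)
{
    state->tuples_sorted++;
    if (state->progress_cb != NULL)
        state->progress_cb(state->tuples_sorted, state->progress_arg);
}

/* CLUSTER-side state: only this layer knows which counter to update. */
typedef struct ClusterProgress
{
    int64_t     sort_tuples_done;
} ClusterProgress;

/* CLUSTER-side callback that records the sorter's count. */
void
cluster_sort_progress(int64_t tuples_sorted, void *arg)
{
    ClusterProgress *progress = (ClusterProgress *) arg;

    progress->sort_tuples_done = tuples_sorted;
}
```

With this shape, other users of the sort code pass NULL and pay nothing, while CLUSTER supplies its own callback. Whether this would be acceptable for tuplesort.c is an open question.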


>>    - Progress counter for "6. rebuilding index" phase
>>       - Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
>>         If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
>>         However, I'm not sure whether it is okay or not.
> 
> Doesn't seem unreasonable to me.

I see, I'll add it later.


Regards,
Tatsuro Yamada





