Re: [HACKERS] CLUSTER command progress monitor

From Rafia Sabih
Subject Re: [HACKERS] CLUSTER command progress monitor
Date
Msg-id CA+FpmFeMbXHpOW3oX83OU=76eiE05Kw=qBMWcPPG84LYKsM35g@mail.gmail.com
In response to Re: [HACKERS] CLUSTER command progress monitor  (Tatsuro Yamada <yamada.tatsuro@lab.ntt.co.jp>)
Responses Re: [HACKERS] CLUSTER command progress monitor
List pgsql-hackers
On Fri, 8 Mar 2019 at 09:14, Tatsuro Yamada
<yamada.tatsuro@lab.ntt.co.jp> wrote:
>
> On 2019/03/06 15:38, Tatsuro Yamada wrote:
> > On 2019/03/05 17:56, Tatsuro Yamada wrote:
> >> On 2019/03/05 11:35, Robert Haas wrote:
> >>> On Mon, Mar 4, 2019 at 5:38 AM Tatsuro Yamada
> >>> <yamada.tatsuro@lab.ntt.co.jp> wrote:
> >>>> === Current design ===
> >>>>
> >>>> CLUSTER command uses Index Scan or Seq Scan when scanning the heap.
> >>>> Depending on which one is chosen, the command will proceed in the
> >>>> following sequence of phases:
> >>>>
> >>>>     * Scan method: Seq Scan
> >>>>       0. initializing                 (*2)
> >>>>       1. seq scanning heap            (*1)
> >>>>       3. sorting tuples               (*2)
> >>>>       4. writing new heap             (*1)
> >>>>       5. swapping relation files      (*2)
> >>>>       6. rebuilding index             (*2)
> >>>>       7. performing final cleanup     (*2)
> >>>>
> >>>>     * Scan method: Index Scan
> >>>>       0. initializing                 (*2)
> >>>>       2. index scanning heap          (*1)
> >>>>       5. swapping relation files      (*2)
> >>>>       6. rebuilding index             (*2)
> >>>>       7. performing final cleanup     (*2)
> >>>>
> >>>> VACUUM FULL command will proceed in the following sequence of phases:
> >>>>
> >>>>       1. seq scanning heap            (*1)
> >>>>       5. swapping relation files      (*2)
> >>>>       6. rebuilding index             (*2)
> >>>>       7. performing final cleanup     (*2)
> >>>>
> >>>> (*1): increasing the value in heap_tuples_scanned column
> >>>> (*2): only shows the phase in the phase column
> >>>
> >>> All of that sounds good.
> >>>
> >>>> The view provides CLUSTER command progress details as follows:
> >>>> # \d pg_stat_progress_cluster
> >>>>                 View "pg_catalog.pg_stat_progress_cluster"
> >>>>             Column           |  Type   | Collation | Nullable | Default
> >>>> ---------------------------+---------+-----------+----------+---------
> >>>>    pid                       | integer |           |          |
> >>>>    datid                     | oid     |           |          |
> >>>>    datname                   | name    |           |          |
> >>>>    relid                     | oid     |           |          |
> >>>>    command                   | text    |           |          |
> >>>>    phase                     | text    |           |          |
> >>>>    cluster_index_relid       | bigint  |           |          |
> >>>>    heap_tuples_scanned       | bigint  |           |          |
> >>>>    heap_tuples_vacuumed      | bigint  |           |          |
> >>>
> >>> Still not sure if we need heap_tuples_vacuumed.  We could try to
> >>> report heap_blks_scanned and heap_blks_total like we do for VACUUM, if
> >>> we're using a Seq Scan.
> >>
> >> I have no strong opinion on heap_tuples_vacuumed, so I'll remove it in the
> >> next patch.
> >>
> >> Regarding heap_blks_scanned and heap_blks_total, I suppose they can be
> >> obtained from initscan(). I'll investigate it more.
> >>
> >> cluster.c
> >>    copy_heap_data()
> >>      heap_beginscan()
> >>        heap_beginscan_internal()
> >>          initscan()
> >>
> >>
> >>
> >>>> === Discussion points ===
> >>>>
> >>>>    - Progress counter for "3. sorting tuples" phase
> >>>>       - Should we add pgstat_progress_update_param() in tuplesort.c like a
> >>>>         "trace_sort"?
> >>>>         Thanks to Peter Geoghegan for the useful advice!
> >>>
> >>> How would we avoid an abstraction violation?
> >>
> >> Hmm... What do you mean by an abstraction violation?
> >> If it is difficult to solve, I'd rather not add the progress counter for the sorting tuples phase.
> >>
> >>
> >>>>    - Progress counter for "6. rebuilding index" phase
> >>>>       - Should we add "index_vacuum_count" in the view like a vacuum progress monitor?
> >>>>         If yes, I'll add pgstat_progress_update_param() to reindex_relation() of index.c.
> >>>>         However, I'm not sure whether it is okay or not.
> >>>
> >>> Doesn't seem unreasonable to me.
> >>
> >> I see, I'll add it later.
> >
> >
> > The attached file is a revised WIP patch, including:
> >
> >    - Remove heap_tuples_vacuumed
> >    - Add heap_blks_scanned and heap_blks_total
> >    - Add index_vacuum_count
> >
> > I tried adding the "heap_blks_scanned" and "heap_blks_total" columns and realized that
> > the "heap_tuples_scanned" column is suitable as a counter for both the index-scan and
> > seq-scan methods, because CLUSTER works on a tuple basis.
>
>
> The attached file is the patch rebased on current HEAD.
> I changed the status. :)
>
>
Looks like the patch needs a rebase.
I was on commit fb5806533f9fe0433290d84c9b019399cd69e9c2.

Please find attached the reject file in case you want to have a look.
> Regards,
> Tatsuro Yamada
>
>
>


-- 
Regards,
Rafia Sabih
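
For reference, the phase sequences in the quoted design can be written down
as a small standalone C sketch. This is only an illustration and not code
from the patch: the enum, the ProgressCommand type, and the three functions
are hypothetical names. Only the phase numbers, the phase labels, and the
(*1)/(*2) split are taken from the quoted list.

```c
#include <stddef.h>

/* Phase numbers as listed in the quoted design (names are hypothetical). */
typedef enum ClusterPhase {
    PHASE_INITIALIZING = 0,
    PHASE_SEQ_SCANNING_HEAP = 1,
    PHASE_INDEX_SCANNING_HEAP = 2,
    PHASE_SORTING_TUPLES = 3,
    PHASE_WRITING_NEW_HEAP = 4,
    PHASE_SWAPPING_REL_FILES = 5,
    PHASE_REBUILDING_INDEX = 6,
    PHASE_FINAL_CLEANUP = 7
} ClusterPhase;

/* Hypothetical selector for the three sequences described in the design. */
typedef enum ProgressCommand {
    CLUSTER_SEQ_SCAN,
    CLUSTER_INDEX_SCAN,
    VACUUM_FULL
} ProgressCommand;

#define MAX_PHASES 7

/* Text shown in the "phase" column for each phase. */
const char *phase_name(ClusterPhase p)
{
    switch (p) {
    case PHASE_INITIALIZING:        return "initializing";
    case PHASE_SEQ_SCANNING_HEAP:   return "seq scanning heap";
    case PHASE_INDEX_SCANNING_HEAP: return "index scanning heap";
    case PHASE_SORTING_TUPLES:      return "sorting tuples";
    case PHASE_WRITING_NEW_HEAP:    return "writing new heap";
    case PHASE_SWAPPING_REL_FILES:  return "swapping relation files";
    case PHASE_REBUILDING_INDEX:    return "rebuilding index";
    case PHASE_FINAL_CLEANUP:       return "performing final cleanup";
    }
    return "unknown";
}

/* Nonzero for phases marked (*1), i.e. those that advance heap_tuples_scanned. */
int phase_counts_tuples(ClusterPhase p)
{
    return p == PHASE_SEQ_SCANNING_HEAP ||
           p == PHASE_INDEX_SCANNING_HEAP ||
           p == PHASE_WRITING_NEW_HEAP;
}

/* Copies the phase sequence for the given command into out; returns its length. */
size_t phase_sequence(ProgressCommand cmd, ClusterPhase out[MAX_PHASES])
{
    static const ClusterPhase seq_scan[] = {
        PHASE_INITIALIZING, PHASE_SEQ_SCANNING_HEAP, PHASE_SORTING_TUPLES,
        PHASE_WRITING_NEW_HEAP, PHASE_SWAPPING_REL_FILES,
        PHASE_REBUILDING_INDEX, PHASE_FINAL_CLEANUP
    };
    static const ClusterPhase index_scan[] = {
        PHASE_INITIALIZING, PHASE_INDEX_SCANNING_HEAP,
        PHASE_SWAPPING_REL_FILES, PHASE_REBUILDING_INDEX, PHASE_FINAL_CLEANUP
    };
    static const ClusterPhase vacuum_full[] = {
        PHASE_SEQ_SCANNING_HEAP, PHASE_SWAPPING_REL_FILES,
        PHASE_REBUILDING_INDEX, PHASE_FINAL_CLEANUP
    };
    const ClusterPhase *src = NULL;
    size_t n = 0;
    size_t i;

    switch (cmd) {
    case CLUSTER_SEQ_SCAN:
        src = seq_scan;
        n = sizeof(seq_scan) / sizeof(seq_scan[0]);
        break;
    case CLUSTER_INDEX_SCAN:
        src = index_scan;
        n = sizeof(index_scan) / sizeof(index_scan[0]);
        break;
    case VACUUM_FULL:
        src = vacuum_full;
        n = sizeof(vacuum_full) / sizeof(vacuum_full[0]);
        break;
    }
    for (i = 0; i < n; i++)
        out[i] = src[i];
    return n;
}
```

Calling phase_sequence(CLUSTER_INDEX_SCAN, buf) fills buf with the five
phases of the index-scan path. phase_counts_tuples() tells which of them
would move the counter rather than only change the phase column.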
