Re: monitoring CREATE INDEX [CONCURRENTLY]

Поиск
Список
Период
Сортировка
От Rahila Syed
Тема Re: monitoring CREATE INDEX [CONCURRENTLY]
Дата
Msg-id CAOajBXRSqA4hDDJRov+gmYxJNocULphv-PO74JRwQEWGrYcjfA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: monitoring CREATE INDEX [CONCURRENTLY]  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Ответы Re: monitoring CREATE INDEX [CONCURRENTLY]  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: monitoring CREATE INDEX [CONCURRENTLY]  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
Hi,

On Mon, 1 Apr 2019 at 21:40, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hi Rahila, thanks for reviewing.

On 2019-Mar-25, Rahila Syed wrote:

> Please see few comments below:
>
> 1. Makecheck fails currently as view definition of expected rules.out does
> not reflect latest changes in progress metrics numbering.

Ouch ... fixed.

> 2. +      <entry>
> I think there is a typo here 's/partitioned/partitioned table/'

Ah, so there is.  Fixed.

> 3.
> I think parallel scan equivalent bpscan->phs_nblocks along with
> hscan->rs_nblocks should be used similar to startblock computation above.

Hmm, yeah.  I think the way things are setup currently it doesn't matter
much, but it seems fragile to rely on that.


Thank you for incorporating the review comments.
 
I also moved the reporting of total blocks to scan in the initial table
scan so that it occurs wholly in heapam_index_build_range_scan; I had
originally put that code in _bt_spools_heapscan, but that was a
modularity violation I think.  (It didn't make a practical difference,
but it made no sense to have the two cases report the number in wildly
different locations.)  Also added a final nblocks metric update after
the scan is done.

I think this patch is done.

I tested the v8 patch by running plain CREATE INDEX, CIC, and for partitioned tables
and the results are as expected.  Please see few observations below. 

1.  FWIW, below results for CIC show that blocks_done does not become equal to blocks_total at the end of the phase or it processes last 800 blocks so fast that 
the update is not visible in less than 1 secs interval. 

Mon Mar 25 11:06:31 IST 2019
  pid  | datid | datname  | relid |           phase            | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
 10630 | 13533 | postgres | 16384 | building index: table scan |             0 |            0 |                  0 |      1293366 |     1292535 |            0 |           0 |                0 |               0
(1 row)

Mon Mar 25 11:06:31 IST 2019
  pid  | datid | datname  | relid |                  phase                  | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
 10630 | 13533 | postgres | 16384 | building index: sorting tuples, spool 1 |             0 |            0 |                  0 |            0 |           0 |    200000000 |           0 |                0 |               0
(1 row)
  
2. However in case of partitioned tables, the following difference in blocks_done versus blocks_total at the end of phase is notably high for the first partition . Subsequent partitions show negligible difference.
Partition 1: 
Mon Mar 25 14:27:57 IST 2019
  pid  | datid | datname  | relid |           phase            | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
 10630 | 13533 | postgres | 16394 | building index: table scan |             0 |            0 |                  0 |       381342 |      221233 |            0 |           0 |                3 |               0
(1 row)

Mon Mar 25 14:27:57 IST 2019
  pid  | datid | datname  | relid |                  phase                  | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
 10630 | 13533 | postgres | 16394 | building index: sorting tuples, spool 1 |             0 |            0 |                  0 |            0 |           0 |     49999999 |           0 |                3 |               0
 
The partitions are equal in size and the other two partitions have blocks_done and blocks_total to be approx. 221233. The blocks_total for partition 1 is reported higher.

3. Sorry for nitpicking, I think following phase name can be made more consistent with the others.
The non-am specific phase for scanning a table is named as scan heap while am-specific one is named as table scan. 
Can we use heap for am-specific one as well since heap is used elsewhere in progress reporting too?

4. -       scan = table_beginscan_parallel(btspool->heap, ParallelTableScanFromBTShared(btshared));
+       scan = table_beginscan_parallel(btspool->heap,
+                                                                       ParallelTableScanFromBTShared(btshared));

Is this change required?

Besides the above comments ,the patch looks good to me.

Thank you,
-- 
Rahila Syed
Performance Engineer
2ndQuadrant 
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: C_C_A animal on HEAD gets stuck in initdb
Следующее
От: legrand legrand
Дата:
Сообщение: Re: DWIM mode for psql