Обсуждение: monitoring CREATE INDEX [CONCURRENTLY]
Monitoring progress of CREATE INDEX [CONCURRENTLY] is sure to be welcome, so here's a proposal. There are three distinct interesting cases. One is straight CREATE INDEX of a standalone table; then we have CREATE INDEX CONCURRENTLY; finally, CREATE INDEX on a partitioned table. Note that there's no CONCURRENTLY for partitioned tables. A non-concurrent build is a very straightforward: we call create_index, which does index_build, done. See below for how to report for index_build, which is the interesting part. I propose not to report anything else than that for non-concurrent build. There's some preparatory work that's identical than for CIC (see below). Like VACUUM, it seems a bit pointless to report an initial phase that's almost immediate, so I propose we just don't report anything until the actual index building starts. CREATE INDEX CONCURRENTLY does these things first, which we would not report (this is just like VACUUM, which only starts reporting once it starts scanning blocks): a. lock rel. No metrics to report. b. other prep; includes lots of catalog access. Unlikely to lock, but not impossible. No metrics to report. c. create_index. CIC skips index_build here, so there's no reason to report it either. We would start reporting at this point, with these phases: 1. WaitForLockers 1. Report how many xacts do we need to wait for, how many are done. 2. index_build. See below. 3. WaitForLockers 2. Report how many xacts do we need to wait for, how many are done. 4. validate_index. Scans the whole rel again. Report number of blocks scanned. 5. wait for virtual XIDs. Like WaitForLockers: report how many xacts we need to wait for, how many are done. We're done. (Alternatively, we could have an initial "prep" phase for a/b/c for the concurrent case and a/b for non-concurrent. I'm just not sure it's useful.) index_build ----------- The actual index building is an AM-specific undertaking, and we report its progress separately from the AM-agnostic code. That is, each AM has freedom to define its own list of phases and counters, separate from the generic code. This avoids the need to provide a new AM method or invoke callbacks. So when you see that CREATE_INDEX_PHASE is either "index build" you'll have a separate BTREE_CREATE_PHASE value set to either "scanning heap" or "sorting" or "building upper layers"; equivalently for other AMs. Partitioned indexes ------------------- For partitioned indexes, we only have the index build phase, but we repeat it for each partition. In addition to the index_build metrics described above, we should report how many partitions we need to handle in total and how many partitions are already done. (I'm avoiding getting in the trouble of reporting *which* partition we're currently handling and have already handled.) Thoughts? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
For discussion, here's an preliminary patch. This is just a first skeleton; needs to grow a lot of flesh yet, per my previous proposal. As far as the generic CREATE INDEX stuff goes, I think this is complete; it's missing the AM-specific bits. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Tue, Jan 1, 2019 at 6:09 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
For discussion, here's an preliminary patch. This is just a first
skeleton; needs to grow a lot of flesh yet, per my previous proposal.
As far as the generic CREATE INDEX stuff goes, I think this is complete;
it's missing the AM-specific bits.
Looks like it's missing the validate_index blocks-scanned report, which is not AM-specific and your original proposal does mention that.
I thought a bit about index_build part. If most AMs follow a somewhat standard phases while building an index, it might be simpler to define those phases and have AM agnostic code report those phases. Or may be just report the most significant information, instead of reporting each sub-phase of index_build. I think the most important progress to know would be how far the heap is scanned for to-be-indexed tuples. AFAICS all AMs use IndexBuildHeapScan() to scan the heap. Can we simply do some reporting from that routine? Like number of blocks scanned against the total number of blocks requested?
Some minor comments on the patch, though I suspect you might be already updating the patch since you marked it as WIP.
+CREATE VIEW pg_stat_progress_create_index AS
+ SELECT
+ s.pid AS pid, S.datid AS datid, D.datname AS datname,
+ S.relid AS relid,
+ CASE s.param1 WHEN 0 THEN 'initializing'
+ WHEN 1 THEN 'waiting for lockers 1'
+ WHEN 2 THEN 'building index'
+ WHEN 3 THEN 'waiting for lockers 2'
+ WHEN 4 THEN 'validating index'
+ WHEN 5 THEN 'waiting for lockers 3'
Can we have more descriptive text for waiters? Such as "waiting for existing
writers", "waiting for intermediate writers" and "waiting for old readers". Not
sure if I got those correct, something of that sort will definitely give more
insight into what the transaction is waiting for.
Can we actually also report the list of transactions the command is waiting on?
That could be useful to the user if CIC appears to be stuck too long.
+ pgstat_progress_update_param(PROGRESS_CREATEIDX_PARTITIONS_TOTAL,
+ nparts);
+
IMHO we should just use full term INDEX instead of IDX, such as
PROGRESS_CREATE_INDEX_PARTITIONS_TOTAL. It's already a long name, so couple of
extra characters won't make a difference. I did not see much precedence to
shortern to IDX for INDEX elsewhere in the code (though we tend to do that for
variable names etc).
@@ -1282,6 +1318,9 @@ DefineIndex(Oid relationId,
old_snapshots = GetCurrentVirtualXIDs(limitXmin, true, false,
PROC_IS_AUTOVACUUM | PROC_IN_VACUUM,
&n_old_snapshots);
+ pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
+ PROGRESS_CREATEIDX_PHASE_WAIT_3);
+ pgstat_progress_update_param(PROGRESS_WAITFOR_TOTAL, n_old_snapshots);
I think we should clear the PROGRESS_WAITFOR_TOTAL and PROGRESS_WAITFOR_DONE
when the wait phase is over, to avoid any confusion. For example, I noticed that the counters from WAIT_1 are reported as-is if WAIT_2 had no lockers.
Shouldn't PROGRESS_WAITFOR_DONE be updated when we skip a snapshot in the code below?
if (!VirtualTransactionIdIsValid(old_snapshots[i]))
continue; /* found uninteresting in previous cycle */
@@ -2817,7 +2818,7 @@ FastPathGetRelationLockEntry(LOCALLOCK *locallock)
* uses of the result.
*/
VirtualTransactionId *
-GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode)
+GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *ocount)
Could that out variable be named something differently? "countp" or something
like that. I did not check if there is some practice that we follow, but I
remember suffixing with "p" rather than prefixing with "o" (for out I assume)
+
+/* Progress parameters for CREATE INDEX */
+#define PROGRESS_CREATEIDX_PHASE 0
+/* 1 and 2 reserved for "waitfor" metrics */
+#define PROGRESS_CREATEIDX_PARTITIONS_TOTAL 3
+#define PROGRESS_CREATEIDX_PARTITIONS_DONE 4
+
Is there a reason to leave those reserve placeholders, only to fill them a few
lines down?
Thanks,
Pavan
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
PostgreSQL Development, 24x7 Support, Training & Services
Hi Alvaro,
On Fri, Dec 21, 2018 at 3:30 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Monitoring progress of CREATE INDEX [CONCURRENTLY] is sure to be welcome,
so here's a proposal.
There are three distinct interesting cases. One is straight CREATE
INDEX of a standalone table; then we have CREATE INDEX CONCURRENTLY;
finally, CREATE INDEX on a partitioned table. Note that there's no
CONCURRENTLY for partitioned tables.
A non-concurrent build is a very straightforward: we call create_index,
which does index_build, done. See below for how to report for
index_build, which is the interesting part. I propose not to report
anything else than that for non-concurrent build. There's some
preparatory work that's identical than for CIC (see below). Like
VACUUM, it seems a bit pointless to report an initial phase that's
almost immediate, so I propose we just don't report anything until the
actual index building starts.
Aren't we reporting this initial preparatory work in the form of 'initializing' phase that you
have in current patch? IIUC, there are no metrics but the name of the phase.
CREATE INDEX CONCURRENTLY does these things first, which we would not
report (this is just like VACUUM, which only starts reporting once it
starts scanning blocks):
a. lock rel. No metrics to report.
b. other prep; includes lots of catalog access. Unlikely to lock, but
not impossible. No metrics to report.
c. create_index. CIC skips index_build here, so there's no reason to
report it either.
We would start reporting at this point, with these phases:
1. WaitForLockers 1. Report how many xacts do we need to wait for, how
many are done.
2. index_build. See below.
3. WaitForLockers 2. Report how many xacts do we need to wait for, how
many are done.
4. validate_index. Scans the whole rel again. Report number of blocks
scanned.
5. wait for virtual XIDs. Like WaitForLockers: report how many xacts we
need to wait for, how many are done.
We're done.
(Alternatively, we could have an initial "prep" phase for a/b/c for the
concurrent case and a/b for non-concurrent. I'm just not sure it's
useful.)
index_build
-----------
The actual index building is an AM-specific undertaking, and we report
its progress separately from the AM-agnostic code. That is, each AM has
freedom to define its own list of phases and counters, separate from the
generic code. This avoids the need to provide a new AM method or invoke
callbacks. So when you see that CREATE_INDEX_PHASE is either "index
build" you'll have a separate BTREE_CREATE_PHASE value set to either
"scanning heap" or "sorting" or "building upper layers"; equivalently
for other AMs.
OK.
I think the main phases in which index_build for most AMs can be divided is as follows:
1. Scanning heap tuples for building index which is common for all AMs
I think the main phases in which index_build for most AMs can be divided is as follows:
1. Scanning heap tuples for building index which is common for all AMs
2. Forming index tuples which is AM-specific
3. Writing tuples into the index which is AM-specific.
Out of these, metrics for phase 1 can be heap_tuples_scanned / total_heap_tuples and
3. Writing tuples into the index which is AM-specific.
Out of these, metrics for phase 1 can be heap_tuples_scanned / total_heap_tuples and
for phase 3, it can be index_tuples_computed/ total_index_tuples.
I am not sure about metrics for phase 2 especially for Btree which involves
reporting progress of sorting.
reporting progress of sorting.
Partitioned indexes
-------------------
For partitioned indexes, we only have the index build phase, but we
repeat it for each partition. In addition to the index_build metrics
described above, we should report how many partitions we need to handle
in total and how many partitions are already done. (I'm avoiding
getting in the trouble of reporting *which* partition we're currently
handling and have already handled.)
Thoughts?
+CREATE VIEW pg_stat_progress_create_index AS
+ SELECT
+ s.pid AS pid, S.datid AS datid, D.datname AS datname,
+ S.relid AS relid,
+ CASE s.param1 WHEN 0 THEN 'initializing'
+ WHEN 1 THEN 'waiting for lockers 1'
+ WHEN 2 THEN 'building index'
+ WHEN 3 THEN 'waiting for lockers 2'
+ WHEN 4 THEN 'validating index'
+ WHEN 5 THEN 'waiting for lockers 3'
+ END as phase,
+ S.param2 AS procs_to_wait_for,
+ S.param3 AS procs_waited_for,
+ S.param4 AS partitions_to_build,
+ S.param5 AS partitions_built
+ FROM pg_stat_get_progress_info('CREATE INDEX') AS S
+ LEFT JOIN pg_database D ON S.datid = D.oid;
+
1. In the above code, I think it will be useful if we report phases as
'Initializing phase 1 of n'
'Waiting for lockers phase 2 of n' etc. i.e reporting total number of phases as well.
'Initializing phase 1 of n'
'Waiting for lockers phase 2 of n' etc. i.e reporting total number of phases as well.
+ holders = lappend(holders,
+ GetLockConflicts(locktag, lockmode, &count));
+ total += count;
2. IIUC, the above code in WaitForLockersMultiple can be written under
if(progress) condition like rest of the progress checking code in that function
and pass NULL for count otherwise.
+ GetLockConflicts(locktag, lockmode, &count));
+ total += count;
2. IIUC, the above code in WaitForLockersMultiple can be written under
if(progress) condition like rest of the progress checking code in that function
and pass NULL for count otherwise.
3. Will it be useful to report pid's of the backend's
for the transactions which CREATE INDEX concurrently is waiting for?
I think it can be used to debug long running transactions.
4. Should we have additional statistics update phase before index_update_stats
like PROGRESS_VACUUM_PHASE_FINAL_CLEANUP?
4. Should we have additional statistics update phase before index_update_stats
like PROGRESS_VACUUM_PHASE_FINAL_CLEANUP?
5. I think it may be useful to report number of parallel workers requested and number of workers
actually running index build in case of btree.
6. Also, this can be reported as an additional validation phase for exclusion constraint
in index_build function as it involves scanning all live tuples of heap relation.
/*
* If it's for an exclusion constraint, make a second pass over the heap
* to verify that the constraint is satisfied. We must not do this until
* the index is fully valid. (Broken HOT chains shouldn't matter, though;
* see comments for IndexCheckExclusion.)
*/
if (indexInfo->ii_ExclusionOps != NULL)
IndexCheckExclusion(heapRelation, indexRelation, indexInfo);
*/
* to verify that the constraint is satisfied. We must not do this until
* the index is fully valid. (Broken HOT chains shouldn't matter, though;
* see comments for IndexCheckExclusion.)
*/
if (indexInfo->ii_ExclusionOps != NULL)
IndexCheckExclusion(heapRelation, indexRelation, indexInfo);
*/
Thank you,
Rahila Syed
On 2019-Jan-09, Pavan Deolasee wrote: > Looks like it's missing the validate_index blocks-scanned report, which is > not AM-specific and your original proposal does mention that. True. That phase is actually 3 phases, which would be reported separately: 5. index_bulk_delete() scan 6. performsort 7. validate_index_heapscan > I thought a bit about index_build part. If most AMs follow a somewhat > standard phases while building an index, it might be simpler to define > those phases and have AM agnostic code report those phases. Well, think about btrees. We first scan the whole table putting all tuples in a spool (two spools actually), then we tell the spools to get sorted, then we extract tuples from the spools, and finally we read the spool to produce the leaf pages. If we just report the table scan, the reports gets to 100% complete in that phase and then waits for a very long time during which nothing seems to happen. That's not cool. I'm adding a new AM support routine which turns the subphase number into a textual description, so that we don't have to hardcode phase names in the agnostic code. > Can we have more descriptive text for waiters? Such as "waiting for > existing writers", "waiting for intermediate writers" and "waiting for > old readers". Not sure if I got those correct, something of that sort > will definitely give more insight into what the transaction is waiting > for. Can do. > Can we actually also report the list of transactions the command is waiting > on? > That could be useful to the user if CIC appears to be stuck too long. I'm afraid this is not possible, because the progress report interface doesn't allow for something like that. > IMHO we should just use full term INDEX instead of IDX, such as > PROGRESS_CREATE_INDEX_PARTITIONS_TOTAL. It's already a long name, so couple > of extra characters won't make a difference. Really? I though it was clear enough and it's *three* characters saved even. > I think we should clear the PROGRESS_WAITFOR_TOTAL and PROGRESS_WAITFOR_DONE > when the wait phase is over, to avoid any confusion. For example, I noticed > that the counters from WAIT_1 are reported as-is if WAIT_2 had no lockers. Yes, absolutely. > +GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *ocount) > > Could that out variable be named something differently? "countp" or > something like that. I did not check if there is some practice that we > follow, but I remember suffixing with "p" rather than prefixing with > "o" (for out I assume) Sure. > +/* Progress parameters for CREATE INDEX */ > +#define PROGRESS_CREATEIDX_PHASE 0 > +/* 1 and 2 reserved for "waitfor" metrics */ > +#define PROGRESS_CREATEIDX_PARTITIONS_TOTAL 3 > +#define PROGRESS_CREATEIDX_PARTITIONS_DONE 4 > + > > Is there a reason to leave those reserve placeholders, only to fill them a > few lines down? Yes -- those are going to be used by other reports that also use similar metrics, such as the CLUSTER report. I'm running out of columns to put the numbers into :-( Right now I have 1. phase 2. subphase (for index_build) 3. lockers total (to wait for) 4. lockers done 5. blocks total 6. blocks done 7. tapes total 8. tapes done 9. partitions total 10. partitions done The "tapes total/done" bit is about reporting the performsort steps; I'm not really sure yet it'll be tapes, but I hope it can be done with two numbers. Anyway, in btree build I have these subphases 1. spool heapscan using IndexBuildHeapScan 2. _bt_parallel_heapscan stuff (only one of these in a build) 3. bt_leafbuild, performsort spool 1 4. bt_leafbuild, performsort spool 2 5. bt_load and I don't have columns to put the metrics for the btload thing, assuming I figure out what those are. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi all, Based on the latest emails exchanged, the patch got some feedback from Pavan which has not been addressed. So I am marking the patch as returned with feedback. -- Michael
Вложения
Hi Rahila, Pavan, Thanks for the review. I incorporated some fixes per your comments. More fixes are needed still. That said, the patch in attachment gives good insight into how I think this should turn out. > > index_build > > ----------- > OK. > I think the main phases in which index_build for most AMs can be divided is > as follows: > [...] I ended up defining phases for the index_build phase in the AM itself; the code reports a phase number using the regular API, and then the pgstat_progress view obtains the name of each phase using a support method. For btree, I ended up not reporting anything about the sort per se; we just scan the heap (reporting block numbers, which is easy because we know the heap size in advance) and count heap tuples while scanning; once that's done, we know how many tuples we need to write out to the index, so that total becomes the next stage's target total. That's simpler. (It is wrong for partial indexes currently, though.) So for btree we have this: /* * btphasename() -- Return name of phase, for index build progress report */ char * btphasename(int64 phasenum) { switch (phasenum) { case PROGRESS_CREATEIDX_SUBPHASE_INITIALIZE: return "initializing (1/5)"; case PROGRESS_BTREE_PHASE_INDEXBUILD_HEAPSCAN: return "table scan (2/5)"; case PROGRESS_BTREE_PHASE_PERFORMSORT_1: return "sorting tuples, spool 1 (3/5)"; case PROGRESS_BTREE_PHASE_PERFORMSORT_2: return "sorting tuples, spool 2 (4/5)"; case PROGRESS_BTREE_PHASE_LEAF_LOAD: return "final btree sort & load (5/5)"; default: return NULL; } } Now this is a bit strange, because the report looks like this: phase | building index (3 of 8): initializing (1/5) [...] blocks total | 442478 blocks done | 3267 So for phase 3, we always have phase and subphase counters in the phase name. However, I don't think there's any clean way to know from the very beginning how many subphases are there going to be, and it would be even more confusing to have the total "of N" number vary each time depending on the access method. So this leaves the phase counter going from 1 to 8, and then for phase 3 you have a second part that goes from 1 to N. Anyway, at some point it completes the heap scan, and the phase changes to this: phase | building index (3 of 8): heap scan(2/5) I think I should take Rahila's suggestion to report the number of workers running (but I'm not sure I see the point in reporting number of workers planned). The heap scan quickly gives way to the next one, phase | building index (3 of 8): final btree sort & load (5/5) [...] tuples total | 100000000 tuples done | 58103713 Finally, phase | validating index scan (phase 5 of 8) and phase | validate index heapscan (phase 7 of 8) and then it's over. Now, it's clear that I haven't yet nailed all the progress updates correctly, because there are some stalls where nothing seems to be happening. The final index_validate phases have been ignored so far. > 1. In the above code, I think it will be useful if we report phases as > 'Initializing phase 1 of n' > 'Waiting for lockers phase 2 of n' etc. i.e reporting total number of > phases as well. Great idea, done. > 2. IIUC, the above code in WaitForLockersMultiple can be written under > if(progress) condition like rest of the progress checking code in that > function > and pass NULL for count otherwise. Yep, good point. > 3. Will it be useful to report pid's of the backend's > for the transactions which CREATE INDEX concurrently is waiting for? > I think it can be used to debug long running transactions. Done. > 4. Should we have additional statistics update phase before > index_update_stats > like PROGRESS_VACUUM_PHASE_FINAL_CLEANUP? Not sure about this one ... it's supposed to be a fairly quick operation. > 5. I think it may be useful to report number of parallel workers requested > and number of workers > actually running index build in case of btree. True, I haven't done this one. I'll add it next. > 6. Also, this can be reported as an additional validation phase for > exclusion constraint > in index_build function as it involves scanning all live tuples of heap > relation. > > /* > * If it's for an exclusion constraint, make a second pass over the > heap > * to verify that the constraint is satisfied. We must not do this > until > * the index is fully valid. (Broken HOT chains shouldn't matter, > though; > * see comments for IndexCheckExclusion.) > */ > if (indexInfo->ii_ExclusionOps != NULL) > IndexCheckExclusion(heapRelation, indexRelation, indexInfo); > */ Hmm, I haven't looked into exclusion constraints thus far ... I suppose this is a critical point to keep in mind. Thanks for the review! -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
I added metrics for the validate_index phases; patch attached. This is still a bit raw, but it looks much better now. Here's a sample concurrent index build on a 100M tuple table. There are no concurrent transactions, so phases that wait for lockers don't appear. I'm not making any effort to report metrics during phase 6, note. In phase 7 I'm currently reporting only tuple counts; I think it'd be better to report block numbers. I don't show it here, but when an index is built on a partitioned table, the "partitions done" number goes up all the way to "partitions total" and the phases come and go for each partition. One thing is clear: given the phase mechanics of varying durations, it seems hard to use these numbers to predict total index build time. now | phase | blocks total | blocks done | tuples total | tuplesdone | partitions total | partitions done --------------+--------------------------------------------------------+--------------+-------------+--------------+-------------+------------------+----------------- 15:56:33.792 | building index (3 of 8): initializing (1/5) | 442478 | 32 | 0 | 0 | 0 | 0 15:56:33.805 | building index (3 of 8): initializing (1/5) | 442478 | 188 | 0 | 0 | 0 | 0 [snip about 500 lines] 15:56:41.345 | building index (3 of 8): initializing (1/5) | 442478 | 439855 | 0 | 0 | 0 | 0 15:56:41.356 | building index (3 of 8): initializing (1/5) | 442478 | 440288 | 0 | 0 | 0 | 0 15:56:41.367 | building index (3 of 8): initializing (1/5) | 442478 | 440778 | 0 | 0 | 0 | 0 15:56:41.378 | building index (3 of 8): initializing (1/5) | 442478 | 441706 | 0 | 0 | 0 | 0 15:56:41.389 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:41.4 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 [snip 300 lines] ... I'm not sure what happened in this 3 seconds period. No metrics were being updated. 15:56:44.65 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:44.661 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:44.672 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:44.682 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 15:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0 15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0 15:56:44.727 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 79057 | 0 | 0 15:56:44.738 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 217018 | 0 | 0 15:56:44.75 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 353804 | 0 | 0 15:56:44.761 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 487892 | 0 | 0 15:56:44.773 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 634136 | 0 | 0 [snip 660 lines] 15:56:52.47 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99111337| 0 | 0 15:56:52.482 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99285701| 0 | 0 15:56:52.493 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99444763| 0 | 0 15:56:52.505 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99612313| 0 | 0 15:56:52.516 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99782666| 0 | 0 15:56:52.528 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 99937524| 0 | 0 15:56:52.54 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.551 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.561 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.572 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.583 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.594 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.604 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.615 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.626 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.637 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.648 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.658 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.669 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.68 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.691 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.701 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.712 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 100000000| 0 | 0 15:56:52.723 | validating index scan (phase 5 of 8) | 274194 | 1049 | 0 | 0 | 0 | 0 15:56:52.734 | validating index scan (phase 5 of 8) | 274194 | 2676 | 0 | 0 | 0 | 0 15:56:52.744 | validating index scan (phase 5 of 8) | 274194 | 2876 | 0 | 0 | 0 | 0 15:56:52.755 | validating index scan (phase 5 of 8) | 274194 | 4194 | 0 | 0 | 0 | 0 [snip 400 lines] 15:56:57.031 | validating index scan (phase 5 of 8) | 274194 | 268343 | 0 | 0 | 0 | 0 15:56:57.042 | validating index scan (phase 5 of 8) | 274194 | 268479 | 0 | 0 | 0 | 0 15:56:57.053 | validating index scan (phase 5 of 8) | 274194 | 270027 | 0 | 0 | 0 | 0 15:56:57.064 | validating index scan (phase 5 of 8) | 274194 | 271580 | 0 | 0 | 0 | 0 15:56:57.075 | validating index scan (phase 5 of 8) | 274194 | 273121 | 0 | 0 | 0 | 0 15:56:57.085 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.096 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.107 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.118 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.128 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.139 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.15 | sorting index scan results (phase 6 of 8) | 274194 | 274193 | 0 | 0 | 0 | 0 15:56:57.161 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 50152 | 0 | 0 15:56:57.172 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 175602 | 0 | 0 15:56:57.182 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 305326 | 0 | 0 15:56:57.193 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 430142 | 0 | 0 [snip 730 lines] 15:57:05.003 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99125556| 0 | 0 15:57:05.013 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99276471| 0 | 0 15:57:05.024 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99425041| 0 | 0 15:57:05.035 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99580174| 0 | 0 15:57:05.045 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99720505| 0 | 0 15:57:05.056 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 99862311| 0 | 0 15:57:05.067 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.077 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.088 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.099 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.109 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.12 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.131 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.142 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.152 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.163 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 15:57:05.174 | validate index heapscan (phase 7 of 8) | 0 | 0 | 100000000 | 100000000| 0 | 0 (2862 filas) and then it's done. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Wed, 13 Feb 2019 at 00:46, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Here's a sample
concurrent index build on a 100M tuple table.
Cool
There are no concurrent
transactions, so phases that wait for lockers don't appear.
Would prefer to see them, so we know they are zero.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019/02/13 11:59, Simon Riggs wrote: > On Wed, 13 Feb 2019 at 00:46, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > >> Here's a sample >> concurrent index build on a 100M tuple table. > > > Cool +1 Looking at the "([phase] x of x)" in the sample output, I thought pg_stat_progress_vacuum's output should've had it too (not a concern of Alvaro's patch though.) Thanks, Amit
Hi Alvaro, On 2019/02/12 12:18, Alvaro Herrera wrote: > I ended up defining phases for the index_build phase in the AM itself; > the code reports a phase number using the regular API, and then the > pgstat_progress view obtains the name of each phase using a support > method. diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index 05102724ead..189179a5667 100644 --- a/doc/src/sgml/indexam.sgml +++ b/doc/src/sgml/indexam.sgml @@ -127,6 +127,7 @@ typedef struct IndexAmRoutine amcostestimate_function amcostestimate; amoptions_function amoptions; amproperty_function amproperty; /* can be NULL */ + amphasename_function amphasename; /* can be NULL */ Doesn't the name amphasename sound a bit too generic, given that it can only describe the phases of ambuild? Maybe ambuildphase? Thanks, Amit
On 2019/02/13 4:16, Alvaro Herrera wrote: > I added metrics for the validate_index phases; patch attached. This is > still a bit raw, but it looks much better now. Here's a sample > concurrent index build on a 100M tuple table. There are no concurrent Great! + s.pid AS pid, S.datid AS datid, D.datname AS datname, + S.relid AS relid, + CASE s.param2 WHEN 0 THEN 'initializing (phase 1 of 8)' + WHEN 1 THEN 'waiting for lockers 1 (phase 2 of 8)' + WHEN 2 THEN 'building index (3 of 8): ' || + pg_indexam_progress_phasename(s.param1::oid, s.param3) It would be better to replace "s" with "S". s/s.pid/S.pid/ s/s.param2/S.param2/ s/s.param1::oid, s.param3/S.param1::oid, S.param3/ These are not important comments but for readability. :) Thanks, Tatsuro Yamada
On 2019-Feb-13, Amit Langote wrote: > Doesn't the name amphasename sound a bit too generic, given that it can > only describe the phases of ambuild? Maybe ambuildphase? Hmm, yeah, maybe it does. I renamed it "ambuildphasename", since it's not about reporting the phase itself -- it's about translating the phase number to the string that's reported to the user. The attached patch does it that way. Also, when an index build uses an AM that doesn't support progress reporting, it no longer reports a NULL phase name while building. I also changed it to report the progress of phase 7 (heap scan validation) using block numbers rather than tuple counts. I also tweaked the strings reported in the view. They're clearer now IMO. One slight annoyance is that when parallel workers are used, the last block number reported in phase 3/subphase 2 (IndexBuildHeapScan stuff) is not necessarily accurate, since the tail of the table could well be scanned by a worker that's not the leader, and we only report in the leader when it gets a new block. When the AM does not support progress reporting, everything stays as zeros during the index build phase. It's easy to notice how slow hash indexes are to build compared to btrees this way! Maybe it'd be better fallback on reporting block numbers in IndexBuildHeapScan when this happens. Thoughts? I added docs to the monitoring section -- that's the bulkiest part of the patch. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
Hmm, looks like a very bare-bones support for hash indexes does not require a lot of code, and gives a clear picture (you can sit all night watching the numbers go up, instead of biting your fingernails wondering if it'll be completed by dawn.) This part isn't 100% done -- it we would better to have ambuildphasename support. (I'm a bit confused about phase 5 not reporting anything for hash indexes in CIC, though. That's part is supposed to be AM agnostic.) I think it was a mistake to define the progress constants in one header file commands/progress.h and the associated functions in pgstat.h. I think it would be better to move the function decls to commands/progress.h. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-Feb-13, Amit Langote wrote:
> Doesn't the name amphasename sound a bit too generic, given that it can
> only describe the phases of ambuild? Maybe ambuildphase?
Hmm, yeah, maybe it does. I renamed it "ambuildphasename", since it's
not about reporting the phase itself -- it's about translating the phase
number to the string that's reported to the user.
The attached patch does it that way. Also, when an index build uses an
AM that doesn't support progress reporting, it no longer reports a NULL
phase name while building. I also changed it to report the progress of
phase 7 (heap scan validation) using block numbers rather than tuple
counts. I also tweaked the strings reported in the view. They're
clearer now IMO.
One slight annoyance is that when parallel workers are used, the last
block number reported in phase 3/subphase 2 (IndexBuildHeapScan stuff)
is not necessarily accurate, since the tail of the table could well be
scanned by a worker that's not the leader, and we only report in the
leader when it gets a new block.
When the AM does not support progress reporting, everything stays as
zeros during the index build phase. It's easy to notice how slow hash
indexes are to build compared to btrees this way! Maybe it'd be
better fallback on reporting block numbers in IndexBuildHeapScan when
this happens. Thoughts?
I added docs to the monitoring section -- that's the bulkiest part of
the patch.
1. Thank you for incorporating review comments.
Can you please rebase the latest 0001-Report-progress-of-CREATE-INDEX-operations.patch on master? Currently it does not apply on 754b90f657bd54b482524b73726dae4a9165031c
15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 015:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0
15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0
15:56:44.727 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 79057 | 0 | 0
2. In the above report, even though we are reporting progress in terms of tuples_done for final btree sort & load phase we have not cleared
the blocks_done entry from previous phases. I think this can be confusing as the blocks_done does not correspond to the tuples_done in the current phase.
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Alvaro,
Resending the email as earlier one didn't get sent on pgsql-hackers.
On 2/23/19 3:24 AM, Alvaro Herrera wrote:
On 2019-Feb-13, Amit Langote wrote:Doesn't the name amphasename sound a bit too generic, given that it can only describe the phases of ambuild? Maybe ambuildphase?Hmm, yeah, maybe it does. I renamed it "ambuildphasename", since it's not about reporting the phase itself -- it's about translating the phase number to the string that's reported to the user. The attached patch does it that way. Also, when an index build uses an AM that doesn't support progress reporting, it no longer reports a NULL phase name while building. I also changed it to report the progress of phase 7 (heap scan validation) using block numbers rather than tuple counts. I also tweaked the strings reported in the view. They're clearer now IMO. One slight annoyance is that when parallel workers are used, the last block number reported in phase 3/subphase 2 (IndexBuildHeapScan stuff) is not necessarily accurate, since the tail of the table could well be scanned by a worker that's not the leader, and we only report in the leader when it gets a new block. When the AM does not support progress reporting, everything stays as zeros during the index build phase. It's easy to notice how slow hash indexes are to build compared to btrees this way! Maybe it'd be better fallback on reporting block numbers in IndexBuildHeapScan when this happens. Thoughts? I added docs to the monitoring section -- that's the bulkiest part of the patch.
1. Thank you for incorporating review comments.
Can you please rebase the latest 0001-Report-progress-of- 15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 015:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 015:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 015:56:44.727 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 79057 | 0 | 015:56:44.738 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 217018 | 0 | 015:56:44.75 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 353804 | 0 | 0
2. In the above report, even though we are reporting progress in terms of tuples_done for final btree sort & load phase we have not cleared
the blocks_done entry from previous phases. I think this can be confusing as the blocks_done does not correspond to the tuples_done in the final btree sort & load phase.
Thank you,
the blocks_done entry from previous phases. I think this can be confusing as the blocks_done does not correspond to the tuples_done in the final btree sort & load phase.
Thank you,
Rahila Syed
Hi Rahila, Thanks for looking. On 2019-Mar-04, Rahila wrote: > 1. Thank you for incorporating review comments. > Can you please rebase the latest 0001-Report-progress-of- > CREATE-INDEX-operations.patch on master? Currently it does not apply on > 754b90f657bd54b482524b73726dae4a9165031c Hmm, rebased to current master. Didn't conflict at all when rebasing, so it's strange that it didn't apply. > > 15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0 > > 15:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0 > > 15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0 > > 15:56:44.727 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 79057 | 0 | 0 > > 15:56:44.738 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 217018 | 0 | 0 > > 15:56:44.75 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000 | 353804 | 0 | 0 > 2. In the above report, even though we are reporting progress in terms of > tuples_done for final btree sort & load phase we have not cleared > the blocks_done entry from previous phases. I think this can be confusing as > the blocks_done does not correspond to the tuples_done in the final btree > sort & load phase. Good point. Done in the attached version, wherein I also added comments to explain the IndexBuildHeapScan API change. I didn't change the hash AM implementation here. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Mon, Mar 04, 2019 at 05:46:07PM -0300, Alvaro Herrera wrote: > Hi Rahila, > > Thanks for looking. > > On 2019-Mar-04, Rahila wrote: > > > 1. Thank you for incorporating review comments. > > Can you please rebase the latest 0001-Report-progress-of- > > CREATE-INDEX-operations.patch on master? Currently it does not apply on > > 754b90f657bd54b482524b73726dae4a9165031c > > Hmm, rebased to current master. Didn't conflict at all when rebasing, > so it's strange that it didn't apply. > > > > 15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0| 0 | 0 | 0 > > > 15:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000| 0 | 0 | 0 > > > 15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000| 0 | 0 | 0 > > > 15:56:44.727 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000| 79057 | 0 | 0 > > > 15:56:44.738 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000| 217018 | 0 | 0 > > > 15:56:44.75 | building index (3 of 8): final btree sort & load (5/5) | 442478 | 442399 | 100000000| 353804 | 0 | 0 > > 2. In the above report, even though we are reporting progress in terms of > > tuples_done for final btree sort & load phase we have not cleared > > the blocks_done entry from previous phases. I think this can be confusing as > > the blocks_done does not correspond to the tuples_done in the final btree > > sort & load phase. > > Good point. Done in the attached version, wherein I also added comments > to explain the IndexBuildHeapScan API change. I didn't change the hash > AM implementation here. Would it be a very large lift to report progress for the rest of the index types we support? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Hi Alvaro,
On Tue, 5 Mar 2019 at 08:32, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hi Rahila,
Thanks for looking.
On 2019-Mar-04, Rahila wrote:
> 1. Thank you for incorporating review comments.
> Can you please rebase the latest 0001-Report-progress-of-
> CREATE-INDEX-operations.patch on master? Currently it does not apply on
> 754b90f657bd54b482524b73726dae4a9165031c
Hmm, rebased to current master. Didn't conflict at all when rebasing,
so it's strange that it didn't apply.
Thanks for updating the patch. Sorry, I think it wasn't that the patch needed rebasing but
I failed to apply it correctly last time. I can apply it now.
I failed to apply it correctly last time. I can apply it now.
+extern char *btbuildphasename(int64 phasenum);
1. I think int64 is too large a datatype for phasenum.
Also int32 is used for phasenum in pg_indexam_progress_phasename().
Can we have it as int8?
2.
if ((previous_blkno == InvalidBlockNumber) ||
+ (scan->rs_cblock != previous_blkno))
+ {
+ pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+ scan->rs_cblock);
+ previous_blkno = scan->rs_cblock;
+ }
. In validate_index_heapscan, we dont calculate blocks_done similar to IndexBuildHeapScan i.e
block_done += scan->rs_cblock - previous_blkno which IMO is more accurate.
Reporting scan->rs_cblock as blocks_done might give slightly inaccurate results as we are
still processing that block.
3. There is no if(progress) check in validate_index()/ validate_index_heapscan() code. Wont it be a problem if it is called from other
index methods which dont support reporting progress at the moment?
4. Just to clarify my understanding can you please see below comment
4. Just to clarify my understanding can you please see below comment
Quoting your following comment in cluster command progress monitor thread
while referring to progress reporting from IndexBuildHeapScan,
"One, err, small issue with that idea is that we need the param numbers
not to conflict for any "progress update providers" that are to be used
simultaneously by any command."
not to conflict for any "progress update providers" that are to be used
simultaneously by any command."
Does that mean that we can't have any other INDEX progress monitoring, use
PROGRESS_SCAN_BLOCKS_TOTAL and PROGRESS_SCAN_BLOCKS_DONE
parameter numbers to report anything but the metrics they report now.
5.
15:56:44.682 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0
15:56:44.694 | building index (3 of 8): initializing (1/5) | 442478 | 442399 | 0 | 0 | 0 | 0
15:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0
15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) | 442478 | 442399 | 100000000 | 0 | 0 | 0
I wonder how is the phase 'building index(3 of 8): initializing(1/5)' when the blocks_done count is increasing. Shouldn't it have
changed to reflect PROGRESS_BTREE_PHASE_INDEXBUILD_HEAPSCAN as building index(3 of 8): table scan(2/5) ?
Although I think it has been rectified in the latest patch as I now get 'table scan' phase in output as I do CREATE INDEX on table with 1000000 records
Although I think it has been rectified in the latest patch as I now get 'table scan' phase in output as I do CREATE INDEX on table with 1000000 records
Thank you,
.--
.--
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Rahila On 2019-Mar-11, Rahila Syed wrote: > On Tue, 5 Mar 2019 at 08:32, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > > +extern char *btbuildphasename(int64 phasenum); > > 1. I think int64 is too large a datatype for phasenum. > Also int32 is used for phasenum in pg_indexam_progress_phasename(). > Can we have it as int8? It does look strange, I agree, and the first code I wrote had it using a smaller type. However, I later realized that since the value comes directly from pg_stat_get_progress_info(), which returns int8 values, it was pointless to only accept a small fraction of the possible values for no good reason, so I widened it to int64 as you see now. > 2. > . In validate_index_heapscan, we dont calculate blocks_done similar to > IndexBuildHeapScan i.e > block_done += scan->rs_cblock - previous_blkno which IMO is more accurate. > Reporting scan->rs_cblock as blocks_done might give slightly inaccurate > results as we are > still processing that block. Thanks for pointing out that there's an off-by-one bug there (should be cblock-1). However, IndexBuildHeapScan uses more complicated code because it needs to cover for two additional things that validate_index_heapscan doesn't: parallel heapscans and synchronized seqscans. We could do that, I just saw no point in it. > 3. There is no if(progress) check in validate_index()/ > validate_index_heapscan() code. Wont it be a problem if it is called from > other index methods which dont support reporting progress at the > moment? Good question. I'll have a look. Most likely, I'll end up having things so that building an index using an unsupported index AM reports progress based on IndexBuildHeapScan / validate_index / validate_index_heapscan ... which might mean I should remove the 'progress' parameter from there and have them report unconditionally. > 4. Just to clarify my understanding can you please see below comment > > Quoting your following comment in cluster command progress monitor thread > while referring to progress reporting from IndexBuildHeapScan, > > "One, err, small issue with that idea is that we need the param numbers > not to conflict for any "progress update providers" that are to be used > simultaneously by any command." > > Does that mean that we can't have any other INDEX progress monitoring, use > PROGRESS_SCAN_BLOCKS_TOTAL and PROGRESS_SCAN_BLOCKS_DONE > parameter numbers to report anything but the metrics they report now. What I mean is that the literal parameter numbers defined as PROGRESS_SCAN_BLOCKS_DONE/TOTAL may not be used for other parameters by commands that call IndexBuildHeapScan, if those other parameters are used by the same commands simultaneously with IndexBuildHeapScan. So those parameter numbers become "reserved". > 5. > > > 15:56:44.682 | building index (3 of 8): initializing (1/5) | > > 442478 | 442399 | 0 | 0 | 0 | > > I wonder how is the phase 'building index(3 of 8): initializing(1/5)' when > the blocks_done count is increasing. Shouldn't it have > changed to reflect PROGRESS_BTREE_PHASE_INDEXBUILD_HEAPSCAN as building > index(3 of 8): table scan(2/5) ? > Although I think it has been rectified in the latest patch as I now get > 'table scan' phase in output as I do CREATE INDEX on table with 1000000 > records Yeah, this was a bug that I fixed in v5. (It was a misunderstanding about how parallel scanning is set up, IIRC). For v5, I tested both parallel and non-parallel builds, with and without sync seqscans, and everything seemed to behave correctly. Thanks for looking! I intend to post a new version later this week. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 11, 2019 at 8:41 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > I wonder how is the phase 'building index(3 of 8): initializing(1/5)' when > > the blocks_done count is increasing. Shouldn't it have > > changed to reflect PROGRESS_BTREE_PHASE_INDEXBUILD_HEAPSCAN as building > > index(3 of 8): table scan(2/5) ? > > Although I think it has been rectified in the latest patch as I now get > > 'table scan' phase in output as I do CREATE INDEX on table with 1000000 > > records > > Yeah, this was a bug that I fixed in v5. (It was a misunderstanding > about how parallel scanning is set up, IIRC). For v5, I tested both > parallel and non-parallel builds, with and without sync seqscans, and > everything seemed to behave correctly. > > > Thanks for looking! I intend to post a new version later this week. I don't think that I much like this (3 of 8) and (2 of 5) stuff. It's inconsistent with what we've got already and it doesn't add much. Someone who wants to know which phase it is can look at the underlying numbers directly instead of going through the view, but most people probably won't care, and given that the phases may be of dramatically unequal length, I don't think it's adding much. Another reason why I think this is a bad idea is that there may be some operations where we don't transit all the phases in all cases; the pending patch for CLUSTER progress reporting works like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-11, Robert Haas wrote: > I don't think that I much like this (3 of 8) and (2 of 5) stuff. It's > inconsistent with what we've got already and it doesn't add much. > Someone who wants to know which phase it is can look at the underlying > numbers directly instead of going through the view, but most people > probably won't care, and given that the phases may be of dramatically > unequal length, I don't think it's adding much. > > Another reason why I think this is a bad idea is that there may be > some operations where we don't transit all the phases in all cases; > the pending patch for CLUSTER progress reporting works like that. What part of it don't you like? Is it the fact that we have phase numbers in the phase name? Is it the fact that we count total phases? Is it that we have two numbers being current (phase + subphase)? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 11, 2019 at 3:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2019-Mar-11, Robert Haas wrote: > > I don't think that I much like this (3 of 8) and (2 of 5) stuff. It's > > inconsistent with what we've got already and it doesn't add much. > > Someone who wants to know which phase it is can look at the underlying > > numbers directly instead of going through the view, but most people > > probably won't care, and given that the phases may be of dramatically > > unequal length, I don't think it's adding much. > > > > Another reason why I think this is a bad idea is that there may be > > some operations where we don't transit all the phases in all cases; > > the pending patch for CLUSTER progress reporting works like that. > > What part of it don't you like? Is it the fact that we have phase > numbers in the phase name? Is it the fact that we count total phases? > Is it that we have two numbers being current (phase + subphase)? that you have phase numbers in the phase name -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-11, Robert Haas wrote: > On Mon, Mar 11, 2019 at 3:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > On 2019-Mar-11, Robert Haas wrote: > > > I don't think that I much like this (3 of 8) and (2 of 5) stuff. It's > > > inconsistent with what we've got already and it doesn't add much. > > > Someone who wants to know which phase it is can look at the underlying > > > numbers directly instead of going through the view, but most people > > > probably won't care, and given that the phases may be of dramatically > > > unequal length, I don't think it's adding much. > > > > > > Another reason why I think this is a bad idea is that there may be > > > some operations where we don't transit all the phases in all cases; > > > the pending patch for CLUSTER progress reporting works like that. > > > > What part of it don't you like? Is it the fact that we have phase > > numbers in the phase name? Is it the fact that we count total phases? > > Is it that we have two numbers being current (phase + subphase)? > > that you have phase numbers in the phase name Oh. That's easily removed. Though I have to say that other people said that they liked it so much that they would have liked to have it in the original VACUUM one too (5ba2b281-9c84-772a-cf37-17780d782936@lab.ntt.co.jp). -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 11, 2019 at 3:26 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Oh. That's easily removed. Though I have to say that other people said > that they liked it so much that they would have liked to have it in the > original VACUUM one too (5ba2b281-9c84-772a-cf37-17780d782936@lab.ntt.co.jp). Huh. Well, that's another option, but then what do we do if the number of phases is not a constant? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-11, Robert Haas wrote: > On Mon, Mar 11, 2019 at 3:26 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Oh. That's easily removed. Though I have to say that other people said > > that they liked it so much that they would have liked to have it in the > > original VACUUM one too (5ba2b281-9c84-772a-cf37-17780d782936@lab.ntt.co.jp). > > Huh. Well, that's another option, but then what do we do if the > number of phases is not a constant? Well, why do we care? "Some phases might be skipped". -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 11, 2019 at 3:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Huh. Well, that's another option, but then what do we do if the > > number of phases is not a constant? > > Well, why do we care? "Some phases might be skipped". It seems pretty confusing. I mean, in the case of the CLUSTER patch, you're either going to seq-scan the table or index-scan the table. Those are (at last check) reported using different phase numbers, but they are mutually exclusive. Generally, if you are going to do either foo -> bar -> baz -> quux or foo -> bletch -> quux, how many phases are there total? 5? 4? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-11, Robert Haas wrote: > On Mon, Mar 11, 2019 at 3:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > > Huh. Well, that's another option, but then what do we do if the > > > number of phases is not a constant? > > > > Well, why do we care? "Some phases might be skipped". > > It seems pretty confusing. I mean, in the case of the CLUSTER patch, > you're either going to seq-scan the table or index-scan the table. > Those are (at last check) reported using different phase numbers, but > they are mutually exclusive. Generally, if you are going to do either > foo -> bar -> baz -> quux or foo -> bletch -> quux, how many phases > are there total? 5? 4? Hmm. Your argument is not entirely devoid of merit, but I'm not 100% convinced either. So, in CLUSTER, the phases in the middle section are exclusive of one another. You do bar and baz, or you do bletch. But you never do bar and bletch, or bletch and baz, or bar on isolation or baz on isolation. Furthermore, the counting of phases depends on internal system state (optimizer output), not on the user's input. In CREATE INDEX, it's not exactly the same. You either have a complicated 8-phase system (CREATE INDEX CONCURRENTLY) or just a two-phase system. The phases for the second case are a strict subset of the cases in the first case. Whether to use one or the other phase sequence is entirely up to the user. On the other hand, the subphase numbers vary per AM (but I expect they're always the same for any one AM.) To me, it's not a big deal, but if we don't put the number in the phase name, then we force users to keep the reference documentation open every time they create an index. I'm not wed to anything in this area, TBH. My first patch had no phase numbers and I added them because of reviewer feedback. I do agree we should be consistent ... but on the other hand, each case is a bit different: consider VACUUM, which goes back to phase 2 after doing phase 3 for a while. You don't have that behavior for either CLUSTER or CREATE INDEX. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, 25 Mar 2019 at 22:23, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Here's v6 of this patch. I have rebased on top of today's CLUSTER
monitoring, as well as on table AM commits. The latter caused a bit of
trouble, as now the number of blocks processed by a scan is not as easy
to get as before; I added a new entry point heapscan_get_blocks_done on
heapam.c to help with that. (I suppose this will need some fixups later
on.)
I removed the "M of N" phase labels that Robert didn't like; those were
suggested by Rahila and upvoted by Amit L. I'm of two minds about
those. If you care about those and want them back, please speak up.
I see value in reporting those numbers because it gives user insight into where
we are at in the whole process without having to refer to documentation or code.
Besides here also we are reporting facts as we follow for other metrics.
I agree that it will be most effective if the phases are carried out in succession
which is not the case every time and its relevance varies for each command
as mentioned upthread by Alvaro and Robert. But I feel as long as we have in
the documentation that some phases overlap, some are mutually exclusive hence
may be skipped etc. reporting `phase number versus total phases` does provide
valuable information.
as mentioned upthread by Alvaro and Robert. But I feel as long as we have in
the documentation that some phases overlap, some are mutually exclusive hence
may be skipped etc. reporting `phase number versus total phases` does provide
valuable information.
We are able to give user a whole picture in addition to reporting progress within phases.
Thank you,
--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Alvaro,
Please see few comments below:2. + <entry>
+ When creating an index on a partitioned, this column is set to the
+ total number of partitions on which the index is to be created.
+ </entry>
+ </row>
+ <row>
+ <entry><structfield>partitions_done</structfield></entry>
+ <entry><type>bigint</type></entry>
+ <entry>
+ When creating an index on a partitioned, this column is set to the
I think there is a typo here 's/partitioned/partitioned table/'
3.
+ if (hscan->rs_base.rs_parallel != NULL)
+ {
+ ParallelBlockTableScanDesc bpscan;
+
+ bpscan = (ParallelBlockTableScanDesc) hscan->rs_base.rs_parallel;
+ startblock = bpscan->phs_startblock;
+ }
+ else
+ startblock = hscan->rs_startblock;
+
+ /*
+ * Might have wrapped around the end of the relation, if startblock was
+ * not zero.
+ */
+ if (hscan->rs_cblock > startblock)
+ blocks_done = hscan->rs_cblock - startblock;
+ else
+ blocks_done = hscan->rs_nblocks - startblock +
+ hscan->rs_cblock;
+
+ return blocks_done;
I think parallel scan equivalent bpscan->phs_nblocks along with hscan->rs_nblocks should be used similar to startblock computation above.
Thank you,
Rahila Syed
On Fri, 29 Mar 2019 at 23:46, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Mar-29, Alvaro Herrera wrote:
> So, CLUSTER and ALTER TABLE rewrites only do non-concurrent index
> builds; and REINDEX can reuse pretty much the same wait-for metrics
> columns as CIC. So I think it's okay if I move only the metrics that
> conflict for index_build.
The attached version does it that way. I had to enlarge the param set a
bit more. (I suspect those extra columns will be useful to reindex.)
Also, rebased for recent conflicting changes.
I think we should consider a new column of an array type, where we could
put things like the list of PIDs to be waited for, the list of OIDs of
index to rebuild, or the list of partitions to build the index on.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Here's v6 of this patch. I have rebased on top of today's CLUSTER monitoring, as well as on table AM commits. The latter caused a bit of trouble, as now the number of blocks processed by a scan is not as easy to get as before; I added a new entry point heapscan_get_blocks_done on heapam.c to help with that. (I suppose this will need some fixups later on.) I removed the "M of N" phase labels that Robert didn't like; those were suggested by Rahila and upvoted by Amit L. I'm of two minds about those. If you care about those and want them back, please speak up. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019/03/26 1:53, Alvaro Herrera wrote: > Here's v6 of this patch. I have rebased on top of today's CLUSTER > monitoring, as well as on table AM commits. The latter caused a bit of > trouble, as now the number of blocks processed by a scan is not as easy > to get as before; I added a new entry point heapscan_get_blocks_done on > heapam.c to help with that. (I suppose this will need some fixups later > on.) > > I removed the "M of N" phase labels that Robert didn't like; those were > suggested by Rahila and upvoted by Amit L. I'm of two minds about > those. If you care about those and want them back, please speak up. On second thought, I'm neutral on it too. Thanks, Amit
On 2019-Mar-25, Alvaro Herrera wrote: > Here's v6 of this patch. I have rebased on top of today's CLUSTER > monitoring, as well as on table AM commits. The latter caused a bit of > trouble, as now the number of blocks processed by a scan is not as easy > to get as before; I added a new entry point heapscan_get_blocks_done on > heapam.c to help with that. (I suppose this will need some fixups later > on.) Andres, I suppose you have something to say about patch 0001 here? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-03-25 23:11:00 -0300, Alvaro Herrera wrote: > On 2019-Mar-25, Alvaro Herrera wrote: > > > Here's v6 of this patch. I have rebased on top of today's CLUSTER > > monitoring, as well as on table AM commits. The latter caused a bit of > > trouble, as now the number of blocks processed by a scan is not as easy > > to get as before; I added a new entry point heapscan_get_blocks_done on > > heapam.c to help with that. (I suppose this will need some fixups later > > on.) > > Andres, I suppose you have something to say about patch 0001 here? I've not followed this thread at all, so I might just be out of my depth here. From my POV, a later patch in the yet-to-be-applied patchqueue moves the main part of cluster below the table AM (as there's enough low level details, e.g. dealing with HOT). Therefore I don't have a problem having heap's implementation interrogate the scan in a heap specific manner. Is that the angle you were wondering about? If not, any chance to point out more precisely what to look at? Obviously out of pure laziness, I'd prefer this to go in after my move of index creation scans & cluster below tableam.h. But admittedly, managing my exhaustion isn't the the sole goal of the project.... Greetings, Andres Freund
Hi On 2019-Mar-25, Andres Freund wrote: > I've not followed this thread at all, so I might just be out of my depth > here. From my POV, a later patch in the yet-to-be-applied patchqueue > moves the main part of cluster below the table AM (as there's enough low > level details, e.g. dealing with HOT). Therefore I don't have a problem > having heap's implementation interrogate the scan in a heap specific > manner. > > Is that the angle you were wondering about? If not, any chance to point > out more precisely what to look at? > > Obviously out of pure laziness, I'd prefer this to go in after my move > of index creation scans & cluster below tableam.h. But admittedly, > managing my exhaustion isn't the the sole goal of the project.... Well, this is create index rather than cluster, but yes this conflicts pretty heavily with patch 0008 in your v21 at 20190324031630.nt7numguo5ojq6uv@alap3.anarazel.de. I wonder if I should rather push and help merge your 0008, or wait until you push and deal with it afterwards. I'd rather do the former, I think. Anyway I was thinking about the conceptual angle -- the progress monitoring stuff is doing block-based reporting. I think even if we get a non-block-based heap, we can still report the number of physical blocks already processed by the scan, which is what the index build monitoring is interested in showing the user. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Here's v7. This is rebased on top of yesterday's tableam commit reworking the index build API (thanks Rahila for letting me know it had already rot). No changes otherwise. Got rid of 0001 because the tableam changes made that unnecessary. (Each new table AM will have to include its own progress reporting for index builds in its index_build_range_scan method, which is sensible.) Patch 0003 now takes care of all the AMs. This supports the index build phase as well as the index-validate-heapscan for CONCURRENTLY builds; the indexscan scan there is not reported, which is just a small portion of the index build so I don't feel bad about that; it can probably be added with just a two-line patch on each AM's ambulkdelete method as a subsequent patch. I have not reinstated phase numbers; I have Rahila's positive vote for them. Do I hear any more votes on this issue? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-Mar-10, David Fetter wrote: > Would it be a very large lift to report progress for the rest of the > index types we support? Patch v7 I just posted does that. Please give it a look and let me know what you think. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 28 Mar 2019 at 14:56, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
I have not reinstated phase numbers; I have Rahila's positive vote for
them. Do I hear any more votes on this issue?
If there is a specific technical issue, I'd like to understand that more. If it is just a usability preference, then I say we should have numbers.
Numbering is natural for people. If we say "It's currently doing phase XYZ", they will say "Is that the 3rd phase?", we'll say "No, actually the 5th", and then they will say "Why didn't you just number them?"
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-28, Simon Riggs wrote: > On Thu, 28 Mar 2019 at 14:56, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: > > > I have not reinstated phase numbers; I have Rahila's positive vote for > > them. Do I hear any more votes on this issue? > > If there is a specific technical issue, I'd like to understand that more. There's no technical issue -- that's pretty straightforward. Earlier versions of the patch had them, and removing them only meant editing strings in a couple of places. > If it is just a usability preference, then I say we should have numbers. > > Numbering is natural for people. If we say "It's currently doing phase > XYZ", they will say "Is that the 3rd phase?", we'll say "No, actually the > 5th", and then they will say "Why didn't you just number them?" There are eight phases. If you run normal CREATE INDEX (not concurrent) then you get phases 1, then 3, done. If you run CIC you get phases from 1 to 8. Phase 3 "building index" has arbitrary subphases (they depend on AM) in both cases. I think the lack of phase numbering comes from the fact that the first command we did (VACUUM) sometimes jumps backwards in phase numbers, so it would be a bit absurd from users's POV. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 28 Mar 2019 at 15:39, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Mar-28, Simon Riggs wrote:
> On Thu, 28 Mar 2019 at 14:56, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>
> > I have not reinstated phase numbers; I have Rahila's positive vote for
> > them. Do I hear any more votes on this issue?
>
> If there is a specific technical issue, I'd like to understand that more.
There's no technical issue -- that's pretty straightforward. Earlier
versions of the patch had them, and removing them only meant editing
strings in a couple of places.
> If it is just a usability preference, then I say we should have numbers.
>
> Numbering is natural for people. If we say "It's currently doing phase
> XYZ", they will say "Is that the 3rd phase?", we'll say "No, actually the
> 5th", and then they will say "Why didn't you just number them?"
There are eight phases. If you run normal CREATE INDEX (not concurrent)
then you get phases 1, then 3, done. If you run CIC you get phases from
1 to 8. Phase 3 "building index" has arbitrary subphases (they depend
on AM) in both cases.
Maybe the AM won't know, but I don't think that matters. It's still useful to know the difference between Phase 3.3 and Phase 3.33 and Phase 7.
The description only helps you if you understand what it means. If your AM replies something many users wouldn't understand like "сортировка" or "constructing triples", we still want to know where that step fits in the overall sequence of steps.
I think the lack of phase numbering comes from the fact that the first
command we did (VACUUM) sometimes jumps backwards in phase numbers, so
it would be a bit absurd from users's POV.
Seems more like our own labelling of the phases is responsible for that, rather than it being a specific problem. The numbering should reflect the ordinal executed step number. So if a VACUUM has required two sets of index scanning, the heap scan phase (normally phase 3) should be labelled phase 6 when it occurs the second time, rather than "phase 3 again, doh" which clearly doesn't work.
By the time VACUUM moves to its 2nd phase, which is normally thought of as "Phase2 Index Scanning", we know how much of the table has been scanned, so we really should be able to calculate how many more phases will be needed. We also know how many AM sub-phases will be called for that step.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Mar 28, 2019 at 12:07 PM Simon Riggs <simon@2ndquadrant.com> wrote: > Seems more like our own labelling of the phases is responsible for that, rather than it being a specific problem. The numberingshould reflect the ordinal executed step number. So if a VACUUM has required two sets of index scanning, the heapscan phase (normally phase 3) should be labelled phase 6 when it occurs the second time, rather than "phase 3 again,doh" which clearly doesn't work. That would not be too simple to do with the infrastructure we have available, I think. Also, Alvaro's showed phase names like '3 of 8', but if you regarded each set of index scans as a separate phase rather than a repetition of a phase that had already happened, you wouldn't know whether there were going to be 8 phases in total or some other number, because you don't know how many times you're going to scan the indexes. I suggest that it makes sense to leave the phase numbers out of this commit. If someone wants to make a proposal for adding phase numbers to the various commands that now support progress reporting as a separate commit, then that can be discussed separately. BTW, if we are going to do that, it might be best to put then in separate view columns rather than making them part of the phase names. Somebody might want to look for those phase names using SQL or client-side logic and not have the logic get broken if we renumber the phases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
I just noticed that the CLUSTER calls index_build, which my patch modifies to include additional progress metrics; this means that during the index build phase, the metrics set by CLUSTER will be trashed by the ones my patch introduces. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-03-29 12:02:18 -0300, Alvaro Herrera wrote: > I just noticed that the CLUSTER calls index_build, which my patch > modifies to include additional progress metrics; this means that during > the index build phase, the metrics set by CLUSTER will be trashed by the > ones my patch introduces. Yea, it really seems that the index build infrastructure needs to support cooperating with the caller's progress reporting. For CLUSTER, REINDEX, ALTER TABLE rewrites etc, they all would likely want to have insight into the index build while also having their own progress. Greetings, Andres Freund
On 2019-Mar-29, Alvaro Herrera wrote: > I just noticed that the CLUSTER calls index_build, which my patch > modifies to include additional progress metrics; this means that during > the index build phase, the metrics set by CLUSTER will be trashed by the > ones my patch introduces. Indeed: pid | datid | datname | relid | command | phase | cluster_index_relid | heap_tuples_scanned | heap_tuples_written| heap_blks_total | heap_blks_scanned | index_rebuild_count 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 162402 | 162402 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 460362 | 460362 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 754004 | 754004 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 1047058 | 1047058 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 1356296 | 1356296 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 1645321 | 1645321 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 1939920 | 1939920 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 2227450 | 2227450 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 2526116 | 2526116 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 2828468 | 2828468 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 3142982 | 3142982 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 3451494 | 3451494 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 3769799 | 3769799 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 4077513 | 4077513 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 4383255 | 4383255 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 4700286 | 4700286 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 5015468 | 5015468 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 5324951 | 5324951 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 5628172 | 5628172 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 5940862 | 5940862 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 6253778 | 6253778 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 6560474 | 6560474 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 6881248 | 6881248 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 7186555 | 7186555 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 7503888 | 7503888 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 7821706 | 7821706 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 8137334 | 8137334 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 8453280 | 8453280 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 8764097 | 8764097 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 9079908 | 9079908 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 9389465 | 9389465 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 9706288 | 9706288 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 10024875 | 10024875 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 10344875 | 10344875 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 10658531 | 10658531 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 10974097 | 10974097 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 11289868 | 11289868 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 11602965 | 11602965 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 11917624 | 11917624 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 12237998 | 12237998 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 12559406 | 12559406 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 12876413 | 12876413 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 13192793 | 13192793 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 13506188 | 13506188 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 13822722 | 13822722 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 14138624 | 14138624 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 14432335 | 14432335 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 16387 | 14753617 | 14753617 | 0 | 0 | 0 --> pid | datid | datname | relid | command | phase | cluster_index_relid | heap_tuples_scanned | heap_tuples_written| heap_blks_total | heap_blks_scanned | index_rebuild_count 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 743 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 4639 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 8938 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 13457 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 17806 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 22237 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 26594 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 30935 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 35362 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 39705 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 44126 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 48546 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 52966 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 57174 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 61536 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 65868 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 66193 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 66193 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 66193 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 66193 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 2 | 15000000 | 15000000 | 0 | 66372 | 66193 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 28457 | 12750 | alvherre | 16384 | CLUSTER | index scanning heap | 5 | 15000000 | 15000000 | 0 | 0 | 0 I suppose I can just pick non-overlapping numbers for those columns. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-29, Andres Freund wrote: > Hi, > > On 2019-03-29 12:02:18 -0300, Alvaro Herrera wrote: > > I just noticed that the CLUSTER calls index_build, which my patch > > modifies to include additional progress metrics; this means that during > > the index build phase, the metrics set by CLUSTER will be trashed by the > > ones my patch introduces. > > Yea, it really seems that the index build infrastructure needs to > support cooperating with the caller's progress reporting. For CLUSTER, > REINDEX, ALTER TABLE rewrites etc, they all would likely want to have > insight into the index build while also having their own progress. So, CLUSTER and ALTER TABLE rewrites only do non-concurrent index builds; and REINDEX can reuse pretty much the same wait-for metrics columns as CIC. So I think it's okay if I move only the metrics that conflict for index_build. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Mar-29, Alvaro Herrera wrote: > So, CLUSTER and ALTER TABLE rewrites only do non-concurrent index > builds; and REINDEX can reuse pretty much the same wait-for metrics > columns as CIC. So I think it's okay if I move only the metrics that > conflict for index_build. The attached version does it that way. I had to enlarge the param set a bit more. (I suspect those extra columns will be useful to reindex.) Also, rebased for recent conflicting changes. I think we should consider a new column of an array type, where we could put things like the list of PIDs to be waited for, the list of OIDs of index to rebuild, or the list of partitions to build the index on. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Fri, Mar 29, 2019 at 2:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I think we should consider a new column of an array type, where we could > put things like the list of PIDs to be waited for, the list of OIDs of > index to rebuild, or the list of partitions to build the index on. This has to work with a fixed-size chunk of shared memory. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-29, Robert Haas wrote: > On Fri, Mar 29, 2019 at 2:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > I think we should consider a new column of an array type, where we could > > put things like the list of PIDs to be waited for, the list of OIDs of > > index to rebuild, or the list of partitions to build the index on. > > This has to work with a fixed-size chunk of shared memory. Bah, of course. Maybe we can consider using dynamic shmem for that, and include a pointer to it in the fixed-size chunk. (It's a bit too late to be writing this code, mind; I'm just proposing this for a future improvement.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Mar 29, 2019 at 3:28 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2019-Mar-29, Robert Haas wrote: > > On Fri, Mar 29, 2019 at 2:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > > I think we should consider a new column of an array type, where we could > > > put things like the list of PIDs to be waited for, the list of OIDs of > > > index to rebuild, or the list of partitions to build the index on. > > > > This has to work with a fixed-size chunk of shared memory. > > Bah, of course. > > Maybe we can consider using dynamic shmem for that, and include a > pointer to it in the fixed-size chunk. (It's a bit too late to be > writing this code, mind; I'm just proposing this for a future > improvement.) Sounds expensive. We don't want to spend a lot of energy pushing out progress reports which, often enough, nobody will ever examine. I designed the current system as I did to make it cheap. Adding DSM in there would open up lots of exciting new failure possibilities and significantly increase the overhead. And probably add quite a bit of code complexity, too. There's probably room for an elaborate progress-reporting facility in PostgreSQL that can even handle arbitrary stuff like queries. But I think it might look a lot different from this one, which is designed and intended to handle simple cases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-Mar-29, Robert Haas wrote: > On Fri, Mar 29, 2019 at 3:28 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Maybe we can consider using dynamic shmem for that, and include a > > pointer to it in the fixed-size chunk. (It's a bit too late to be > > writing this code, mind; I'm just proposing this for a future > > improvement.) > > Sounds expensive. We don't want to spend a lot of energy pushing out > progress reports which, often enough, nobody will ever examine. I > designed the current system as I did to make it cheap. Well, I'm not proposing this for things that would change more than once or a very limited number of times during one command; certainly not once per tuple or per block like other metrics do. The examples I mentioned are once per command (eg., list of OIDs of partitions to process) or list of PIDs to wait for, which we wouldn't modify it once set for each waiting cycle (three times for CREATE INDEX CONCURRENTLY, five times for REINDEX CONCURRENTLY). > Adding DSM in there would open up lots of exciting new failure > possibilities and significantly increase the overhead. And probably > add quite a bit of code complexity, too. Yeah, that's true. Anyway, I'm not intending to tackle this for the time being. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Rahila, thanks for reviewing. On 2019-Mar-25, Rahila Syed wrote: > Please see few comments below: > > 1. Makecheck fails currently as view definition of expected rules.out does > not reflect latest changes in progress metrics numbering. Ouch ... fixed. > 2. + <entry> > I think there is a typo here 's/partitioned/partitioned table/' Ah, so there is. Fixed. > 3. > I think parallel scan equivalent bpscan->phs_nblocks along with > hscan->rs_nblocks should be used similar to startblock computation above. Hmm, yeah. I think the way things are setup currently it doesn't matter much, but it seems fragile to rely on that. I also moved the reporting of total blocks to scan in the initial table scan so that it occurs wholly in heapam_index_build_range_scan; I had originally put that code in _bt_spools_heapscan, but that was a modularity violation I think. (It didn't make a practical difference, but it made no sense to have the two cases report the number in wildly different locations.) Also added a final nblocks metric update after the scan is done. I think this patch is done. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
Hi,
On Mon, 1 Apr 2019 at 21:40, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Hi Rahila, thanks for reviewing.
On 2019-Mar-25, Rahila Syed wrote:
> Please see few comments below:
>
> 1. Makecheck fails currently as view definition of expected rules.out does
> not reflect latest changes in progress metrics numbering.
Ouch ... fixed.
> 2. + <entry>
> I think there is a typo here 's/partitioned/partitioned table/'
Ah, so there is. Fixed.
> 3.
> I think parallel scan equivalent bpscan->phs_nblocks along with
> hscan->rs_nblocks should be used similar to startblock computation above.
Hmm, yeah. I think the way things are setup currently it doesn't matter
much, but it seems fragile to rely on that.
Thank you for incorporating the review comments.
I also moved the reporting of total blocks to scan in the initial table
scan so that it occurs wholly in heapam_index_build_range_scan; I had
originally put that code in _bt_spools_heapscan, but that was a
modularity violation I think. (It didn't make a practical difference,
but it made no sense to have the two cases report the number in wildly
different locations.) Also added a final nblocks metric update after
the scan is done.
I think this patch is done.
I tested the v8 patch by running plain CREATE INDEX, CIC, and for partitioned tables
and the results are as expected. Please see few observations below.
and the results are as expected. Please see few observations below.
1. FWIW, below results for CIC show that blocks_done does not become equal to blocks_total at the end of the phase or it processes last 800 blocks so fast that
the update is not visible in less than 1 secs interval.
Mon Mar 25 11:06:31 IST 2019
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16384 | building index: table scan | 0 | 0 | 0 | 1293366 | 1292535 | 0 | 0 | 0 | 0
(1 row)
Mon Mar 25 11:06:31 IST 2019
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16384 | building index: sorting tuples, spool 1 | 0 | 0 | 0 | 0 | 0 | 200000000 | 0 | 0 | 0
(1 row)
2. However in case of partitioned tables, the following difference in blocks_done versus blocks_total at the end of phase is notably high for the first partition . Subsequent partitions show negligible difference.
Partition 1:
Mon Mar 25 14:27:57 IST 2019
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: table scan | 0 | 0 | 0 | 381342 | 221233 | 0 | 0 | 3 | 0
(1 row)
Mon Mar 25 14:27:57 IST 2019
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: sorting tuples, spool 1 | 0 | 0 | 0 | 0 | 0 | 49999999 | 0 | 3 | 0
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: table scan | 0 | 0 | 0 | 381342 | 221233 | 0 | 0 | 3 | 0
(1 row)
Mon Mar 25 14:27:57 IST 2019
pid | datid | datname | relid | phase | lockers_total | lockers_done | current_locker_pid | blocks_total | blocks_done | tuples_total | tuples_done | partitions_total | partitions_done
-------+-------+----------+-------+-----------------------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+-----------------
10630 | 13533 | postgres | 16394 | building index: sorting tuples, spool 1 | 0 | 0 | 0 | 0 | 0 | 49999999 | 0 | 3 | 0
The partitions are equal in size and the other two partitions have blocks_done and blocks_total to be approx. 221233. The blocks_total for partition 1 is reported higher.
3. Sorry for nitpicking, I think following phase name can be made more consistent with the others.
The non-am specific phase for scanning a table is named as scan heap while am-specific one is named as table scan.
Can we use heap for am-specific one as well since heap is used elsewhere in progress reporting too?
+ scan = table_beginscan_parallel(btspool->heap,
+ ParallelTableScanFromBTShared(btshared));
Is this change required?
Besides the above comments ,the patch looks good to me.
Thank you,
--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Apr-02, Rahila Syed wrote: > 1. FWIW, below results for CIC show that blocks_done does not become equal > to blocks_total at the end of the phase or it processes last 800 blocks so > fast that > the update is not visible in less than 1 secs interval. Yeah, I noticed this too and decided it's not fixable, nor it's desirable to spend a lot of effort in getting it perfectly accurate -- I mean, we could introduce locking or sleeping to get the results we want, but do we really want to make the index building process slower just to report those block numbers. Anyway, I think this effect is caused by parallel btree building: those final blocks are scanned by a worker process, and the leader didn't get the latest block number scanned. If you set max_parallel_maintenance_workers to 0, the effect disappears. (I used \watch 0.01 to see even faster progress updates; even in that case the final batch of block numbers is not seen in the updates. The btree build code is stupidly fast.) > 2. However in case of partitioned tables, the following difference in > blocks_done versus blocks_total at the end of phase is notably high for the > first partition . Subsequent partitions show negligible difference. > Partition 1: > Mon Mar 25 14:27:57 IST 2019 > pid | datid | datname | relid | phase | > lockers_total | lockers_done | current_locker_pid | blocks_total | > blocks_done | tuples_total | tuples_done | partitions_total | > partitions_done > -------+-------+----------+-------+----------------------------+---------------+--------------+--------------------+--------------+-------------+--------------+-------------+------------------+----------------- > 10630 | 13533 | postgres | 16394 | building index: table scan > | 0 | 0 | 0 | 381342 | > 221233 | 0 | 0 | 3 | 0 > (1 row) Hmm, in my tests with partitioned tables, I never noticed such a large discrepancy. I'm going to have another look. 800 blocks scanned by workers I can believe, but 160000 sounds a bit too much. > 3. Sorry for nitpicking, I think following phase name can be made more > consistent with the others. > The non-am specific phase for scanning a table is named as scan heap while > am-specific one is named as table scan. > Can we use heap for am-specific one as well since heap is used elsewhere in > progress reporting too? Hmm, I'd rather go the other way and use "table" everywhere rather than heap, since we've been getting a lot of stuff done for table AMs. > 4. - scan = table_beginscan_parallel(btspool->heap, > ParallelTableScanFromBTShared(btshared)); > + scan = table_beginscan_parallel(btspool->heap, > + > ParallelTableScanFromBTShared(btshared)); > > Is this change required? Yes, for my OCD. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I did this (I should stop c&p'ing this silly little setup code sometime): create table t (a int) partition by hash (a); create table t1 partition of t for values with (modulus 10, remainder 0); create table t2 partition of t for values with (modulus 10, remainder 1); create table t3 partition of t for values with (modulus 10, remainder 2); create table t4 partition of t for values with (modulus 10, remainder 3); create table t5 partition of t for values with (modulus 10, remainder 4); create table t6 partition of t for values with (modulus 10, remainder 5); create table t7 partition of t for values with (modulus 10, remainder 6); create table t8 partition of t for values with (modulus 10, remainder 7); create table t9 partition of t for values with (modulus 10, remainder 8); create table t10 partition of t for values with (modulus 10, remainder 9); insert into t select * from generate_series(1, 100 * 1000 * 1000); Here's a complete report for CIC on a partition (since partitioned tables don't support CIC anyway): select relid::regclass, phase, format('lockers: %s/%s (%s)', lockers_done, lockers_total, current_locker_pid) as lockers, format('blocks: %s/%s', blocks_done, blocks_total) as blocks, format('tuples: %s/%s', tuples_done, tuples_total) as tuples, format('partitions: %s/%s', partitions_done, partitions_total) as partitions from pg_stat_progress_create_index \watch 0,1 lun 01 abr 2019 19:02:31 -03 (cada 0,1s) relid | phase | lockers | blocks | tuples | partitions -------+--------------------------------+------------------+------------------+-------------+----------------- t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 86/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 10890/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 17841/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 22899/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 29668/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 35531/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 41375/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 291737/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1652429/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2984365/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4139066/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5463784/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6699498/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7947694/10001366 | partitions:0/0 t2 | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9311113/10001366 | partitions:0/0 t2 | index validation: scan index | lockers: 0/0 (0) | blocks: 2542/27426 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan index | lockers: 0/0 (0) | blocks: 7667/27426 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan index | lockers: 0/0 (0) | blocks: 15334/27426 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan index | lockers: 0/0 (0) | blocks: 23001/27426 | tuples: 0/0 | partitions: 0/0 t2 | index validation: sort index scan results | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 2586/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 8180/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 13807/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 19365/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 24990/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 30943/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 36817/44254 | tuples: 0/0 | partitions: 0/0 t2 | index validation: scan heap | lockers: 0/0 (0) | blocks: 43025/44254 | tuples: 0/0 | partitions: 0/0 Here's the report for building the index on the partitioned table, same query, this time \watch 0,05: t | building index: scanning table | lockers: 0/0 (0) | blocks: 168/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2022/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4378/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 6419/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8599/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 10955/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 12976/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 15627/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17389/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20213/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22351/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 24788/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26880/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29382/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31289/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 33826/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35815/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38415/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40335/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43002/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 361108/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1054447/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1749066/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2433169/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3129578/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3626695/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4308212/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5001391/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5698987/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6391459/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6894091/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7593037/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8286403/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8974321/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9667525/9996429 | partitions:0/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9996429/9996429 | partitions:0/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2476/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 6717/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8952/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11245/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13361/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 15777/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17716/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20335/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22114/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 24831/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26529/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29405/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31153/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 33981/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35595/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38511/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40114/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43040/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44254/44254 | tuples: 0/0 | partitions: 1/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/10001366 | partitions: 1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 500779/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1201309/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1883508/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2582863/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3258358/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3750979/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4430740/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5105701/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5798905/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6484789/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6992211/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7689065/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8391543/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9087049/10001366 | partitions:1/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9764485/10001366 | partitions:1/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 269/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2861/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 6016/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8266/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 9916/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 12793/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 14423/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17302/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 18936/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 21837/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 23480/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26369/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 27993/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 30891/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 32525/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35357/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 37096/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39846/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 41645/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44276/44276 | tuples: 0/0 | partitions: 2/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/10006231 | partitions: 2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 663688/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1355654/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2057087/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2764765/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3377603/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3987118/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4685206/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5387918/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6091400/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6694322/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7304995/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7984067/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8672517/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9351567/10006231 | partitions:2/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 10006230/10006231 | partitions:2/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 1178/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 3833/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 5288/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8125/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 9760/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 12663/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 14323/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17252/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 18880/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 21807/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 23444/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26360/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 28008/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 30932/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 32587/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35375/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 37090/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39827/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 41657/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44255/44255 | tuples: 0/0 | partitions: 3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 118053/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 820650/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1512165/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2201720/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2883192/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3388429/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4077607/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4770319/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5453432/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6130568/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6725569/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7327477/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7989790/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8685547/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9365865/10001417 | partitions:3/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9981553/10001417 | partitions:3/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 1060/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 3570/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4880/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7585/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 9172/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 12060/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13670/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16529/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 18132/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20976/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22557/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25393/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26973/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29800/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31372/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34214/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35789/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38667/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39728/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40861/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43730/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 4/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/10002444 | partitions: 4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 518863/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1211461/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1907959/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2597929/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3285807/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3804693/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4495350/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5195042/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5890039/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6587559/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7097473/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7786066/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8480443/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9174157/10002444 | partitions:4/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9855283/10002444 | partitions:4/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 559/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 3158/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7416/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 9069/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11908/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13567/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16473/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 18147/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 21024/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22690/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25582/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 27239/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 30148/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31812/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34722/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 36378/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39290/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40959/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43844/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44249/44249 | tuples: 0/0 | partitions: 5/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/10000082 | partitions: 5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 701199/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1386576/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2026138/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2686458/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3328278/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3886189/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4566129/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5250205/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5936354/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6632954/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7131820/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7801535/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8472974/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9122825/10000082 | partitions:5/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9801504/10000082 | partitions:5/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 429/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 3011/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7332/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8987/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11878/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13537/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16425/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 18106/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 21040/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22718/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25635/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 27312/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 30215/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31876/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34800/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 36455/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39359/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 41032/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43943/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44220/44220 | tuples: 0/0 | partitions: 6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 25987/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 732379/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1424513/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2081443/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2704663/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3342516/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3812257/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4489696/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5146693/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5833638/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6524497/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7029547/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7715519/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8417783/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9099477/9993603 | partitions:6/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9781549/9993603 | partitions:6/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 368/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2921/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7160/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8878/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11603/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13265/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16175/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17962/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20751/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22442/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25320/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26957/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29858/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31467/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34382/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 36023/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38874/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40532/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43439/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44259/44259 | tuples: 0/0 | partitions: 7/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/10002524 | partitions: 7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 560347/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1248199/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1947121/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2634011/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3308164/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3829093/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4522592/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5201695/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5894625/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6592741/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7108453/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7782573/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8433007/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9098304/10002524 | partitions:7/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9769812/10002524 | partitions:7/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 300/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2880/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7121/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8974/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11610/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13440/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16136/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17879/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20700/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22293/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25260/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26863/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29775/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31406/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34318/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35974/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38869/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 40507/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 43431/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44232/44232 | tuples: 0/0 | partitions: 8/10 t | building index: sorting live tuples | lockers: 0/0 (0) | blocks: 0/0 | tuples: 0/9996288 | partitions: 8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 556257/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1261969/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1962859/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2666023/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3351782/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3870067/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4467186/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5040844/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5596873/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6273009/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6744020/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7260343/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7836896/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8362115/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8982007/9996288 | partitions:8/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9683282/9996288 | partitions:8/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 257/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 2853/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 4639/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 7071/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 8961/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 11536/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 13377/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 16115/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 17798/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 20649/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 22334/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 25185/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 26818/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 29740/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 31283/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 34208/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 35835/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 38531/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 39637/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 41722/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44082/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: scanning table | lockers: 0/0 (0) | blocks: 44247/44247 | tuples: 0/0 | partitions: 9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 89251/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 724681/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1352185/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 1978963/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 2545165/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3116125/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 3495200/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4038079/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 4612813/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5239737/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 5738502/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6336605/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 6779349/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 7417770/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8041496/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 8722536/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9142846/9999616 | partitions:9/10 t | building index: loading tuples in tree | lockers: 0/0 (0) | blocks: 0/0 | tuples: 9782646/9999616 | partitions:9/10 I think this is working well enough. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I noticed a couple of inconsistent uses of "heap" which I changed to "table", and also changed one update_multi_params() call that had only one param to update_param(). Very minor other tweaks here and there. And pushed. Thanks for the reviews! -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
It seems we can easily extend this to also monitor REINDEX [CONCURRENTLY]. Attached is a quick patch. For the concurrently part, we currently don't have any phases defined for the index swap and drop, but maybe we can just skip that initially. What happens if we don't have those? It might be nice to have a column in the view not only for the table OID but also the index OID. That is obviously not so useful for CREATE INDEX but more useful for REINDEX. I haven't looked into how adding that would work. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-Apr-05, Peter Eisentraut wrote: > It seems we can easily extend this to also monitor REINDEX > [CONCURRENTLY]. Attached is a quick patch. That's much easier than I was expecting. I think we should endeavor to get this done for pg12. > For the concurrently part, we currently don't have any phases defined > for the index swap and drop, but maybe we can just skip that initially. > What happens if we don't have those? Users are going to wonder why the other phases don't appear to complete for a long time :-) Keep in mind that the "waiting" phases are very confusing to users. I suggest we just define additional phase numbers for those phases, then switch the "false" argument to WaitForLockersMultiple to "true", and it should work :-) Doc-wise, list all the phases in the same docbook table, indicate that REINDEX is also covered, and document in an easier-to-follow fashion which phases each command goes through. > It might be nice to have a column in the view not only for the table OID > but also the index OID. That is obviously not so useful for CREATE > INDEX but more useful for REINDEX. I haven't looked into how adding > that would work. Yeah, I think that's simple enough -- the CLUSTER one already does that, I think. Another thing for REINDEX TABLE is that we should add a count of indexes to process, and how many are done. I was wondering about reporting the command being run. In the progress_cluster view we have a "command" column, which says either CLUSTER or VACUUM FULL. I didn't add one for CREATE INDEX vs. CONCURRENTLY because it seemed to me that joining to pg_stat_activity ought to be sufficient. If we agree with that reasoning, then we should remove the column from the CLUSTER view, I think. If not, we should add one to the create_index view. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-04-05 17:01, Alvaro Herrera wrote: > Users are going to wonder why the other phases don't appear to complete > for a long time :-) Keep in mind that the "waiting" phases are very > confusing to users. I suggest we just define additional phase numbers > for those phases, then switch the "false" argument to > WaitForLockersMultiple to "true", and it should work :-) Doc-wise, list > all the phases in the same docbook table, indicate that REINDEX is also > covered, and document in an easier-to-follow fashion which phases each > command goes through. Done in the attached patch. I've reworded the phases a bit. There was a bit of a mixup of waiting for snapshots and waiting for lockers. Perhaps not so important from a user's perspective, but at least now it's more consistent with the source code comments. > Yeah, I think that's simple enough -- the CLUSTER one already does that, > I think. Added that. > Another thing for REINDEX TABLE is that we should add a count > of indexes to process, and how many are done. Reasonable, but maybe a bit too much for the last moment. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On 2019-Apr-05, Peter Eisentraut wrote: > I've reworded the phases a bit. There was a bit of a mixup of waiting > for snapshots and waiting for lockers. Perhaps not so important from a > user's perspective, but at least now it's more consistent with the > source code comments. No disagreement with that. Looks reasonable. I didn't test the patch, but it seems OK in a quick once-over. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-04-06 06:40, Alvaro Herrera wrote: > On 2019-Apr-05, Peter Eisentraut wrote: > >> I've reworded the phases a bit. There was a bit of a mixup of waiting >> for snapshots and waiting for lockers. Perhaps not so important from a >> user's perspective, but at least now it's more consistent with the >> source code comments. > > No disagreement with that. Looks reasonable. > > I didn't test the patch, but it seems OK in a quick once-over. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services