Re: error context for vacuum to include block number (atomicprogress update)

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: error context for vacuum to include block number (atomicprogress update)
Дата
Msg-id 20191229201747.GL12890@telsasoft.com
обсуждение исходный текст
Ответ на Re: error context for vacuum to include block number  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: error context for vacuum to include block number (atomicprogress update)  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On Sat, Dec 28, 2019 at 07:21:31PM -0500, Robert Haas wrote:
> On Thu, Dec 26, 2019 at 10:57 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > I agree that's better.
> > I don't see any reason why the progress params need to be updated atomically.
> > So rebasified against your patch.
> 
> I am not sure whether it's important enough to make a stink about, but
> it bothers me a bit that this is being dismissed as unimportant. The
> problem is that, if the updates are not atomic, then somebody might
> see the data after one has been updated and the other has not yet been
> updated. The result is that when the phase is
> PROGRESS_VACUUM_PHASE_VACUUM_INDEX, someone reading the information
> can't tell whether the number of index scans reported is the number
> *previously* performed or the number performed including the one that
> just finished. The race to see the latter state is narrow, so it
> probably wouldn't come up often, but it does seem like it would be
> confusing if it did happen.

What used to be atomic was this:

-               hvp_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_HEAP;
-               hvp_val[1] = vacrelstats->num_index_scans + 1;

=> switch from PROGRESS_VACUUM_PHASE_VACUUM INDEX to HEAP and increment
index_vacuum_count, which is documented as the "Number of completed index
vacuum cycles."

Now, it 1) increments the number of completed scans; and, 2) then progresses
phase to HEAP, so there's a window where the number of completed scans is
incremented, and it still says VACUUM_INDEX.

Previously, if it said VACUUM_INDEX, one could assume that index_vacuum_count
would increase at least once more, and that's no longer true.  If someone sees
VACUUM_INDEX and some NUM_INDEX_VACUUMS, and then later sees VACUUM_HEAP or
other later stage, with same (maybe final) value of NUM_INDEX_VACUUMS, that's
different than previous behavior.

It seems to me that a someone or their tool monitoring pg_stat shouldn't be
confused by this change, since:
1) there's no promise about how high NUM_INDEX_VACUUMS will or won't go; and, 
2) index_vacuum_count didn't do anything strange like decreasing, or increased
before the scans were done; and,
3) the vacuum can finish at any time, and the monitoring process presumably
knows that when the PID is gone, it's finished, even if it missed intermediate
updates;

The behavior is different from before, but I think that's ok: the number of
scans is accurate, and the PHASE is accurate, even though it'll change a moment
later.

I see there's similar case here:
|    /* report all blocks vacuumed; and that we're cleaning up */
|    pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
|    pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,
|                                 PROGRESS_VACUUM_PHASE_INDEX_CLEANUP);

heap_blks_scanned is documented as "Number of heap blocks SCANNED", and it
increments exactly to heap_blks_total.  Would someone be confused if
heap_blks_scanned==heap_blks_total AND phase=='scanning heap' ?  I think they'd
just expect PHASE to be updated a moment later.  (And if it wasn't, I agree they
should then be legitimately confused or concerned).

Actually, the doc says:
|If heap_blks_scanned is less than heap_blks_total, the system will return to
|scanning the heap after this phase is completed; otherwise, it will begin
|cleaning up indexes AFTER THIS PHASE IS COMPLETED.

I read that to mean that it's okay if heap_blks_scanned==heap_blks_total when
scanning/vacuuming heap.

Justin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: TAP testing for psql's tab completion code
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [HACKERS] Block level parallel vacuum