Re: [PROPOSAL] VACUUM Progress Checker.

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [PROPOSAL] VACUUM Progress Checker.
Дата
Msg-id CA+TgmoZqXUhTxvX6Z5M76_PFo63DJsKqCFxk-zC+KxpXKDBt=g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PROPOSAL] VACUUM Progress Checker.  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [PROPOSAL] VACUUM Progress Checker.  (Michael Paquier <michael.paquier@gmail.com>)
Список pgsql-hackers
On Thu, Dec 10, 2015 at 9:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Oh, please, no.  Gosh, this is supposed to be a lightweight facility!
>> Just have a chunk of shared memory and write the data in there.  If
>> you try to feed this through the stats collector you're going to
>> increase the overhead by 100x or more, and there's no benefit.  We've
>> got to do relation stats that way because there's no a priori bound on
>> the number of relations, so we can't just preallocate enough shared
>> memory for all of them.  But there's no similar restriction here: the
>> number of backends IS fixed at startup time.  As long as we limit the
>> amount of progress information that a backend can supply to some fixed
>> length, which IMHO we definitely should, there's no need to add the
>> expense of funneling this through the stats collector.
>
> I agree with this, and I'd further add that if we don't have a
> fixed-length progress state, we've overdesigned the facility entirely.
> People won't be able to make sense of anything that acts much more
> complicated than "0% .. 100% done".  So you need to find a way of
> approximating progress of a given command in terms more or less
> like that, even if it's a pretty crude approximation.

That I don't agree with.  Even for something like VACUUM, it's pretty
hard to approximate overall progress - because, for example, normally
we'll only have 1 index scan per index, but we might have multiple
index scans or none if maintenance_work_mem is too small or if there
aren't any dead tuples after all.  I don't want our progress reporting
facility to end up with this reputation:

https://xkcd.com/612/

This point has already been discussed rather extensively upthread, but
to reiterate, I think it's much better to report slightly more
detailed information and let the user figure out what to do with it.
For example, for a VACUUM, I think we should report something like
this:

1. The number of heap pages scanned thus far.
2. The number of dead tuples found thus far.
3. The number of dead tuples we can store before we run out of
maintenance_work_mem.
4. The number of index pages processed by the current index vac cycle
(or a sentinel value if none is in progress).
5. The number of heap pages for which the "second heap pass" has been completed.

Now, if the user wants to flatten this out to a progress meter, they
can write an SQL expression which does that easily enough, folding the
sizes of the table and its indices and whatever assumptions they want
to make about what will happen down the road.  If we all agree on how
that should be done, it can even ship as a built-in view.  But I
*don't* think we should build those assumptions into the core progress
reporting facility.  For one thing, that would make updating the
progress meter considerably more expensive - you'd have to recompute a
new completion percentage instead of just saying "heap pages processed
went up by one".  For another thing, there are definitely going to be
some people that want the detailed information - and I can practically
guarantee that if we don't make it available, at least one person will
write a tool that tries to reverse-engineer the detailed progress
information from whatever we do report.

Heck, I might do it myself.  If I find a long-running vacuum on a
customer system that doesn't seem to be making progress, knowing that
it's 69% complete and that the completion percentage isn't rising
doesn't help me much.  What I want to know is are we stuck in a heap
vacuum phase, or an index vacuum phase, or a second heap pass.  Are we
making progress so slowly that 69% takes forever to get to 70%, or are
we making absolutely no progress at all?  I think if we don't report a
healthy amount of detail here people will still frequently have to
resort to what I do now, which is ask the customer to install strace
and attach it to the vacuum process.  For many customers, that's not
so easy; and it never inspires any confidence.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Is postgresql on Windows compatible with flex 2.6.0?
Следующее
От: Robert Haas
Дата:
Сообщение: Re: mdnblocks() sabotages error checking in _mdfd_getseg()