Re: [HACKERS] CLUSTER command progress monitor

From Peter Geoghegan
Subject Re: [HACKERS] CLUSTER command progress monitor
Date
Msg-id CAH2-Wzm2=6Jq4E-Kht=DaAG4BKsC+cyCu=dHu=AgV=2VW2b0WA@mail.gmail.com
In response to Re: [HACKERS] CLUSTER command progress monitor  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Tue, Dec 18, 2018 at 2:47 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Well, if you think about individual blocks in terms of storage space,
> maybe that's true, but I meant in an Heraclitus way of men never
> stepping into the same river -- the second time you write the block,
> it's not the same block you wrote before, so you count it twice.  It's
> not the actual disk space utilization that matters, but how much I/O
> have you done (even if it is just to kernel cache, I suppose).

Right.

> I suppose mapping such numbers to actual progress is a bit of an art (or
> intuition as you say), but it seems to be the best we can do, if we do
> anything at all.

I think that it's fairly useful. I suspect that you don't have to have
my theoretical grounding in sorting to be able to do almost as well.
All you need is a little bit of experience.

> How good are those predictions?  The feeling I get from this thread is
> that if the estimation of the number of passes is unreliable, it's
> better not to report it at all; just return how many we've done thus
> far.  It's undesirable to report that we're about 150% done (or take
> hours to get to 40% done, then suddenly be over).

Maybe it isn't that reliable. But on second thought, that might not
matter; or maybe we should just not report the estimate at all.

"How slow can I make this sort go by subtracting work_mem?" is a game
that I like to play sometimes. This blogpost plays that game, and
reaches some pretty amusing conclusions:

https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/

It says that sorting numeric is 60% slower when you do an external
sort. But it's an apples to asteroids comparison, because the
comparison is made between 4MB of work_mem, and 1GB. I think that it's
pretty damn impressive that it's only 60% slower! Besides, even that
difference is probably on the high side of average, because numeric
abbreviated keys work particularly well, and you won't get the same
benefit with unique numeric values when you happen to be doing a lot
of merging. If you tried the same experiment with integers, or even
text + abbreviated keys, I bet the difference would be a lot smaller.
Despite the huge gap in the amount of memory used.

On modern hardware, where doing some amount of random I/O is not that
noticeable, you'll have a very hard time finding a case where even a
paltry amount of memory with many passes does all that much worse than
an internal sort (OTOH, it's not hard to find cases where an external
sort is *faster*). Even if you make a generic estimate, it's still
probably going to be pretty good, because there just isn't that much
variation in how long the sort will take as you vary the amount of
memory it can use. Some people will be surprised at this, but it's a
pretty robust effect. (This is why I think that a hash_mem GUC might
be a good medium term solution that improves upon work_mem -- the
situation is dramatically different when it comes to hashing.)
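The robustness of that effect falls out of the arithmetic of multi-way
merging: the number of merge passes grows only logarithmically in the
number of initial runs. Here is a toy model of that (my own sketch, not
PostgreSQL's actual tuplesort code; the merge_order of 500 tapes is an
illustrative assumption):

```python
import math

def estimated_merge_passes(input_mb, work_mem_mb, merge_order=500):
    """Toy model: initial quicksorted runs are roughly work_mem-sized,
    and each merge pass combines up to merge_order runs at once."""
    runs = max(1, math.ceil(input_mb / work_mem_mb))
    if runs == 1:
        return 0  # fits in memory: internal sort, no merging at all
    # passes needed to reduce `runs` runs down to one sorted output
    return max(1, math.ceil(math.log(runs, merge_order)))

# A 10GB sort needs at most 2 passes even with a paltry 4MB work_mem,
# and 1 pass with 1GB -- a 256x memory difference barely moves the needle.
for wm in (4, 64, 1024):
    print(wm, estimated_merge_passes(10_000, wm))
```

That the pass count varies so little across wildly different work_mem
settings is exactly why even a crude, generic estimate of total sort
work tends to hold up in practice.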

My point is that you could offer users the kind of insight they'd find
very useful with only a very crude estimate of the amount of merging.
Even if it was 60% slower than initially projected, that's still not
an awful estimate to most users. That just leaves initial run
generation, but it's relatively easy to accurately estimate the number
of initial runs. I rarely see a case where merging takes more than 40%
of the total, barring parallel CREATE INDEX.
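To make that concrete, a progress mapping along those lines could be as
simple as the following (a hypothetical sketch of the idea, not anything
PostgreSQL actually does; the 40% merge share is the rule-of-thumb figure
from above):

```python
def crude_progress(blocks_done, total_blocks, merge_share=0.4):
    """Map raw I/O counts to a progress fraction under a simple model:
    the first total_blocks of I/O is initial run generation, after which
    one merge pass re-reads and re-writes everything once.  merge_share
    is the assumed fraction of total time spent merging."""
    if blocks_done <= total_blocks:
        # still generating runs: scale into the run-generation share
        return (blocks_done / total_blocks) * (1 - merge_share)
    # merging: scale the remaining I/O into the merge share, capped at 1.0
    merged = min(blocks_done - total_blocks, total_blocks)
    return (1 - merge_share) + (merged / total_blocks) * merge_share
```

Even when the merge phase ends up slower than the model assumes, the
reported fraction degrades gracefully rather than overshooting 100%,
which addresses Alvaro's "150% done" concern.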

> I wonder if internal sorts are really all that interesting from the PoV
> of progress reporting.  Also, I have the impression that quicksort isn't
> very amenable to letting you know how much work is left.

It is hard to predict the duration of one massive quicksort, but it
seems fairly easy to recognize a kind of cadence across multiple
quicksorts/runs that each take seconds to a couple of minutes. That's
going to be the vast, vast majority of cases we care about.

-- 
Peter Geoghegan

