Re: [HACKERS] CLUSTER command progress monitor
From: Peter Geoghegan
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date:
Msg-id: CAH2-Wzm2=6Jq4E-Kht=DaAG4BKsC+cyCu=dHu=AgV=2VW2b0WA@mail.gmail.com
In reply to: Re: [HACKERS] CLUSTER command progress monitor (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On Tue, Dec 18, 2018 at 2:47 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Well, if you think about individual blocks in terms of storage space,
> maybe that's true, but I meant in an Heraclitus way of men never
> stepping into the same river -- the second time you write the block,
> it's not the same block you wrote before, so you count it twice. It's
> not the actual disk space utilization that matters, but how much I/O
> have you done (even if it is just to kernel cache, I suppose).

Right.

> I suppose mapping such numbers to actual progress is a bit of an art (or
> intuition as you say), but it seems to be the best we can do, if we do
> anything at all.

I think that it's fairly useful. I suspect that you don't have to have my theoretical grounding in sorting to be able to do almost as well. All you need is a little bit of experience.

> How good are those predictions? The feeling I get from this thread is
> that if the estimation of the number of passes is unreliable, it's
> better not to report it at all; just return how many we've done thus
> far. It's undesirable to report that we're about 150% done (or take
> hours to get to 40% done, then suddenly be over).

Maybe it isn't that reliable. But on second thought I think that it might not matter, and maybe we should just not do that.

"How slow can I make this sort go by subtracting work_mem?" is a game that I like to play sometimes. This blogpost plays that game, and reaches some pretty amusing conclusions:

https://www.cybertec-postgresql.com/en/postgresql-improving-sort-performance/

It says that sorting numeric is 60% slower when you do an external sort. But it's an apples to asteroids comparison, because the comparison is made between 4MB of work_mem and 1GB. I think that it's pretty damn impressive that it's only 60% slower!
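For what it's worth, the kind of pass estimate being discussed can be sketched in a few lines. This is a hypothetical illustration, not PostgreSQL's tuplesort code: it assumes initial runs of roughly work_mem size and a fixed merge order, both of which tuplesort.c actually sizes dynamically.

```python
import math

def estimate_merge_passes(input_bytes, work_mem_bytes, merge_order=6):
    """Rough estimate of external-sort merge passes (illustrative sketch).

    Assumes each initial run is about work_mem-sized and that every
    merge pass shrinks the run count by a factor of merge_order.
    """
    if input_bytes <= work_mem_bytes:
        return 0  # fits in memory: internal sort, no merging at all
    initial_runs = math.ceil(input_bytes / work_mem_bytes)
    # number of passes needed to merge initial_runs down to one run
    return math.ceil(math.log(initial_runs, merge_order))
```

Under this crude model, the blogpost's 1GB-of-input-with-4MB-of-work_mem scenario produces 256 initial runs and only a handful of merge passes, which is part of why the slowdown is so modest.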
Besides, even that difference is probably on the high side of average, because numeric abbreviated keys work particularly well, and you won't get the same benefit with unique numeric values when you happen to be doing a lot of merging. If you tried the same experiment with integers, or even text + abbreviated keys, I bet the difference would be a lot smaller, despite the huge gap in the amount of memory used.

On modern hardware, where doing some amount of random I/O is not that noticeable, you'll have a very hard time finding a case where even a paltry amount of memory with many passes does all that much worse than an internal sort (OTOH, it's not hard to find cases where an external sort is *faster*). Even if you make a generic estimate, it's still probably going to be pretty good, because there just isn't that much variation in how long the sort will take as you vary the amount of memory it can use. Some people will be surprised by this, but it's a pretty robust effect. (This is why I think that a hash_mem GUC might be a good medium-term solution that improves upon work_mem -- the situation is dramatically different when it comes to hashing.)

My point is that you could offer users the kind of insight they'd find very useful with only a very crude estimate of the amount of merging. Even if it turned out 60% slower than initially projected, that's still not an awful estimate to most users. That just leaves initial run generation, and it's relatively easy to accurately estimate the number of initial runs. I rarely see a case where merging takes more than 40% of the total, barring parallel CREATE INDEX.

> I wonder if internal sorts are really all that interesting from the PoV
> of progress reporting. Also, I have the impression that quicksort isn't
> very amenable to letting you know how much work is left.
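One answer to the quoted concern is to ignore the internals of any single quicksort and instead extrapolate from how quickly whole runs are completing. The sketch below is hypothetical, not anything in PostgreSQL: it assumes runs take roughly similar time, which tends to hold when each run is about work_mem-sized.

```python
def estimate_remaining_seconds(run_durations, runs_completed, total_runs):
    """Extrapolate remaining run-generation time from the cadence of
    completed runs.  Illustrative sketch only: assumes per-run times
    are roughly uniform, and that total_runs itself is estimable
    (e.g. from input size divided by work_mem).
    """
    if runs_completed == 0 or not run_durations:
        return None  # no cadence established yet
    avg = sum(run_durations) / len(run_durations)
    return avg * (total_runs - runs_completed)
```

With, say, three completed runs of about two seconds each out of an estimated ten, this projects roughly fourteen seconds of run generation left, which is the sort of "cadence" estimate discussed below.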
It is hard to predict the duration of one massive quicksort, but it seems fairly easy to recognize a kind of cadence across multiple quicksorts/runs that each take seconds to a couple of minutes. That's going to be the vast, vast majority of cases we care about.

--
Peter Geoghegan