Re: [HACKERS] Block level parallel vacuum
From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: CAD21AoB7jaApns9=S3uTndZnTJsLtGMmN4Ad7zg1pp1dyHD47Q@mail.gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
On Fri, Oct 18, 2019 at 3:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > Another point in this regard is that the user anyway has an option to
> > > > > > > turn off the cost-based vacuum. By default, it is anyway disabled.
> > > > > > > So, if the user enables it we have to provide some sensible behavior.
> > > > > > > If we can't come up with anything, then, in the end, we might want to
> > > > > > > turn it off for a parallel vacuum and mention the same in docs, but I
> > > > > > > think we should try to come up with a solution for it.
> > > > > >
> > > > > > I finally got your point and now understood the need. And the idea I
> > > > > > proposed doesn't work fine.
> > > > > >
> > > > > > So you meant that all workers share the cost count and if a parallel
> > > > > > vacuum worker increase the cost and it reaches the limit, does the
> > > > > > only one worker sleep? Is that okay even though other parallel workers
> > > > > > are still running and then the sleep might not help?
> > > >
> > > > Remember that the other running workers will also increase
> > > > VacuumCostBalance and whichever worker finds that it becomes greater
> > > > than VacuumCostLimit will reset its value and sleep. So, won't this
> > > > make sure that overall throttling works the same?
> > > > >
> > > > > I agree with this point. There is a possibility that some of the
> > > > > workers who are doing heavy I/O continue to work and OTOH other
> > > > > workers who are doing very less I/O might become the victim and
> > > > > unnecessarily delay its operation.
> > > >
> > > > Sure, but will it impact the overall I/O? I mean to say the rate
> > > > limit we want to provide for overall vacuum operation will still be
> > > > the same. Also, isn't a similar thing happens now also where heap
> > > > might have done a major portion of I/O but soon after we start
> > > > vacuuming the index, we will hit the limit and will sleep.
> > >
> > > Actually, what I meant is that the worker who is performing actual I/O
> > > might not go for the delay and another worker which has done only CPU
> > > operation might pay the penalty? So basically the worker who is doing
> > > CPU intensive operation might go for the delay and pay the penalty and
> > > the worker who is performing actual I/O continues to work and do
> > > further I/O. Do you think this is not a practical problem?
> >
> > I don't know. Generally, we try to delay (if required) before
> > processing (read/write) one page which means it will happen for I/O
> > intensive operations, so I am not sure if the point you are making is
> > completely correct.
>
> Ok, I agree with the point that we are checking it only when we are
> doing the I/O operation. But, we also need to consider that each I/O
> operation has a different weightage. So even if we have a delay
> point at the I/O operation there is a possibility that we might delay the
> worker which is just performing a read buffer with a page
> hit (VacuumCostPageHit). But, the other worker who is actually
> dirtying the page (VacuumCostPageDirty = 20) continues the work and does
> more I/O.
>
> > > Stepping back a bit, OTOH, I think that we can not guarantee that the
> > > one worker who has done more I/O will continue to do further I/O and
> > > the one which has not done much I/O will not perform more I/O in
> > > future. So it might not be too bad if we compute shared costs as you
> > > suggested above.
> >
> > I am thinking if we can write the patch for both the approaches (a.
> > compute shared costs and try to delay based on that, b. try to divide
> > the I/O cost among workers as described in the email above[1]) and do
> > some tests to see the behavior of throttling, that might help us in
> > deciding what is the best strategy to solve this problem, if any.
> > What do you think?
>
> I agree with this idea. I can come up with a POC patch for approach
> (b). Meanwhile, if someone is interested to quickly hack with
> approach (a) then we can do some testing and compare. Sawada-san,
> by any chance will you be interested to write a POC with approach (a)?

Yes, I will try to write the PoC patch with approach (a).

Regards,

--
Masahiko Sawada