Re: Turning off HOT/Cleanup sometimes

Поиск
Список
Период
Сортировка
От Pavan Deolasee
Тема Re: Turning off HOT/Cleanup sometimes
Дата
Msg-id CABOikdM6zPgaDyeJrvQMXwmJx94DZRKmFUdKpU_zPtT9ZA0enQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Turning off HOT/Cleanup sometimes  (Greg Stark <stark@mit.edu>)
Ответы Re: Turning off HOT/Cleanup sometimes  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers


On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark <stark@mit.edu> wrote:


On 15 Apr 2015 15:43, "Simon Riggs" <simon@2ndquadrant.com> wrote:
>
> It all depends upon who is being selfish. Why is a user "selfish" for
> not wanting to clean every single block they scan, when the people
> that made the mess do nothing and go faster 10 minutes from now?
> Randomly and massively penalising large SELECTs makes no sense. Some
> cleanup is OK, with reasonable limits, which is why that is proposed.

I don't think it's productive to think of a query as a different actor with only an interest in its own performance and no interest in overall system performance.

From a holistic point of view the question is how many times is a given hit chain going to need to be followed before it's pruned. Or to put it another way, how expensive is creating a hot chain. Does it cause a single prune? a fixed number of chain readers followed by a prune? Does the amount of work depend on the workload or is it consistent?


IMO the size or traversal of the HOT chain is not that expensive compared to the cost of either pruning too frequently, which generates WAL as well as makes buffers dirty. OTOH cost of less frequent pruning could also be very high. It can cause severe table bloat which may just stay for a very long time. Even if dead space is recovered within a page, truncating a bloated heap is not always possible. In such cases, even SELECTs would be slowed down just because they need to read/scan far more pages than they otherwise would have. IOW its probably wrong to assume that not-pruning quickly enough will have impact only on the non-SELECT queries.

I also concur with arguments upthread that this change needs to be carefully calibrated because it can lead to significant degradation for certain workloads.

My intuition, again, is that what we need is a percentage such as "do 10 prunes then ignore the next 1000 clean pages with hot chains. That guarantees that after 100 selects the hot chains will all be pruned but each select will only prune 1% of the clean pages it sees.

I think some such proposal was made in the last. There could be knob to control how much a read-only query (or may be a read-only transaction) should do HOT cleanup, say as a percentage of pages it looks at. The default can be left at 100% in the first release so that the current behaviour is not suddenly disrupted. But it will allow others to play with the percentages and then based on field reports, we can change defaults in the next releases.

Thanks,
Pavan

--

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Следующее
От: Etsuro Fujita
Дата:
Сообщение: Re: Optimization for updating foreign tables in Postgres FDW