Re: Planning performance problem (67626.278ms)

Поиск

Список

Период

Сортировка

От	Manuel Weitzman
Тема	Re: Planning performance problem (67626.278ms)
Дата	14 июня 2021 г. 20:25:38
Msg-id	EFE95962-F088-45E2-BD13-AA5012829A69@gmail.com обсуждение исходный текст
Ответ на	Re: Planning performance problem (67626.278ms) (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: Planning performance problem (67626.278ms)
Список	pgsql-performance

Дерево обсуждения

> However, I'm skeptical that any problem actually remains in
> real-world use cases.

Hello Tom,

We also had some issues with planning and get_actual_variable_range(). We
actually found some interesting behaviour that probably requires an eye with
better expertise in how the planner works.
For the example being discussed you can add some joins into the equation and
planning times deteriorate quite a bit.
I'll just skip posting the first executions as it is already established that
a subsequent one will be faster.


create table b (b int primary key, a int references a(a))
with (autovacuum_enabled=off);

insert into a select x from generate_series(1,10000000) x;
insert into b select x, x from generate_series(1,10000000) x;
create index b_a_idx on b(a);
analyze a, b;


For our case a rollback of a bulk insert causes bloat on the index.


begin;
insert into a select x from generate_series(10000001,20000000) x;
rollback;

explain (analyze, buffers)
select * from a
join b on (b.a = a.a)
where b.a in (1,100,10000,1000000,1000001);

 Planning:
   Buffers: shared hit=9 read=27329
 Planning Time: 134.560 ms
 Execution Time: 0.100 ms


I see a lot of buffers being read for some reason (wasn't this fixed?). And
times are slow too. But it get's worse with each join added to the select.


explain (analyze, buffers)
select * from a
join b b1 on (b1.a = a.a)
join b b2 on (b2.a = a.a)
where b1.a in (1,100,10000,1000000,1000001);

 Planning:
   Buffers: shared hit=38 read=81992
 Planning Time: 312.826 ms
 Execution Time: 0.131 ms

Just add a few more joins and it is a recipe for disaster.
Apparently, the planner isn't reusing the data boundaries across alternative
plans. It would be nicer if the planner remembered each column boundaries
for later reuse (within the same planner execution).

Another thing that worries me is that even the second run has faster planning
it is still way slower than the case without lots of bloat in the index. And
I don't think this is just an edge case. Rollbacks on bulk inserts can be
quite common, and joins are expected in a SQL database.

We had downtime due to how the planner works on this case. Unfortunately
setting more aggressive vacuum settings won't fix our problems. Most of the
read queries are being issued to a replica. When the issues with the planner
start happening, CPU usage on that node goes to 100% which interferes with the
replication process.
This means the replica cannot get to a new checkpoint with a new live
max value in the index nor can it delete the bloat that vacuum has already
cleaned on the leader server.

Oh, by the way, we're running version 13.2


Regards,

Manuel

В списке pgsql-performance по дате отправления:

Предыдущее

От: Laurenz Albe
Дата: 14 июня 2021 г., 19:18:23
Сообщение: Re: overcommit_ratio setting

Следующее

От: Haseeb Khan
Дата: 14 июня 2021 г., 22:36:46
Сообщение: Master - Slave Replication Window Server

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Planning performance problem (67626.278ms)

Предыдущее

Следующее