Re: increasing effective_cache_size slows down join queries by a factor of 4000x

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: increasing effective_cache_size slows down join queries by a factor of 4000x
Дата
Msg-id ee5300c3-7f49-ed52-0e7f-33563f240ba8@enterprisedb.com
обсуждение исходный текст
Ответ на Re: increasing effective_cache_size slows down join queries by a factor of 4000x  (A Shaposhnikov <artyom@gmail.com>)
Список pgsql-general

On 2/15/22 01:06, A Shaposhnikov wrote:
> 
> Interestingly I have a second PG 14.2 database, with identical table 
> definitions, but about 10% smaller row counts, and the exact same query 
> works fast there without the 2nd condition:
> 

Are you sure about the 10%? Because in the plans from the first machine 
I see this:

 >
 > -> Index Scan using team_pkey on team t (cost=0.57..11382381.88
 > rows=78693167 width=175) (actual time=0.016..0.695 rows=854 loops=1)
 >

while the second machine does this:

> 
> -> Index Scan using team_pkey on team t (cost=0.57..2366113.83 
> rows=2807531 width=160) (actual time=0.031..0.801 rows=888 loops=1)
> 

That's 2.8M vs. 78M, quite far from "10% difference". Not sure about 
team_aliases table, that's imposible to say from the plans.

This may matter a lot, because we use effective cache size to calculate 
cache hit ratio for the query, with relation sizes as an input. So 
smaller relations (or larger effective_cache_size) means cheaper random 
I/O, hence preference for nested loop join.

The other thing is data distribution - that may matter too.


IMO it's pointless to investigate this further - we know what's causing 
the issue. The optimizer is oblivious that merge join will have to skip 
large part of the second input, due to the implicit condition. Notice 
that adding the condition changes the cost from:

  Limit (cost=81.33..331.82 rows=1000 width=183) ...

to

  Limit (cost=81.33..720.48 rows=1000 width=183) ...

So it seems *more* expensive than the first plan. Taken to the extreme 
the planner could theoretically have chosen to use the first plan (and 
delay the condition until after the join).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-general по дате отправления:

Предыдущее
От: Marc Millas
Дата:
Сообщение: Re: Moving the master to a new server
Следующее
От: Glen Eustace
Дата:
Сообщение: Re: Moving the master to a new server