Обсуждение: extend statistics help reduce index scan a lot of shared buffer hits.

Поиск

Список

Период

Сортировка

extend statistics help reduce index scan a lot of shared buffer hits.

От

James Pang

Дата:

27 февраля, 16:54:48

Postgresql 14.8, Redhat8. looks like have to create extend statistics on indexed and joined columns to make join filters pushed down to secondary index scan in nestloop, and the shared buffer hits show big difference.

is it expected ?

SELECT ....
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY

explain (analyze,buffers) slowsql1(...)

1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.siteid = b.siteid) AND ((a.mtguuid)::text = (b.mtguuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms

create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.

2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.

-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)

Thanks,

James

Fwd: extend statistics help reduce index scan a lot of shared buffer hits.

От

James Pang

Дата:

27 февраля, 16:58:59

is it expected ?

SELECT ....
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY

explain (analyze,buffers) slowsql1(...)

1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.sssid = b.sssid) AND ((a.mmmuuid)::text = (b.mmmuuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms

create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.

2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.

-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)

Thanks,

James

Re: Fwd: extend statistics help reduce index scan a lot of shared buffer hits.

От

Tomas Vondra

Дата:

28 февраля, 17:53:55

On 2/27/24 14:58, James Pang wrote:
>    Postgresql 14.8, Redhat8.   looks like have to create extend statistics
> on indexed and joined columns to make join filters pushed down to secondary
> index scan in nestloop, and the shared buffer hits show big difference.
> 
> is it expected ?
> 

It's hard to say what exactly is happening in the example query (I'd
have to do some debugging, but that's impossible without a reproducer),
but I think it's mostly expected.

My guess is that without the stats the optimizer sees this:

->  Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b
(cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698 rows=2142
loops=2142)

and so decides there's no point in pushing down more conditions to the
index scan (because it already returns just 1 row). But there's some
sort of correlation / skew, and it returns 2142 rows.

With the extended stats it realizes pushing down more conditions makes
sense, because doing that in index scan is cheaper than having to read
the heap pages etc. So it does that.

So yeah, this seems exactly the improvement you'd expect from extended
stats. Why do you think this would not be expected?

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: extend statistics help reduce index scan a lot of shared buffer hits.

extend statistics help reduce index scan a lot of shared buffer hits.

Fwd: extend statistics help reduce index scan a lot of shared buffer hits.

Re: Fwd: extend statistics help reduce index scan a lot of shared buffer hits.