slow join not using index properly

Поиск

Список

Период

Сортировка

От	Stefan Amshey
Тема	slow join not using index properly
Дата	21 марта 2014 г. 02:56:28
Msg-id	CAOvwoRajrU_O9KOwYPPt0BEt0iqCyRZXkCriHxXzD0aXphktXQ@mail.gmail.com обсуждение исходный текст
Ответы	Re: slow join not using index properly Re: slow join not using index properly
Список	pgsql-performance

Дерево обсуждения

We have a slow performing query that we are trying to improve, and it appears to be performing a sequential scan at a point where it should be utilizing an index. Can anyone tell me why postgres is opting to do it this way?

The original query is as follows:

SELECT DISTINCT

a1.context_key

FROM

virtual_ancestors a1, collection_data, virtual_ancestors a2

WHERE

a1.ancestor_key = collection_data.context_key

AND collection_data.collection_context_key = a2.context_key

AND a2.ancestor_key = ?

The key relationships should all using indexed columns, but the query plan that postgres comes up with ends up performing a sequential scan on the collection_data table (in this case about 602k rows) where we would have expected it to utilize the index:

HashAggregate (cost=60905.73..60935.73 rows=3000 width=4) (actual time=3366.165..3367.354 rows=3492 loops=1)

Buffers: shared hit=16291 read=1222

-> Nested Loop (cost=17546.26..60898.23 rows=3000 width=4) (actual time=438.332..3357.918 rows=13037 loops=1)

Buffers: shared hit=16291 read=1222

-> Hash Join (cost=17546.26..25100.94 rows=98 width=4) (actual time=408.554..415.767 rows=2092 loops=1)

Hash Cond: (a2.context_key = collection_data.collection_context_key)

Buffers: shared hit=4850 read=3

-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2 (cost=0.00..233.32 rows=270 width=4) (actual time=8.532..10.703 rows=1960 loops=1)

Index Cond: (ancestor_key = 1072173)

Heap Fetches: 896

Buffers: shared hit=859 read=3

-> Hash (cost=10015.56..10015.56 rows=602456 width=8) (actual time=399.708..399.708 rows=602570 loops=1)

Buckets: 65536 Batches: 1 Memory Usage: 23538kB

Buffers: shared hit=3991

######## sequential scan occurs here ##########

-> Seq Scan on collection_data (cost=0.00..10015.56 rows=602456 width=8) (actual time=0.013..163.509 rows=602570 loops=1)

Buffers: shared hit=3991

-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1 (cost=0.00..360.70 rows=458 width=8) (actual time=1.339..1.403 rows=6 loops=2092)

Index Cond: (ancestor_key = collection_data.context_key)

Heap Fetches: 7067

Buffers: shared hit=11441 read=1219

Total runtime: 3373.058 ms

The table definitions are as follows:

Table "public.virtual_ancestors"

Column | Type | Modifiers

--------------+----------+-----------

ancestor_key | integer | not null

context_key | integer | not null

degree | smallint | not null

Indexes:

"virtual_ancestors_pkey" PRIMARY KEY, btree (ancestor_key, context_key)

"virtual_context_key_idx" btree (context_key)

Foreign-key constraints:

"virtual_ancestors_ancestor_key_fkey" FOREIGN KEY (ancestor_key) REFERENCES contexts(context_key)

"virtual_ancestors_context_key_fkey" FOREIGN KEY (context_key) REFERENCES contexts(context_key)

Table "public.collection_data"

Column | Type | Modifiers

------------------------+----------------------+-----------

collection_context_key | integer | not null

context_key | integer | not null

type | character varying(1) | not null

source | character varying(1) | not null

Indexes:

"collection_data_context_key_idx" btree (context_key)

"collection_data_context_key_index" btree (collection_context_key) CLUSTER

Foreign-key constraints:

"collection_data_collection_context_key_fkey" FOREIGN KEY (collection_context_key) REFERENCES contexts(context_key) ON DELETE CASCADE

"collection_data_context_key_fkey" FOREIGN KEY (context_key) REFERENCES contexts(context_key) ON DELETE CASCADE

Can anyone suggest a way that we can get postgres to use the collection_data_context_key_index properly? I thought that it might be related to the fact that collection_data_context_key_index is a CLUSTERED index, but we did some basic experimentation that seems to indicate otherwise, i.e. the bad plan persists despite re-clustering the index.

We are using PostgreSQL 9.2.5 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit

Interestingly, on an instance running PostgreSQL 9.2.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit where I copied the 2 tables over to a temporary database, the plan comes out differently:

HashAggregate (cost=39692.03..39739.98 rows=4795 width=4) (actual time=73.285..75.141 rows=3486 loops=1)

Buffers: shared hit=22458

-> Nested Loop (cost=0.00..39680.05 rows=4795 width=4) (actual time=0.077..63.116 rows=13007 loops=1)

Buffers: shared hit=22458

-> Nested Loop (cost=0.00..32823.38 rows=164 width=4) (actual time=0.056..17.685 rows=2084 loops=1)

Buffers: shared hit=7529

-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a2 (cost=0.00..1220.85 rows=396 width=4) (actual time=0.025..2.732 rows=1954 loops=1)

Index Cond: (ancestor_key = 1072173)

Heap Fetches: 1954

Buffers: shared hit=1397

######## Note the index scan here - this is what it SHOULD be doing ##############

-> Index Scan using collection_data_context_key_index on collection_data (cost=0.00..79.24 rows=56 width=8) (actual time=0.004..0.005 rows=1 loops=1954)

Index Cond: (collection_context_key = a2.context_key)

Buffers: shared hit=6132

-> Index Only Scan using virtual_ancestors_pkey on virtual_ancestors a1 (cost=0.00..35.40 rows=641 width=8) (actual time=0.007..0.015 rows=6 loops=2084)

Index Cond: (ancestor_key = collection_data.context_key)

Heap Fetches: 13007

Buffers: shared hit=14929

Total runtime: 76.431 ms

Why can't I get the Postgres 9.2.5 instance to use the optimal plan?

Thanks in advance!

/Stefan

--
-
Stefan Amshey

В списке pgsql-performance по дате отправления:

Предыдущее

От: Torsten Förtsch
Дата: 20 марта 2014 г., 20:14:08
Сообщение: Re: Performance of UNION vs IN

Следующее

От: Jean-Max Reymond
Дата: 21 марта 2014 г., 08:33:57
Сообщение: Re: Performance of UNION vs IN

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

slow join not using index properly

Предыдущее

Следующее