Getting rid of a seq scan in query on a large table

Поиск

Список

Период

Сортировка

От	Jens Hoffrichter
Тема	Getting rid of a seq scan in query on a large table
Дата	27 июня 2011 г. 12:46:53
Msg-id	BANLkTik0KVOg8itw_ud-s8_TtK9kzt-O=Q@mail.gmail.com обсуждение исходный текст
Ответы	Re: Getting rid of a seq scan in query on a large table ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список	pgsql-performance

Дерево обсуждения

Hi everyone,

I'm having trouble getting rid of a sequential scan on a table with roughly 120k entries it. Creation of an index on that particular column which triggers the sequential scan doesn't do anything, VACUUM and ANALYZE has been done on the table.

The table in question has the following definition:

Column | Type | Modifiers

--------------------+--------------------------+------------------------------------------------------------------

post_id | bigint | not null default nextval('posts_post_id_seq'::regclass)

forum_id | bigint | not null

threadlink | character varying(255) | not null

timestamp | timestamp with time zone | not null

poster_id | bigint |

thread_id | bigint | not null

subject | text | not null

text | text | not null

postername | character varying(255) |

internal_post_id | bigint | not null default nextval('posts_internal_post_id_seq'::regclass)

internal_thread_id | bigint |

Indexes:

"posts_pkey" PRIMARY KEY, btree (internal_post_id)

"posts_forum_id_key" UNIQUE, btree (forum_id, post_id)

"idx_internal_thread_id" btree (internal_thread_id)

"idx_posts_poster_id" btree (poster_id)

Foreign-key constraints:

"posts_forum_id_fkey" FOREIGN KEY (forum_id) REFERENCES forums(forum_id)

"posts_internal_thread_id_fkey" FOREIGN KEY (internal_thread_id) REFERENCES threads(internal_thread_id)

"posts_poster_id_fkey" FOREIGN KEY (poster_id) REFERENCES posters(poster_id)

The query is this:

SELECT threads.internal_thread_id AS threads_internal_thread_id, threads.forum_id AS threads_forum_id, threads.thread_id AS threads_thread_id, threads.title AS threads_title, threads.poster_id AS threads_poster_id, threads.postername AS threads_postername, threads.category AS threads_category, threads.posttype AS threads_posttype FROM threads JOIN posts ON threads.internal_thread_id = posts.internal_thread_id JOIN posters ON posts.poster_id = posters.poster_id JOIN posters_groups AS posters_groups_1 ON posters.poster_id = posters_groups_1.poster_id JOIN groups ON groups.group_id = posters_groups_1.group_id WHERE groups.group_id = 4 ORDER BY posts.timestamp DESC;

The query plan (with an explain analyze) gives me the following:

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sort (cost=13995.93..14006.63 rows=4279 width=108) (actual time=79.927..79.947 rows=165 loops=1)

Sort Key: posts."timestamp"

Sort Method: quicksort Memory: 50kB

-> Nested Loop (cost=6.97..13737.84 rows=4279 width=108) (actual time=0.605..79.693 rows=165 loops=1)

-> Seq Scan on groups (cost=0.00..1.05 rows=1 width=8) (actual time=0.013..0.014 rows=1 loops=1)

Filter: (group_id = 4)

-> Nested Loop (cost=6.97..13694.00 rows=4279 width=116) (actual time=0.587..79.616 rows=165 loops=1)

-> Hash Join (cost=6.97..12343.10 rows=4279 width=24) (actual time=0.568..78.230 rows=165 loops=1)

Hash Cond: (posts.poster_id = posters.poster_id)

-> Seq Scan on posts (cost=0.00..11862.12 rows=112312 width=24) (actual time=0.019..60.092 rows=112312 loops=1)

-> Hash (cost=6.79..6.79 rows=14 width=24) (actual time=0.101..0.101 rows=14 loops=1)

-> Hash Join (cost=2.14..6.79 rows=14 width=24) (actual time=0.060..0.093 rows=14 loops=1)

Hash Cond: (posters.poster_id = posters_groups_1.poster_id)

-> Seq Scan on posters (cost=0.00..3.83 rows=183 width=8) (actual time=0.006..0.023 rows=185 loops=1)

-> Hash (cost=1.96..1.96 rows=14 width=16) (actual time=0.025..0.025 rows=14 loops=1)

-> Seq Scan on posters_groups posters_groups_1 (cost=0.00..1.96 rows=14 width=16) (actual time=0.016..0.021 rows=14 loops=1)

Filter: (group_id = 4)

-> Index Scan using threads_pkey on threads (cost=0.00..0.30 rows=1 width=100) (actual time=0.006..0.007 rows=1 loops=165)

Index Cond: (threads.internal_thread_id = posts.internal_thread_id)

Total runtime: 80.137 ms

(20 rows)

So the big time lost is in this line:

Seq Scan on posts (cost=0.00..11862.12 rows=112312 width=24) (actual time=0.019..60.092 rows=112312 loops=1)

which I can understand why it slow ;)

But I haven't yet managed to convert the Seq Scan into an Index Scan, and I'm not sure how to continue there.

As I am not a big expert on psql optimization, any input would be greatly appreciated.

Best regards,

Jens

В списке pgsql-performance по дате отправления:

Предыдущее

От: Greg Smith
Дата: 24 июня 2011 г., 23:26:42
Сообщение: Re: Cost of creating an emply WAL segment

Следующее

От: Jenish
Дата: 27 июня 2011 г., 14:23:05
Сообщение: Performance issue with Insert

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Getting rid of a seq scan in query on a large table

Предыдущее

Следующее