Query on view radically slower than query on underlying table

Поиск

Список

Период

Сортировка

От	Craig James
Тема	Query on view radically slower than query on underlying table
Дата	28 февраля 2011 г. 14:34:51
Msg-id	4D6BE95C.3090102@emolecules.com обсуждение исходный текст
Ответ на	Re: inheritance: planning time vs children number vs column number (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Ответы	Re: Query on view radically slower than query on underlying table
Список	pgsql-performance

Дерево обсуждения

We have a medium-sized catalog (about 5 million rows), but some of our customers only want to see portions of it.  I've
beenexperimenting with a customer-specific schema that contains nothing but a "join table" -- just the primary keys of
thatportion of the data that each customer wants to see, which is used to create a view that looks like the original
table.But the most important query, the one that customers use to scan page-by-page through search results, turns out
tobe far too slow (65 seconds versus 55 milliseconds). 

Below are the results of two explain/analyze statements.  The first one uses the view, the second one goes directly to
theoriginal tables.  I thought this would be a slam-dunk, that it would return results in a flash because the view is
createdfrom two tables with the same primary keys. 

My guess (and it's just a wild guess) is that the "left join" is forcing a sequence scan or something.  But we need the
leftjoin, because it's on a "hitlist" that recorded all the matches to a customer's earlier query, and if rows have
beenremoved from the tables, the customer needs to see a blank row. 

Here is the "bad" query, which is run on the view:

em=> explain analyze
select version.version_id, version.isosmiles
from hitlist_rows_reset_140
left join version on (hitlist_rows_reset_140.objectid = version.version_id)
where hitlist_rows_reset_140.sortorder >= 1
and hitlist_rows_reset_140.sortorder <= 10
order by hitlist_rows_reset_140.sortorder;
                                                                         QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------
------------------------------
  Nested Loop Left Join  (cost=23687.51..215315.74 rows=1 width=54) (actual time=2682.662..63680.076 rows=10 loops=1)
    Join Filter: (hitlist_rows_reset_140.objectid = v.version_id)
    ->  Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140  (cost=0.00..8.36 rows=1 width=8)
(actualtime= 
0.015..0.049 rows=10 loops=1)
          Index Cond: ((sortorder >= 1) AND (sortorder <= 10))
    ->  Hash Join  (cost=23687.51..204666.54 rows=851267 width=50) (actual time=31.829..6263.403 rows=851267 loops=10)
          Hash Cond: (v.version_id = mv.version_id)
          ->  Seq Scan on version v  (cost=0.00..116146.68 rows=5631968 width=50) (actual time=0.006..859.758
rows=5632191loo 
ps=10)
          ->  Hash  (cost=13046.67..13046.67 rows=851267 width=4) (actual time=317.488..317.488 rows=851267 loops=1)
                ->  Seq Scan on my_version mv  (cost=0.00..13046.67 rows=851267 width=4) (actual time=2.888..115.166
rows=8512
67 loops=1)
  Total runtime: 63680.162 ms

Here is the "good" query, which is run directly on the data tables.

em=> explain analyze
select registry.version.version_id, registry.version.isosmiles
from hitlist_rows_reset_140
left join registry.version on (hitlist_rows_reset_140.objectid = registry.version.version_id)
where hitlist_rows_reset_140.sortorder >= 1
and hitlist_rows_reset_140.sortorder <= 10
order by hitlist_rows_reset_140.SortOrder;
                                                                         QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------
------------------------------
  Nested Loop Left Join  (cost=0.00..17.73 rows=1 width=54) (actual time=36.022..55.558 rows=10 loops=1)
    ->  Index Scan using hitlist_rows_reset_140_pkey on hitlist_rows_reset_140  (cost=0.00..8.36 rows=1 width=8)
(actualtime= 
0.021..0.025 rows=10 loops=1)
          Index Cond: ((sortorder >= 1) AND (sortorder <= 10))
    ->  Index Scan using version_pkey on version  (cost=0.00..9.35 rows=1 width=50) (actual time=5.551..5.552 rows=1
loops=10)
          Index Cond: (hitlist_rows_reset_140.objectid = version.version_id)
  Total runtime: 55.608 ms
(6 rows)


The view is defined like this:

em=> \d my_version
Table "test_schema.my_version"
    Column   |  Type   | Modifiers
------------+---------+-----------
  version_id | integer | not null
Indexes:
     "my_version_pkey" PRIMARY KEY, btree (version_id)

em=> \d version
  View "test_schema.version"
    Column   |  Type   | Modifiers
------------+---------+-----------
  version_id | integer |
  parent_id  | integer |
  isosmiles  | text    |
  coord_2d   | text    |
View definition:
  SELECT v.version_id, v.parent_id, v.isosmiles, v.coord_2d
    FROM registry.version v
    JOIN my_version mv ON mv.version_id = v.version_id;

This is:

   Postgres 8.4.4
   Ubuntu Linux 2.6.32-27
   Database: 8x7200 RAID 10, LSI RAID controller with BBU
   WAL: 2x7200 RAID1

Non-default config parameters:

max_connections = 500
shared_buffers = 1000MB
work_mem = 128MB
synchronous_commit = off
full_page_writes = off
wal_buffers = 256kB
checkpoint_segments = 30
effective_cache_size = 4GB
track_activities = on
track_counts = off
track_functions = none
escape_string_warning = off

Thanks,
Craig

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Query on view radically slower than query on underlying table