JOIN performance

Поиск

Список

Период

Сортировка

От	Dean Gibson (DB Administrator)
Тема	JOIN performance
Дата	20 сентября 2004 г. 20:25:04
Msg-id	5.1.0.14.2.20040920121606.00a89c68@imaps.mailpen.net обсуждение исходный текст
Ответ на	Re: COUNT(*) to find records which have a certain number of (Greg Stark <gsstark@mit.edu>)
Ответы	Re: JOIN performance
Список	pgsql-sql

Дерево обсуждения

I have a view that when used, is slow:

CREATE  VIEW     "Data".genlic_a4avail  AS        SELECT                genlic_a4.*,                last_action_date,
            end_date,                canceled            FROM    genlic_a4                LEFT JOIN lic_hd  USING(
sys_id)            WHERE       status != 'A';
 

Here is the EXPLAIN output:
 Merge Join  (cost=155360.47..159965.70 rows=13063 width=75)   Merge Cond: ("outer".sys_id = "inner".sys_id)   ->  Sort
(cost=3912.51..3916.48 rows=1589 width=62)         Sort Key: "_GenLicGroupA4".sys_id         ->  Nested Loop
(cost=0.00..3828.04rows=1589 width=62)               Join Filter: ("outer".callsign ~ ("inner".pattern)::text)
    ->  Seq Scan on "_GenLicGroupA4"  (cost=0.00..1667.40 
 
rows=1589 width=21)                     Filter: ((status <> 'R'::bpchar) AND (status <> 
'A'::bpchar) AND (geo_region = 12))               ->  Seq Scan on "_GeoRestrict"  (cost=0.00..1.16 rows=16 
width=41)   ->  Sort  (cost=262032.96..264249.96 rows=886799 width=72)         Sort Key: lic_hd.sys_id         ->
SubqueryScan lic_hd  (cost=0.00..24529.99 rows=886799 width=72)               ->  Seq Scan on "_LicHD"
(cost=0.00..24529.99rows=886799 
 
width=72)

If I change the view to this:

CREATE  VIEW     "Data".genlic_a4avail  AS        SELECT                genlic_a4.*,                (SELECT
last_action_dateFROM lic_hd WHERE sys_id = 
 
genlic_a4.sys_id LIMIT 1) AS                last_action_date,                (SELECT end_date         FROM lic_hd WHERE
sys_id= 
 
genlic_a4.sys_id LIMIT 1) AS                end_date,                (SELECT canceled         FROM lic_hd WHERE sys_id
=
 
genlic_a4.sys_id LIMIT 1) AS                canceled            FROM    genlic_a4            WHERE       status !=
'A';

Then the performance is MUCH better:
 Subquery Scan genlic_a4avail  (cost=0.00..3828.04 rows=1589 width=62)   ->  Nested Loop  (cost=0.00..3828.04 rows=1589
width=62)        Join Filter: ("outer".callsign ~ ("inner".pattern)::text)         ->  Seq Scan on "_GenLicGroupA4"
(cost=0.00..1667.40rows=1589 
 
width=21)               Filter: ((status <> 'R'::bpchar) AND (status <> 
'A'::bpchar) AND (geo_region = 12))         ->  Seq Scan on "_GeoRestrict"  (cost=0.00..1.16 rows=16 width=41)
SubPlan          ->  Limit  (cost=0.00..3.01 rows=1 width=4)                 ->  Index Scan using "_LicHD_pkey" on 
 
"_LicHD"  (cost=0.00..3.01 rows=1 width=4)                       Index Cond: (unique_system_identifier = $0)
-> Limit  (cost=0.00..3.01 rows=1 width=8)                 ->  Index Scan using "_LicHD_pkey" on 
 
"_LicHD"  (cost=0.00..3.01 rows=1 width=8)                       Index Cond: (unique_system_identifier = $0)
-> Limit  (cost=0.00..3.01 rows=1 width=8)                 ->  Index Scan using "_LicHD_pkey" on 
 
"_LicHD"  (cost=0.00..3.01 rows=1 width=8)                       Index Cond: (unique_system_identifier = $0)

Note that genlic_a4 is small (4519), and lic_hd is large (886799), and 
lic_hd has sys_id as its PRIMARY KEY.

Is there a better way to write the LEFT JOIN so as to achieve the 
performance of the second VIEW without the clumsiness of the three (SELECT 
... LIMIT 1) ???

В списке pgsql-sql по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

JOIN performance