Sorting when LEFT JOINING to 2 same tables, even aliased.

Поиск
Список
Период
Сортировка
От Octavio Alvarez
Тема Sorting when LEFT JOINING to 2 same tables, even aliased.
Дата
Msg-id 2400.192.168.0.64.1079070086.squirrel@alvarezp.ods.org
обсуждение исходный текст
Ответы Re: Sorting when LEFT JOINING to 2 same tables, even  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Список pgsql-performance
Hello to everybody.

I ask your help for a severe problem when doing a query that LEFT JOINs
one table to another ON a field, and then LEFT JOINs again to another
"instance" of a table ON another field which stores the same entity, but
with different meaning.

I include 3 EXPLAIN ANALYZEs:
* The first one, the target (and problematic) query, which runs in 5 to 6
minutes.
* The second one, a variation with the second LEFT JOIN commented out,
which runs in 175 to 450 ms.
* The third one, a variation of the first one with ORDER BY removed, which
gives me about 19 seconds.

Therefore, I feel like there are two problems here the one that raises the
clock to 6 minutes and one that raises it to 20 seconds. I expected a much
lower time. I checked indexes and data types already, they are all fine.
All relevant fields have BTREEs, all PKs have UNIQUE BTREE, and all id and
ext_* fields have 'integer' as data type. Each ext_* has its corresponding
REFERENCES contraint.

I translated all the table and field names to make it easier to read. I
made my best not to let any typo go through.

I'd appreciate any help.

Octavio.

=== First EXPLAIN ANALYZE ===

EXPLAIN ANALYZE
SELECT
    t_materias_en_tira.id AS Id,
    t_clientes.paterno || ' ' || t_clientes.materno || ' ' ||
t_clientes.nombre AS Alumno,
    t_materias.nombre AS Materia,
    t_materias__equivalentes.nombre AS MateriaEquivalente,
    t_grupos.nombre AS Grupo,
    calificacion_final AS Calificacion,
    tipo AS Tipo,
    eer AS EER,
    total_asistencias AS TotalAsistencias,
    total_clases As TotalClases
FROM
    t_materias_en_tira
    LEFT JOIN t_alumnos_en_semestre ON ext_alumno_en_semestre =
t_alumnos_en_semestre.id
    LEFT JOIN t_alumnos ON ext_alumno = t_alumnos.id
    LEFT JOIN t_clientes ON ext_cliente = t_clientes.id
    LEFT JOIN t_materias ON ext_materia = t_materias.id
    LEFT JOIN t_materias AS t_materias__equivalentes ON
ext_materia__equivalencia = t_materias.id
    LEFT JOIN t_grupos ON ext_grupo = t_grupos.id
WHERE
    t_alumnos_en_semestre.ext_ciclo = 2222
ORDER BY
    Alumno, Materia;

This one gave:
                                                                                                     QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=11549.08..11552.11 rows=1210 width=112) (actual
time=311246.000..355615.000 rows=1309321 loops=1)
   Sort Key: (((((t_clientes.paterno)::text || ' '::text) ||
(t_clientes.materno)::text) || ' '::text) ||
(t_clientes.nombre)::text), t_materias.nombre
   InitPlan
     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=2.000..2.000 rows=1 loops=1)
   ->  Hash Left Join  (cost=1089.25..11487.11 rows=1210 width=112)
(actual time=83.000..19303.000 rows=1309321 loops=1)
         Hash Cond: ("outer".ext_grupo = "inner".id)
         ->  Nested Loop Left Join  (cost=1086.92..11454.53 rows=1210
width=107) (actual time=82.000..9077.000 rows=1309321 loops=1)
               Join Filter: ("outer".ext_materia__equivalencia =
"outer".id) ->  Hash Left Join  (cost=1078.15..1181.93
rows=1210
width=93) (actual time=82.000..275.000 rows=3473 loops=1)
                     Hash Cond: ("outer".ext_materia = "inner".id) ->
Merge Right Join  (cost=1068.43..1154.07
rows=1210 width=71) (actual time=81.000..213.000
rows=3473 loops=1)
                           Merge Cond: ("outer".id = "inner".ext_cliente)
->  Index Scan using t_clientes_pkey on
t_clientes  (cost=0.00..62.87 rows=1847
width=38) (actual time=10.000..34.000 rows=1847
loops=1)
                           ->  Sort  (cost=1068.43..1071.46 rows=1210
width=41) (actual time=71.000..76.000 rows=3473
loops=1)
                                 Sort Key: t_alumnos.ext_cliente
                                 ->  Hash Left Join  (cost=41.12..1006.48
rows=1210 width=41) (actual
time=9.000..61.000 rows=3473 loops=1)
                                       Hash Cond: ("outer".ext_alumno =
"inner".id)
                                       ->  Nested Loop  (cost=0.00..944.18
rows=1210 width=41) (actual
time=3.000..36.000 rows=3473
loops=1)
                                             ->  Index Scan using
i_t_alumnos_en_semestre__ext_ciclo
on t_alumnos_en_semestre
(cost=0.00..8.63 rows=269
width=8) (actual
time=2.000..3.000 rows=457
loops=1)
                                                   Index Cond: (ext_ciclo
= $0)
                                             ->  Index Scan using
i_t_materias_en_tira__ext_alumno_en_semestre
on t_materias_en_tira
(cost=0.00..3.32 rows=12
width=41) (actual
time=0.009..0.035 rows=8
loops=457)
                                                   Index Cond:
(t_materias_en_tira.ext_alumno_en_semestre
= "outer".id)
                                       ->  Hash  (cost=36.50..36.50
rows=1850 width=8) (actual
time=6.000..6.000 rows=0loops=1)
                                             ->  Seq Scan on t_alumnos
(cost=0.00..36.50 rows=1850
width=8) (actual
time=1.000..3.000 rows=1850
loops=1)
                     ->  Hash  (cost=8.77..8.77 rows=377 width=26) (actual
time=1.000..1.000 rows=0 loops=1)
                           ->  Seq Scan on t_materias  (cost=0.00..8.77
rows=377 width=26) (actual time=0.000..1.000
rows=377 loops=1)
               ->  Materialize  (cost=8.77..12.54 rows=377 width=22)
(actual time=0.000..0.175 rows=377 loops=3473)
                     ->  Seq Scan on t_materias t_materias__equivalentes
(cost=0.00..8.77 rows=377 width=22) (actual
time=0.000..1.000 rows=377 loops=1)
         ->  Hash  (cost=2.07..2.07 rows=107 width=13) (actual
time=1.000..1.000 rows=0 loops=1)
               ->  Seq Scan on t_grupos  (cost=0.00..2.07 rows=107
width=13) (actual time=0.000..1.000 rows=107 loops=1)

 Total runtime: 356144.000 ms

=== Second EXPLAIN ANALYZE ===

EXPLAIN ANALYZE
SELECT
    t_materias_en_tira.id AS Id,
    t_clientes.paterno || ' ' || t_clientes.materno || ' ' ||
t_clientes.nombre AS Alumno,
    t_materias.nombre AS Materia,
--    t_materias__equivalentes.nombre AS MateriaEquivalente,
    t_grupos.nombre AS Grupo,
    calificacion_final AS Calificacion,
    tipo AS Tipo,
    eer AS EER,
    total_asistencias AS TotalAsistencias,
    total_clases As TotalClases
FROM
    t_materias_en_tira
    LEFT JOIN t_alumnos_en_semestre ON ext_alumno_en_semestre =
t_alumnos_en_semestre.id
    LEFT JOIN t_alumnos ON ext_alumno = t_alumnos.id
    LEFT JOIN t_clientes ON ext_cliente = t_clientes.id
    LEFT JOIN t_materias ON ext_materia = t_materias.id
--    LEFT JOIN t_materias AS t_materias__equivalentes ON
ext_materia__equivalencia = t_materias.id
    LEFT JOIN t_grupos ON ext_grupo = t_grupos.id
WHERE
    t_alumnos_en_semestre.ext_ciclo = 2222
ORDER BY
    Alumno, Materia;

EXPLAIN ANALYZE
SELECT
    t_materias_en_tira.id AS Id,
    t_clientes.paterno || ' ' || t_clientes.materno || ' ' ||
t_clientes.nombre AS Alumno,
    t_materias.nombre AS Materia,
    t_materias__equivalentes.nombre AS MateriaEquivalente,
    t_grupos.nombre AS Grupo,
    calificacion_final AS Calificacion,
    tipo AS Tipo,
    eer AS EER,
    total_asistencias AS TotalAsistencias,
    total_clases As TotalClases
FROM
    t_materias_en_tira
    LEFT JOIN t_alumnos_en_semestre ON ext_alumno_en_semestre =
t_alumnos_en_semestre.id
    LEFT JOIN t_alumnos ON ext_alumno = t_alumnos.id
    LEFT JOIN t_clientes ON ext_cliente = t_clientes.id
    LEFT JOIN t_materias ON ext_materia = t_materias.id
    LEFT JOIN t_materias AS t_materias__equivalentes ON
ext_materia__equivalencia = t_materias.id
    LEFT JOIN t_grupos ON ext_grupo = t_grupos.id
WHERE
    t_alumnos_en_semestre.ext_ciclo = 2222;

It gave:

                                                                                                  QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=1276.49..1279.51 rows=1210 width=90) (actual
time=341.000..341.000 rows=3473 loops=1)
   Sort Key: (((((t_clientes.paterno)::text || ' '::text) ||
(t_clientes.materno)::text) || ' '::text) ||
(t_clientes.nombre)::text), t_materias.nombre
   InitPlan
     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=146.000..146.000 rows=1 loops=1)
   ->  Hash Left Join  (cost=1080.48..1214.52 rows=1210 width=90) (actual
time=209.000..284.000 rows=3473 loops=1)
         Hash Cond: ("outer".ext_grupo = "inner".id)
         ->  Hash Left Join  (cost=1078.15..1181.93 rows=1210 width=85)
(actual time=208.000..250.000 rows=3473 loops=1)
               Hash Cond: ("outer".ext_materia = "inner".id)
               ->  Merge Right Join  (cost=1068.43..1154.07 rows=1210
width=67) (actual time=207.000..227.000 rows=3473loops=1)
                     Merge Cond: ("outer".id = "inner".ext_cliente) ->
Index Scan using t_clientes_pkey on t_clientes
(cost=0.00..62.87 rows=1847 width=38) (actual
time=0.000..5.000 rows=1847 loops=1)
                     ->  Sort  (cost=1068.43..1071.46 rows=1210 width=37)
(actual time=207.000..209.000 rows=3473 loops=1)
                           Sort Key: t_alumnos.ext_cliente
                           ->  Hash Left Join  (cost=41.12..1006.48
rows=1210 width=37) (actual
time=152.000..196.000 rows=3473 loops=1)
                                 Hash Cond: ("outer".ext_alumno =
"inner".id) ->  Nested Loop
(cost=0.00..944.18
rows=1210 width=37) (actual
time=146.000..177.000 rows=3473 loops=1)
                                       ->  Index Scan using
i_t_alumnos_en_semestre__ext_ciclo
on t_alumnos_en_semestre
(cost=0.00..8.63 rows=269 width=8)
(actual time=146.000..148.000
rows=457 loops=1)
                                             Index Cond: (ext_ciclo = $0)
                                       ->  Index Scan using
i_t_materias_en_tira__ext_alumno_en_semestre
on t_materias_en_tira
(cost=0.00..3.32 rows=12 width=37)
(actual time=0.009..0.022 rows=8
loops=457)
                                             Index Cond:
(t_materias_en_tira.ext_alumno_en_semestre
= "outer".id)
                                 ->  Hash  (cost=36.50..36.50 rows=1850
width=8) (actual time=6.000..6.000 rows=0
loops=1)
                                       ->  Seq Scan on t_alumnos
(cost=0.00..36.50 rows=1850
width=8) (actual time=0.000..3.000
rows=1850 loops=1)
               ->  Hash  (cost=8.77..8.77 rows=377 width=26) (actual
time=1.000..1.000 rows=0 loops=1)
                     ->  Seq Scan on t_materias  (cost=0.00..8.77 rows=377
width=26) (actual time=0.000..0.000 rows=377loops=1)
         ->  Hash  (cost=2.07..2.07 rows=107 width=13) (actual
time=1.000..1.000 rows=0 loops=1)
               ->  Seq Scan on t_grupos  (cost=0.00..2.07 rows=107
width=13) (actual time=0.000..0.000 rows=107 loops=1)

 Total runtime: 346.000 ms

=== Third EXPLAIN ANALYZE ===

EXPLAIN ANALYZE
SELECT
    t_materias_en_tira.id AS Id,
    t_clientes.paterno || ' ' || t_clientes.materno || ' ' ||
t_clientes.nombre AS Alumno,
    t_materias.nombre AS Materia,
    t_materias__equivalentes.nombre AS MateriaEquivalente,
    t_grupos.nombre AS Grupo,
    calificacion_final AS Calificacion,
    tipo AS Tipo,
    eer AS EER,
    total_asistencias AS TotalAsistencias,
    total_clases As TotalClases
FROM
    t_materias_en_tira
    LEFT JOIN t_alumnos_en_semestre ON ext_alumno_en_semestre =
t_alumnos_en_semestre.id
    LEFT JOIN t_alumnos ON ext_alumno = t_alumnos.id
    LEFT JOIN t_clientes ON ext_cliente = t_clientes.id
    LEFT JOIN t_materias ON ext_materia = t_materias.id
    LEFT JOIN t_materias AS t_materias__equivalentes ON
ext_materia__equivalencia = t_materias.id
    LEFT JOIN t_grupos ON ext_grupo = t_grupos.id
WHERE
    t_alumnos_en_semestre.ext_ciclo = 2222;

Result:

                                                                                                     QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=484.34..4470.54 rows=459 width=112) (actual
time=70.000..18241.000 rows=1309321 loops=1)
   Hash Cond: ("outer".ext_grupo = "inner".id)
   ->  Nested Loop Left Join  (cost=482.01..4456.73 rows=459 width=107)
(actual time=70.000..7912.000 rows=1309321 loops=1)
         Join Filter: ("outer".ext_materia__equivalencia = "outer".id) ->
Hash Left Join  (cost=473.24..554.49 rows=459 width=93)
(actual time=70.000..142.000 rows=3473 loops=1)
               Hash Cond: ("outer".ext_materia = "inner".id)
               ->  Merge Right Join  (cost=463.52..537.90 rows=459
width=71) (actual time=67.000..109.000 rows=3473 loops=1)
                     Merge Cond: ("outer".id = "inner".ext_cliente) ->
Index Scan using t_clientes_pkey on t_clientes
(cost=0.00..62.87 rows=1847 width=38) (actual
time=0.000..14.000 rows=1847 loops=1)
                     ->  Sort  (cost=463.52..464.67 rows=459 width=41)
(actual time=67.000..69.000 rows=3473 loops=1)
                           Sort Key: t_alumnos.ext_cliente
                           ->  Merge Right Join  (cost=379.40..443.23
rows=459 width=41) (actual time=34.000..57.000
rows=3473 loops=1)
                                 Merge Cond: ("outer".id =
"inner".ext_alumno)
                                 ->  Index Scan using t_alumnos_pkey on
t_alumnos  (cost=0.00..52.35 rows=1850
width=8) (actual time=0.000..4.000
rows=1850 loops=1)
                                 ->  Sort  (cost=379.40..380.55 rows=459
width=41) (actual time=34.000..36.000
rows=3473 loops=1)
                                       Sort Key:
t_alumnos_en_semestre.ext_alumno
                                       ->  Nested Loop  (cost=0.00..359.11
rows=459 width=41) (actual
time=0.000..21.000 rows=3473
loops=1)
                                             ->  Index Scan using
i_t_alumnos_en_semestre__ext_ciclo
on t_alumnos_en_semestre
(cost=0.00..4.36 rows=102
width=8) (actual
time=0.000..1.000 rows=457
loops=1)
                                                   Index Cond: (ext_ciclo
= 2222)
                                             ->  Index Scan using
i_t_materias_en_tira__ext_alumno_en_semestre
on t_materias_en_tira
(cost=0.00..3.32 rows=12
width=41) (actual
time=0.004..0.026 rows=8
loops=457)
                                                   Index Cond:
(t_materias_en_tira.ext_alumno_en_semestre
= "outer".id)
               ->  Hash  (cost=8.77..8.77 rows=377 width=26) (actual
time=2.000..2.000 rows=0 loops=1)
                     ->  Seq Scan on t_materias  (cost=0.00..8.77 rows=377
width=26) (actual time=0.000..2.000 rows=377 loops=1)
         ->  Materialize  (cost=8.77..12.54 rows=377 width=22) (actual
time=0.000..0.163 rows=377 loops=3473)
               ->  Seq Scan on t_materias t_materias__equivalentes
(cost=0.00..8.77 rows=377 width=22) (actual
time=0.000..1.000 rows=377 loops=1)
   ->  Hash  (cost=2.07..2.07 rows=107 width=13) (actual time=0.000..0.000
rows=0 loops=1)
         ->  Seq Scan on t_grupos  (cost=0.00..2.07 rows=107 width=13)
(actual time=0.000..0.000 rows=107 loops=1)

 Total runtime: 18787.000 ms

SELECT count(*) FROM t_materias_en_tira;
 count
-------
 41059
(1 row)

SELECT count(*) FROM t_materias;
 count
-------
   377
(1 row)

SELECT version();;
                                        version
---------------------------------------------------------------------------------------
 PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1
(cygming special)
(1 row)


--
Octavio Alvarez.
E-mail: alvarezp@alvarezp.ods.org.

Agradezco que sus correos sean enviados siempre a esta dirección.



--
Octavio Alvarez.
E-mail: alvarezp@alvarezp.ods.org.

Agradezco que sus correos sean enviados siempre a esta dirección.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: severe performance issue with planner
Следующее
От: Stephan Szabo
Дата:
Сообщение: Re: Sorting when LEFT JOINING to 2 same tables, even