Обсуждение: Simple view confuses PostgreSQL query planning

Поиск
Список
Период
Сортировка

Simple view confuses PostgreSQL query planning

От
Manuel Sugawara
Дата:
[I sent this message two months ago and got no answer. I'm resending
now hoping to get some feedback.]

Hi,

I'm facing a wired problem. When I left join two tables PostgreSQL is
able to do it fast using the corresponding indices, However, if I
define a simple view (to format the data) on one of the tables, the
left join does not use the indices. Is something wrong here?

The two tables:

****************************************
\d regacd.profesor                                         Tabla "regacd.profesor"      Columna        |
Tipo            |                     Modificadores 
----------------------+-----------------------------+-------------------------------------------------------id
        | integer                     | not null default nextval('regacd.profesor_sid'::text)grupo_id             |
integer                    | not nulltipo_id              | "char"                      | not nullcargo_id
|integer                     | not nullacadémico_id         | integer                     |última_actualización |
timestampwithout time zone | default now() 
Índices:   "profesor_pkey" llave primaria, btree (id)   "profesor_académico" btree ("académico_id")   "profesor_grupo"
btree(grupo_id) 
Restricciones de llave foránea:   "CargoTipoId" FOREIGN KEY (cargo_id, tipo_id) REFERENCES cargo(id, tipo_id)
"IdentificadorGrupoTipoId"FOREIGN KEY (grupo_id, tipo_id) REFERENCES regacd.grupo(id, tipo_id)   "$1" FOREIGN KEY
("académico_id")REFERENCES personal(id) 
Triggers:   "profesor_última_actualización" BEFORE INSERT OR UPDATE ON regacd.profesor FOR EACH ROW EXECUTE PROCEDURE
"profesor_última_actualización"()  "propaga_actualización_profesor" AFTER INSERT OR UPDATE ON regacd.profesor FOR EACH
ROWEXECUTE PROCEDURE "propaga_actualización_profesor"()   "update_datos_académico" BEFORE INSERT OR UPDATE ON
regacd.profesorFOR EACH ROW EXECUTE PROCEDURE "update_datos_académico"() 

\d ordinario.horario                                         Tabla "ordinario.horario"      Columna        |
Tipo            |                      Modificadores 
----------------------+-----------------------------+---------------------------------------------------------id
          | integer                     | not null default nextval('ordinario.horario_sid'::text)profesor_id          |
integer                    |lu                   | boolean                     | not null default falsema
   | boolean                     | not null default falsemi                   | boolean                     | not null
defaultfalseju                   | boolean                     | not null default falsevi                   | boolean
                 | not null default falsesá                   | boolean                     | not null default
falsehora_inicial        | time without time zone      |hora_final           | time without time zone      |salón_id
        | integer                     |nota                 | text                        |última_actualización |
timestampwithout time zone | default now() 
Índices:   "horario_pkey" llave primaria, btree (id)   "horario_profesor" btree (profesor_id)
Restricciones de llave foránea:   "$2" FOREIGN KEY ("salón_id") REFERENCES "salón"(id)   "$1" FOREIGN KEY (profesor_id)
REFERENCESregacd.profesor(id) ON UPDATE CASCADE ON DELETE CASCADE 
Triggers:   "horario_última_actualización" BEFORE INSERT OR UPDATE ON ordinario.horario FOR EACH ROW EXECUTE PROCEDURE
"horario_última_actualización"()  "propaga_actualización_horario" AFTER INSERT OR UPDATE ON ordinario.horario FOR EACH
ROWEXECUTE PROCEDURE "propaga_actualización_horario"() 
****************************************

Now, a left join query of the tables leads a nice and fast plan:

explain analyze select * from regacd.profesor p left join ordinario.horario h on (h.profesor_id = p.id) where
p.grupo_idIN (129314, 129315, 129316, 129317, 129318, 129319, 129320, 129321, 129322);
                                                                                         QUERY PLAN

                                                        

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Nested
LoopLeft Join  (cost=0.00..150.79 rows=18 width=78) (actual time=0.400..1.232 rows=19 loops=1)  ->  Index Scan using
profesor_grupo,profesor_grupo, profesor_grupo, profesor_grupo, profesor_grupo, profesor_grupo, profesor_grupo,
profesor_grupo,profesor_grupo on profesor p  (cost=0.00..96.50 rows=18 width=25) (actual time=0.231..0.499 rows=19
loops=1)       Index Cond: ((grupo_id = 129314) OR (grupo_id = 129315) OR (grupo_id = 129316) OR (grupo_id = 129317) OR
(grupo_id= 129318) OR (grupo_id = 129319) OR (grupo_id = 129320) OR (grupo_id = 129321) OR (grupo_id = 129322))  ->
IndexScan using horario_profesor on horario h  (cost=0.00..3.01 rows=1 width=53) (actual time=0.020..0.023 rows=1
loops=19)       Index Cond: (h.profesor_id = "outer".id)Total runtime: 1.542 ms 
(6 filas)

However, if I define a simple view on ordinario.horario

\d vordinario.horario               Vista "vordinario.horario"    Columna      |          Tipo          | Modificadores
------------------+------------------------+---------------horario_id       | integer                |profesor_id
|integer                |lu               | boolean                |ma               | boolean                |mi
       | boolean                |ju               | boolean                |vi               | boolean
|sá              | boolean                |días_txt         | text                   |hora_inicial     | time without
timezone |hora_final       | time without time zone |hora_inicial_txt | text                   |hora_final_txt   | text
                 |salón_id         | integer                |salón_txt        | text                   |horario_nota
| text                   | 
Definición de vista:SELECT h.id AS horario_id, h.profesor_id, h.lu, h.ma, h.mi, h.ju, h.vi, h."sá", "días_atxt"(h.lu,
h.ma,h.mi, h.ju, h.vi, h."sá") AS "días_txt", h.hora_inicial, h.hora_final,       CASE           WHEN h.hora_inicial IS
NULLTHEN ''::text           WHEN date_part('minute'::text, h.hora_inicial) = 0::double precision THEN
date_part('hour'::text,h.hora_inicial)::text           ELSE (date_part('hour'::text, h.hora_inicial)::text ||
':'::text)|| to_char(date_part('minute'::text, h.hora_inicial), 'fm00'::text)       END AS hora_inicial_txt,       CASE
         WHEN h.hora_final IS NULL THEN ''::text           WHEN date_part('minute'::text, h.hora_final) = 0::double
precisionTHEN date_part('hour'::text, h.hora_final)::text           ELSE (date_part('hour'::text, h.hora_final)::text
||':'::text) || to_char(date_part('minute'::text, h.hora_final), 'fm00'::text)       END AS hora_final_txt,
h."salón_id","salón_id_atxt"(h."salón_id") AS "salón_txt", h.nota AS horario_nota  FROM ordinario.horario h; 

The left join gives a secuential scan, no matter what:
explain analyze select * from regacd.profesor p left join vordinario.horario h on (h.profesor_id = p.id) where
p.grupo_idIN (129314, 129315, 129316, 129317, 129318, 129319, 129320, 129321, 129322);
                                                                                            QUERY PLAN

                                                           

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Merge
LeftJoin  (cost=2036.01..2107.72 rows=18 width=219) (actual time=1610.715..1611.626 rows=19 loops=1)  Merge Cond:
("outer".id= "inner".profesor_id)  ->  Sort  (cost=96.88..96.92 rows=18 width=25) (actual time=0.299..0.325 rows=19
loops=1)       Sort Key: p.id        ->  Index Scan using profesor_grupo, profesor_grupo, profesor_grupo,
profesor_grupo,profesor_grupo, profesor_grupo, profesor_grupo, profesor_grupo, profesor_grupo on profesor p
(cost=0.00..96.50rows=18 width=25) (actual time=0.062..0.220 rows=19 loops=1)              Index Cond: ((grupo_id =
129314)OR (grupo_id = 129315) OR (grupo_id = 129316) OR (grupo_id = 129317) OR (grupo_id = 129318) OR (grupo_id =
129319)OR (grupo_id = 129320) OR (grupo_id = 129321) OR (grupo_id = 129322))  ->  Sort  (cost=1939.13..1974.94
rows=14323width=194) (actual time=1581.038..1585.742 rows=13900 loops=1)        Sort Key: h.profesor_id        ->
SubqueryScan h  (cost=0.00..950.41 rows=14323 width=194) (actual time=1.180..1549.464 rows=14323 loops=1)
-> Seq Scan on horario h  (cost=0.00..936.09 rows=14323 width=45) (actual time=1.160..1450.191 rows=14323 loops=1)Total
runtime:1616.958 ms 
(11 filas)

Regards,
Manuel.

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
              http://www.postgresql.org/docs/faqs/FAQ.html



Re: Simple view confuses PostgreSQL query planning

От
Tom Lane
Дата:
Manuel Sugawara <masm@fciencias.unam.mx> writes:
> I'm facing a wired problem. When I left join two tables PostgreSQL is
> able to do it fast using the corresponding indices, However, if I
> define a simple view (to format the data) on one of the tables, the
> left join does not use the indices. Is something wrong here?

> Definici�n de vista:
>  SELECT h.id AS horario_id, h.profesor_id, h.lu, h.ma, h.mi, h.ju, h.vi, h."s�", "d�as_atxt"(h.lu, h.ma, h.mi, h.ju,
h.vi,h."s�") AS "d�as_txt", h.hora_inicial, h.hora_final,
 
>         CASE
>             WHEN h.hora_inicial IS NULL THEN ''::text
>             WHEN date_part('minute'::text, h.hora_inicial) = 0::double precision THEN date_part('hour'::text,
h.hora_inicial)::text
>             ELSE (date_part('hour'::text, h.hora_inicial)::text || ':'::text) || to_char(date_part('minute'::text,
h.hora_inicial),'fm00'::text)
 
>         END AS hora_inicial_txt,
>         CASE
>             WHEN h.hora_final IS NULL THEN ''::text
>             WHEN date_part('minute'::text, h.hora_final) = 0::double precision THEN date_part('hour'::text,
h.hora_final)::text
>             ELSE (date_part('hour'::text, h.hora_final)::text || ':'::text) || to_char(date_part('minute'::text,
h.hora_final),'fm00'::text)
 
>         END AS hora_final_txt, h."sal�n_id", "sal�n_id_atxt"(h."sal�n_id") AS "sal�n_txt", h.nota AS horario_nota
>    FROM ordinario.horario h;

I think the issue is that the subquery isn't getting flattened, because
of this test:
       /*        * If we are inside an outer join, only pull up subqueries whose        * targetlists are nullable ---
otherwisesubstituting their tlist        * entries for upper Var references would do the wrong thing (the        *
resultswouldn't become NULL when they're supposed to).        *        * XXX This could be improved by generating
pseudo-variablesfor        * such expressions; we'd have to figure out how to get the pseudo-        * variables
evaluatedat the right place in the modified plan        * tree. Fix it someday.        */       if (...
(!below_outer_join|| has_nullable_targetlist(subquery)))
 

has_nullable_targetlist() is returning false because of the CASE
expressions.  Its analysis could be more detailed, but in point of fact
with this particular definition the targetlist *isn't* nullable ---
the first arm of each CASE will yield a non-null result for null input.
Get rid of the CASEs (perhaps you could wrap them into functions
declared STRICT) and the view would be flattenable.

The reason we need this is shown in this old bug report:
http://archives.postgresql.org/pgsql-bugs/2001-04/msg00223.php
If the view did get flattened then the CASE outputs would give
wrong answers --- nonnull when they should be null --- just as Victor
described for constants.

The general fix mentioned in the comment is still a long way off.
        regards, tom lane


Re: Simple view confuses PostgreSQL query planning

От
Manuel Sugawara
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Get rid of the CASEs (perhaps you could wrap them into functions
> declared STRICT) and the view would be flattenable.

Will do that. Thanks.

Regards,
Manuel.