Обсуждение: [GENERAL] select on view shows different plan than select on table

Поиск
Список
Период
Сортировка

[GENERAL] select on view shows different plan than select on table

От
Bernd Lehmkuhl
Дата:
Hi list,

I have a view defined as:
CREATE VIEW kkm_ergebnisse.v_protokoll_details_mit_dauer AS
SELECT
   id,
   schritt,
   objekt_typ,
   objekt_id,
   zeit,
   rechenweg_thema,
   rechenweg_variante,
   rechenweg_stoffgruppe,
   formel_inhalt,
   formel_stoff,
   formel_variablen,
   ergebnis_variable,
   ergebnis_wert,
   status,
   id_rechenlauf,
   formel_id,
   formel_name,
   formel_variante,
   ergebnis_variable_einheit,
   zeitpunkt,
   DATE_PART('seconds', zeitpunkt::time without time zone -
LAG(zeitpunkt::time without time zone) OVER (ORDER BY id)) * 1000 AS
dauer_in_ms
FROM
   kkm_ergebnisse.t_protokoll_details ;

Table kkm_ergebnisse.t_protokoll_details is defined like this:
CREATE TABLE kkm_ergebnisse.t_protokoll_details
(
   id serial NOT NULL,
   schritt integer,
   objekt_typ smallint NOT NULL,
   objekt_id integer,
   zeit integer,
   rechenweg_thema character varying(256),
   rechenweg_variante character varying(256),
   rechenweg_stoffgruppe character varying(256),
   formel_inhalt character varying(4000),
   formel_stoff character varying(256),
   formel_variablen character varying(4000),
   ergebnis_variable character varying(256),
   ergebnis_wert double precision,
   status character varying(4000),
   id_rechenlauf integer NOT NULL,
   formel_id integer,
   formel_name character varying(256),
   formel_variante character varying(256),
   ergebnis_variable_einheit character varying(255),
   zeitpunkt time with time zone DEFAULT clock_timestamp(),
   CONSTRAINT pk PRIMARY KEY (id),
   CONSTRAINT fk_rechenlauf FOREIGN KEY (id_rechenlauf)
       REFERENCES kkm_ergebnisse.t_rechenlaeufe (id) MATCH SIMPLE
       ON UPDATE CASCADE ON DELETE CASCADE,
   CONSTRAINT ck_protokoll_details_objekt_typ CHECK (objekt_typ = ANY
(ARRAY[1, 2]))
);

Why do I get different execution plans when querying the view like this:
SELECT
   *
FROM
   kkm_ergebnisse.v_protokoll_details_mit_dauer
WHERE
   id_rechenlauf = 123
ORDER BY
   schritt ;

opposed to querying against the definition of the view?
SELECT
   id,
   schritt,
   objekt_typ,
   objekt_id,
   zeit,
   rechenweg_thema,
   rechenweg_variante,
   rechenweg_stoffgruppe,
   formel_inhalt,
   formel_stoff,
   formel_variablen,
   ergebnis_variable,
   ergebnis_wert,
   status,
   id_rechenlauf,
   formel_id,
   formel_name,
   formel_variante,
   ergebnis_variable_einheit,
   zeitpunkt,
   DATE_PART('seconds'::text, zeitpunkt::time without time zone -
LAG(zeitpunkt::time without time zone) OVER (ORDER BY id)) *
1000::double precision AS dauer_in_ms
FROM
   kkm_ergebnisse.t_protokoll_details
WHERE
   id_rechenlauf = 123
ORDER BY
   schritt ;

Execution plan query against view (slow):
Sort  (cost=570776.54..570779.14 rows=1037 width=347) (actual
time=4067.919..4068.133 rows=11250 loops=1)
   Output: v_protokoll_details_mit_dauer.id,
v_protokoll_details_mit_dauer.schritt,
v_protokoll_details_mit_dauer.objekt_typ,
v_protokoll_details_mit_dauer.objekt_id,
v_protokoll_details_mit_dauer.zeit,
v_protokoll_details_mit_dauer.rechenweg_thema, v_proto (...)
   Sort Key: v_protokoll_details_mit_dauer.schritt
   Sort Method: quicksort  Memory: 6185kB
   Buffers: shared hit=30925 read=166050
   ->  Subquery Scan on v_protokoll_details_mit_dauer
(cost=0.43..570724.60 rows=1037 width=347) (actual
time=4038.722..4062.481 rows=11250 loops=1)
         Output: v_protokoll_details_mit_dauer.id,
v_protokoll_details_mit_dauer.schritt,
v_protokoll_details_mit_dauer.objekt_typ,
v_protokoll_details_mit_dauer.objekt_id,
v_protokoll_details_mit_dauer.zeit,
v_protokoll_details_mit_dauer.rechenweg_thema, v (...)
         Filter: (v_protokoll_details_mit_dauer.id_rechenlauf = 123)
         Rows Removed by Filter: 3091203
         Buffers: shared hit=30925 read=166050
         ->  WindowAgg  (cost=0.43..531778.35 rows=3115700 width=339)
(actual time=0.127..3839.099 rows=3102453 loops=1)
               Output: pd.id, pd.schritt, pd.objekt_typ, pd.objekt_id,
pd.zeit, pd.rechenweg_thema, pd.rechenweg_variante,
pd.rechenweg_stoffgruppe, pd.formel_inhalt, pd.formel_stoff,
pd.formel_variablen, pd.ergebnis_variable, pd.ergebnis_wert, pd.status,
p (...)
               Buffers: shared hit=30925 read=166050
               ->  Index Scan using pk on
kkm_ergebnisse.t_protokoll_details pd  (cost=0.43..446096.60
rows=3115700 width=339) (actual time=0.109..1682.884 rows=3102453 loops=1)
                     Output: pd.id, pd.schritt, pd.objekt_typ,
pd.objekt_id, pd.zeit, pd.rechenweg_thema, pd.rechenweg_variante,
pd.rechenweg_stoffgruppe, pd.formel_inhalt, pd.formel_stoff,
pd.formel_variablen, pd.ergebnis_variable, pd.ergebnis_wert, pd.sta (...)
                     Buffers: shared hit=30925 read=166050
Planning time: 0.323 ms
Execution time: 4069.073 ms

Execution plan of query against table (fast):
Sort  (cost=305.58..308.17 rows=1037 width=339) (actual
time=51.558..52.140 rows=11250 loops=1)
   Output: id, schritt, objekt_typ, objekt_id, zeit, rechenweg_thema,
rechenweg_variante, rechenweg_stoffgruppe, formel_inhalt, formel_stoff,
formel_variablen, ergebnis_variable, ergebnis_wert, status,
id_rechenlauf, formel_id, formel_name, formel_variante, (...)
   Sort Key: t_protokoll_details.schritt
   Sort Method: quicksort  Memory: 6185kB
   Buffers: shared hit=687
   ->  WindowAgg  (cost=222.52..253.63 rows=1037 width=339) (actual
time=15.925..37.210 rows=11250 loops=1)
         Output: id, schritt, objekt_typ, objekt_id, zeit,
rechenweg_thema, rechenweg_variante, rechenweg_stoffgruppe,
formel_inhalt, formel_stoff, formel_variablen, ergebnis_variable,
ergebnis_wert, status, id_rechenlauf, formel_id, formel_name, formel_var
(...)
         Buffers: shared hit=687
         ->  Sort  (cost=222.52..225.12 rows=1037 width=339) (actual
time=15.905..16.660 rows=11250 loops=1)
               Output: id, schritt, objekt_typ, objekt_id, zeit,
rechenweg_thema, rechenweg_variante, rechenweg_stoffgruppe,
formel_inhalt, formel_stoff, formel_variablen, ergebnis_variable,
ergebnis_wert, status, id_rechenlauf, formel_id, formel_name, form (...)
               Sort Key: t_protokoll_details.id
               Sort Method: quicksort  Memory: 6185kB
               Buffers: shared hit=687
               ->  Index Scan using fki_protokoll_details_id_rechenlauf
on kkm_ergebnisse.t_protokoll_details  (cost=0.43..170.58 rows=1037
width=339) (actual time=0.037..7.281 rows=11250 loops=1)
                     Output: id, schritt, objekt_typ, objekt_id, zeit,
rechenweg_thema, rechenweg_variante, rechenweg_stoffgruppe,
formel_inhalt, formel_stoff, formel_variablen, ergebnis_variable,
ergebnis_wert, status, id_rechenlauf, formel_id, formel_name (...)
                     Index Cond: (t_protokoll_details.id_rechenlauf = 123)
                     Buffers: shared hit=687
Planning time: 0.229 ms
Execution time: 55.245 ms

Thanks for any insight...


Re: [GENERAL] select on view shows different plan than select on table

От
"David G. Johnston"
Дата:
On Wednesday, April 12, 2017, Bernd Lehmkuhl <bernd.lehmkuhl@mailbox.org> wrote:
Why do I get different execution plans when querying the view like this:
SELECT
  *
FROM
  kkm_ergebnisse.v_protokoll_details_mit_dauer
WHERE
  id_rechenlauf = 123
ORDER BY
  schritt ;

opposed to querying against the definition of the view?
SELECT
  id,
  schritt,
  objekt_typ,
  objekt_id,
  zeit,
  rechenweg_thema,
  rechenweg_variante,
  rechenweg_stoffgruppe,
  formel_inhalt,
  formel_stoff,
  formel_variablen,
  ergebnis_variable,
  ergebnis_wert,
  status,
  id_rechenlauf,
  formel_id,
  formel_name,
  formel_variante,
  ergebnis_variable_einheit,
  zeitpunkt,
  DATE_PART('seconds'::text, zeitpunkt::time without time zone - LAG(zeitpunkt::time without time zone) OVER (ORDER BY id)) * 1000::double precision AS dauer_in_ms
FROM
  kkm_ergebnisse.t_protokoll_details
WHERE
  id_rechenlauf = 123
ORDER BY
  schritt ;


The window aggregate defined in the view is an optimization fence which prevents the view from having the where clause of the user pushed down.  Thus you are computing lag over all three million plus records in the table before throwing away most of them.  When done inline the partition seen is smaller and so is evaluated more quickly.

David J.