Обсуждение: Index Ignored Due To Use Of View

Поиск
Список
Период
Сортировка

Index Ignored Due To Use Of View

От
"Donald Fraser"
Дата:
PostgreSQL 8.3.14
OS: Linux Redhat 5.4

Note: I have used the same subject for this email taken from an email: Post=
ed 2011-02-24 13:29:22-08 by "David Johnston", because this seems to be a v=
ery similar observation.

Bug/Problem Summary:
We are using a simple query based on a simple view and the query optimizer =
is not choosing an index.=20
The same query without the view is using an index.
The same query on an almost identical view, but having either removed a sin=
gle column which was generated via a function call or replace the function =
call with equivalent SQL, then the query optimizer is choosing an index.

Other notes:
We observe the same behaviour on two separate databases running on separate=
 servers (both the same version).
The genetic query optimizer settings are all on defaults for these versions.

Details;
With the following view:

CREATE OR REPLACE VIEW vu_tbl_news_web AS=20
SELECT=20
 n.id, n.id_cmpy, n.id_news, n.id_newshdline,=20
 n.s_origcmpyname, n.s_hdline, n.s_news,=20
 n.b_amend, n.b_replace,=20
 n.dt_publish,=20
 n.tsv_hdline, n.tsv_news,=20
 n.b_hasorigdoc,=20
 (SELECT h.s_hdline FROM tbl_newshdline h WHERE h.id =3D n.id_newshdline LI=
MIT 1) AS s_hdlinetype,=20
 get_cmpyname(n.id_contrib) AS s_provider
FROM  tbl_news n
WHERE  n.dt_publish IS NOT NULL;

Query:
SELECT  n.id, n.id_cmpy, n.s_origcmpyname,=20
 n.s_hdline, n.dt_publish=20
FROM  vu_tbl_news_web n=20
ORDER BY n.dt_publish DESC=20
LIMIT 25

Analyse produces:
Limit  (cost=3D180017.37..180017.43 rows=3D25 width=3D80)
  ->  Sort  (cost=3D180017.37..180110.54 rows=3D37267 width=3D80)
        Sort Key: n.dt_publish
        ->  Subquery Scan n  (cost=3D0.00..178965.72 rows=3D37267 width=3D8=
0)
              ->  Seq Scan on tbl_news n  (cost=3D0.00..178593.05 rows=3D37=
267 width=3D1152)
                    Filter: (dt_publish IS NOT NULL)
                    SubPlan
                      ->  Limit  (cost=3D0.00..4.29 rows=3D1 width=3D22)
                            ->  Seq Scan on tbl_newshdline h  (cost=3D0.00.=
.4.29 rows=3D1 width=3D22)
                                  Filter: (id =3D $0)

Although the number of rows (37k) is small, there is a lot of data in some =
of the columns so, with a seq. scan its taking nearly 2 seconds. Compared t=
o 16ms when using an index.

Now if I perform the same query without using the view, such as:

SELECT  n.id, n.id_cmpy, n.s_origcmpyname,=20
 n.s_hdline, n.dt_publish=20
FROM  tbl_news n=20
WHERE  n.dt_publish IS NOT NULL=20
ORDER BY n.dt_publish DESC=20
LIMIT 25

Analyse produces:
Limit  (cost=3D0.00..21.34 rows=3D25 width=3D73)
  ->  Index Scan Backward using tbl_news_publish1_key on tbl_news n  (cost=
=3D0.00..31807.05 rows=3D37267 width=3D73)

Finally I discovered that if I remove a column from the view, that is gener=
ated via a function, then all works as expected?

CREATE OR REPLACE VIEW vu_tbl_news_web3 AS=20
SELECT=20
 n.id, n.id_cmpy, n.id_news, n.id_newshdline,=20
 n.s_origcmpyname, n.s_hdline, n.s_news,=20
 n.b_amend, n.b_replace,=20
 n.dt_publish, n.tsv_hdline, n.tsv_news,=20
 n.b_hasorigdoc,
 (SELECT h.s_hdline FROM tbl_newshdline h WHERE h.id =3D n.id_newshdline LI=
MIT 1) AS s_hdlinetype
FROM  tbl_news n
WHERE  n.dt_publish IS NOT NULL;

SELECT  n.id, n.id_cmpy, n.s_origcmpyname,=20
 n.s_hdline, n.dt_publish=20
FROM  vu_tbl_news_web3 n=20
ORDER BY n.dt_publish DESC=20
LIMIT 25

Analyse produces:
Limit  (cost=3D0.00..21.34 rows=3D25 width=3D73)
  ->  Index Scan Backward using tbl_news_publish1_key on tbl_news n  (cost=
=3D0.00..31807.05 rows=3D37267 width=3D73)

The definition of the function used is:
CREATE OR REPLACE FUNCTION get_cmpyname(integer)  RETURNS citext AS
$BODY$
DECLARE
 idcmpy ALIAS FOR $1;
 sumb citext;
 sres citext;
BEGIN
 SELECT INTO sumb, sres s_umbname, s_name FROM tbl_cmpy WHERE (id =3D idcmp=
y) LIMIT 1;
 IF length(sumb) > 0 THEN
  sres :=3D  sumb || ' - ' || sres;
 END IF;
 RETURN sres;
END $BODY$
LANGUAGE 'plpgsql' VOLATILE STRICT SECURITY DEFINER COST 100;

I also noted that if I replace the function call in the view with equivalen=
t SQL:
(SELECT (CASE WHEN length(c.s_umbname) > 0 THEN c.s_umbname || ' - '::citex=
t || c.s_name ELSE c.s_name END) AS s_cmpyname FROM tbl_cmpy c WHERE c.id =
=3D n.id_contrib LIMIT 1) AS s_provider
the problem also goes away.

Regards
Donald Fraser=

Re: Index Ignored Due To Use Of View

От
"Donald Fraser"
Дата:
----- Original Message -----=20
  Sent: Wednesday, March 23, 2011 12:50 PM
  Subject: Index Ignored Due To Use Of View


  PostgreSQL 8.3.14
  OS: Linux Redhat 5.4

  Note: I have used the same subject for this email taken from an email: Po=
sted 2011-02-24 13:29:22-08 by "David Johnston", because this seems to be a=
 very similar observation.

  Bug/Problem Summary:
  We are using a simple query based on a simple view and the query optimize=
r is not choosing an index.=20
  The same query without the view is using an index.
  The same query on an almost identical view, but having either removed a s=
ingle column which was generated via a function call or replace the functio=
n call with equivalent SQL, then the query optimizer is choosing an index.

I found the solution to the problem and it would therefore appear as though=
 this is not a bug!
If I change the function definition to be "STABLE" instead of "VOLATILE", t=
hen the problem goes away.

Regards
Donald Fraser=

Re: Index Ignored Due To Use Of View

От
Merlin Moncure
Дата:
On Wed, Mar 23, 2011 at 8:10 AM, Donald Fraser <postgres@kiwi-fraser.net> w=
rote:
> ----- Original Message -----
>
> Sent: Wednesday, March 23, 2011 12:50 PM
> Subject: Index Ignored Due To Use Of View
> PostgreSQL 8.3.14
> OS: Linux Redhat 5.4
>
> Note: I have used the same subject for this email taken from an email:
> Posted 2011-02-24 13:29:22-08 by "David Johnston", because this seems=A0t=
o be
> a very similar observation.
>
> Bug/Problem Summary:
> We are using a simple query based on a simple view and the query optimizer
> is not choosing an index.
> The same query without the view is using an index.
> The same query on an almost identical view, but having either removed a
> single column which was generated via a function call or replace the
> function call=A0with equivalent SQL, then=A0the query optimizer=A0is=A0ch=
oosing an
> index.
>
>
> I found the solution to the problem and it would therefore appear as thou=
gh
> this=A0is not a bug!
> If I change the function definition to be "STABLE" instead of "VOLATILE",
> then the problem goes away.

Also, it's bad practice to do order by/limit in the new definition
like that.  Leave them off, and do it in the calling query.  I would
write your function like this:

CREATE OR REPLACE FUNCTION get_cmpyname(integer)  RETURNS citext AS
$$
  SELECT CASE
    WHEN length(s_umbname) > 0 THEN s_umbname || '-' || s_res
    ELSE ''
  END FROM tbl_cmpy WHERE id =3D $1;
$$ LANGUAGE sql STABLE STRICT;

I took off the limit 1 because you probably don't need it.

merlin

Re: Index Ignored Due To Use Of View

От
Tom Lane
Дата:
"Donald Fraser" <postgres@kiwi-fraser.net> writes:
> I found the solution to the problem and it would therefore appear as though this is not a bug!
> If I change the function definition to be "STABLE" instead of "VOLATILE", then the problem goes away.

Yeah, that's intentional.  If the planner flattens a view that has a
function in its output list, things get a lot less predictable about
when the function will be called.  For a volatile function, that can
matter a lot.  We used to flatten such views anyway, but we got enough
complaints that we changed it.

http://archives.postgresql.org/pgsql-committers/2006-08/msg00358.php

            regards, tom lane