Обсуждение: Planner is ignoring index when view is queried but uses index when running underlying view's sql

Поиск
Список
Период
Сортировка

I've encountered a very strange result when using a view in a query VS the underlying sql of the view. When accessing the view I get a different query plan than running the sql that makes up the view. Using the view is 10x slower and generates a strange query plan that seems to ignore the existence of an index.


Two tables at play:

session and person. a person has zero or more sessions

The following indexes exist:

  • last_seen on person
  • person_id on person
  • person_id on session


SQL below:

 WITH cte AS (
          SELECT session_.person_id,  count(*) AS session_count
           FROM session
           GROUP BY session_.person_id
        )
 SELECT p.*     
 FROM person_ p
 LEFT JOIN cte sc ON (p.person_id = sc.person_id)
 ORDER BY p.last_seen DESC
 LIMIT 5


Below is the view which simply wraps the above sql (minus the order by and limit) in CREATE VIEW :

CREATE OR REPLACE VIEW person_view_ AS
    WITH cte AS (
            SELECT session.person_id,  count(*) AS session_count
             FROM session
             GROUP BY session.person_id
            )
     SELECT p.*     
     FROM person p
     LEFT JOIN cte sc ON (p.person_id = sc.person_id)

When I use the above view like so:

select P.*
from person_view as P
order by last_seen desc 
limit 5

I get a slow running query (over 10 seconds) with the following query plan:

Limit  (cost=374531.52..374531.59 rows=5 width=374)
  ->  Subquery Scan on p  (cost=374531.52..380250.77 rows=457540 width=374)
        ->  Sort  (cost=374531.52..375675.37 rows=457540 width=3329)
              Sort Key: p_1.last_seen
              CTE cte
                ->  GroupAggregate  (cost=0.42..37443.15 rows=138975 width=25)
                      Group Key: session_2f28b2dd_cfa9_40bb_89ed_25482f835838.person_id
                      ->  Index Only Scan using session_2f28b2dd_cfa9_40bb_89ed_25482f835838_person_id_idx on session_2f28b2dd_cfa9_40bb_89ed_25482f835838  (cost=0.42..32164.15 rows=777850 width=25)
              ->  Hash Right Join  (cost=60712.65..294071.52 rows=457540 width=3329)
                    Hash Cond: (sc.person_id = p_1.person_id)
                    ->  CTE Scan on cte sc  (cost=0.00..2779.50 rows=138975 width=108)
                    ->  Hash  (cost=54993.40..54993.40 rows=457540 width=3321)
                          ->  Seq Scan on person_2f28b2dd_cfa9_40bb_89ed_25482f835838 p_1  (cost=0.00..54993.40 rows=457540 width=3321)


But when I run the sql in the view directly (first code block in post) I get results in under a second with the following query plan:

Limit  (cost=37443.57..53083.10 rows=5 width=3329)
  CTE cte
    ->  GroupAggregate  (cost=0.42..37443.15 rows=138975 width=25)
          Group Key: session_2f28b2dd_cfa9_40bb_89ed_25482f835838.person_id
          ->  Index Only Scan using session_2f28b2dd_cfa9_40bb_89ed_25482f835838_person_id_idx on session_2f28b2dd_cfa9_40bb_89ed_25482f835838  (cost=0.42..32164.15 rows=777850 width=25)
  ->  Nested Loop Left Join  (cost=0.42..1431141564.79 rows=457540 width=3329)
        Join Filter: (p.person_id = sc.person_id)
        ->  Index Scan Backward using person_2f28b2dd_cfa9_40bb_89ed_25482f835838_last_seen_idx on person_2f28b2dd_cfa9_40bb_89ed_25482f835838 p  (cost=0.42..213565.14 rows=457540 width=3321)
        ->  CTE Scan on cte sc  (cost=0.00..2779.50 rows=138975 width=108)


If you look at both query plans you will see that when querying the view it performs Seq Scan on person but when using the underlying sql its does Index Scan Backward using person
Why are indexes being ignored by postgresql when using the view?

Regards,

Max Kremer

twitter: @maxtrialfire
skype: maxkremer


Max Kremer <mkremer@trialfire.com> writes:
> I've encountered a very strange result when using a view in a query VS the
> underlying sql of the view. When accessing the view I get a different query
> plan than running the sql that makes up the view. Using the view is 10x
> slower and generates a strange query plan that seems to ignore
> the existence of an index.

I believe the problem is that the WITH attached to the view's SELECT
prevents the view sub-select from being flattened into the outer query
(ie, this is another way in which WITH acts as an optimization fence).
So the sub-select is planned without awareness of the ORDER BY/LIMIT
that would encourage picking a suitably ordered join plan.

Try writing the CTE as a plain sub-select, instead.

(AFAIR, this behavior just stems from a lack of round tuits and not
any fundamental difficulty: the WITHs could perfectly well be hoisted
up to the outer query.  But don't hold your breath waiting for that
to happen.)

(BTW, what PG version is that?)

            regards, tom lane


Thanks Tom.
  So I ended up with the CTE as one of my optimization attempts. I started off with a regular sub-select so that's not it. However I have stumbled onto something - deleting the index (on last_seen) actually makes it faster!!! Somehow the combination of the index and query in the view was tripping up the query planner into selecting a bad plan? Is that possible?

Regards,

Max Kremer

twitter: @maxtrialfire
skype: maxkremer



On Tue, Jun 7, 2016 at 5:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Max Kremer <mkremer@trialfire.com> writes:
> I've encountered a very strange result when using a view in a query VS the
> underlying sql of the view. When accessing the view I get a different query
> plan than running the sql that makes up the view. Using the view is 10x
> slower and generates a strange query plan that seems to ignore
> the existence of an index.

I believe the problem is that the WITH attached to the view's SELECT
prevents the view sub-select from being flattened into the outer query
(ie, this is another way in which WITH acts as an optimization fence).
So the sub-select is planned without awareness of the ORDER BY/LIMIT
that would encourage picking a suitably ordered join plan.

Try writing the CTE as a plain sub-select, instead.

(AFAIR, this behavior just stems from a lack of round tuits and not
any fundamental difficulty: the WITHs could perfectly well be hoisted
up to the outer query.  But don't hold your breath waiting for that
to happen.)

(BTW, what PG version is that?)

                        regards, tom lane