Обсуждение: Unexplained SQL behavior

Поиск
Список
Период
Сортировка

Unexplained SQL behavior

От
"JOE"
Дата:
I am debugging a performance problem with a view.  I have narrowed down the problem to when I adeed columns to my view.
In the examples below I have a view with 10 columns.  I run an explain plan and it uses the primary key of the driver table (enrollment table) as to be expected and executes in 1.86 msec.  When I added another columns (and it is a null value casted to a timestamp - LASTACCESSED) the execution plan now uses a totally different plan and now runs for 8622.16 msec.  Anyone have any ideas or explination that may help understand??  --- thanks
 
drop view lv2 \g
create view lv2
( LEARNERID, ENROLLMENTID, GUID, EXPIREDDATE, FIRSTACCESSDATE, DISPLAYORDER,
     ASSESSMENTSSTARTED,COMPLETE, EXPIRED, TITLE ) AS
select distinct enr.learnerid, enr.enrollmentid, lov.ccguid, enr.enddate, null::timestamp without time zone,
     lov.displayorder,   'false'::character varying,   'F'::character varying,
     case when enr.endDate::timestamp without time zone > now()::timestamp without time zone then 'false' else 'true' end as "expired",
     lov.displaytitlemeid
  from enrollment enr, learningObjectView lov
where enr.productid = lov.productId and not exists
        (select null
          from learnerAttendance la, learnerLaunchableAttendance lla
        where enr.enrollmentID = la.enrollmentID and la.learnerAttendanceID = lla.learnerAttendanceID and lla.launchableGUID = lov.ccGuid) \g
explain analyze select * from lv2 where enrollmentid = 21462\g
NOTICE:  QUERY PLAN:
Subquery Scan lv2  (cost=264.90..264.98 rows=1 width=77) (actual time=1.53..1.61 rows=12 loops=1)
  ->  Unique  (cost=264.90..264.98 rows=1 width=77) (actual time=1.52..1.55 rows=12 loops=1)
        ->  Sort  (cost=264.90..264.90 rows=3 width=77) (actual time=1.52..1.53 rows=12 loops=1)
              ->  Nested Loop  (cost=0.00..264.87 rows=3 width=77) (actual time=0.57..1.20 rows=12 loops=1)
                    ->  Index Scan using enrollment_pk on enrollment enr  (cost=0.00..3.02 rows=1 width=20) (actual time=0.21..0.21 rows=1 loops=1)
                    ->  Index Scan using i_learningobjectview_productid on learningobjectview lov  (cost=0.00..25.51 rows=6 width=57) (actual time=0.19..0.30 rows=13 loops=1)
                    SubPlan
                      ->  Nested Loop  (cost=0.00..36.69 rows=1 width=8) (actual time=0.04..0.04 rows=0 loops=13)
                            ->  Index Scan using i_learnerlaunchattend_lguid on learnerlaunchableattendance lla  (cost=0.00..20.19 rows=5 width=4) (actual time=0.02..0.02 rows=0 loops=13)
                            ->  Index Scan using learnerattendance_pk on learnerattendance la  (cost=0.00..3.02 rows=1 width=4) (actual time=0.16..0.16 rows=1 loops=1)
Total runtime: 1.86 msec
 
NOW I ADDED ANOTHER NULL column (LASTACCESSED field)
 
drop view lv2 \g
create view lv2
( LEARNERID, ENROLLMENTID, GUID, EXPIREDDATE, FIRSTACCESSDATE, LASTACCESSED, DISPLAYORDER,
     ASSESSMENTSSTARTED,COMPLETE, EXPIRED, TITLE ) AS
select distinct enr.learnerid, enr.enrollmentid, lov.ccguid, enr.enddate, null::timestamp without time zone, null::timestamp without time zone,
    lov.displayorder,   'false'::character varying,   'F'::character varying,
     case when enr.endDate::timestamp without time zone > now()::timestamp without time zone then 'false' else 'true' end as "expired",
     lov.displaytitlemeid
  from enrollment enr, learningObjectView lov
where enr.productid = lov.productId and not exists
        (select null
          from learnerAttendance la, learnerLaunchableAttendance lla
        where enr.enrollmentID = la.enrollmentID and la.learnerAttendanceID = lla.learnerAttendanceID and lla.launchableGUID = lov.ccGuid) \g
explain analyze select * from lv2 where enrollmentid = 21462\g
NOTICE:  QUERY PLAN:
Subquery Scan lv2  (cost=1968402.36..1969071.62 rows=2677 width=77) (actual time=8273.16..8516.75 rows=12 loops=1)
  ->  Unique  (cost=1968402.36..1969071.62 rows=2677 width=77) (actual time=8273.08..8458.33 rows=63048 loops=1)
        ->  Sort  (cost=1968402.36..1968402.36 rows=26770 width=77) (actual time=8273.07..8343.14 rows=63048 loops=1)
              ->  Merge Join  (cost=0.00..1965714.31 rows=26770 width=77) (actual time=1.28..7226.99 rows=63048 loops=1)
                    ->  Index Scan using i_enrollment_product on enrollment enr  (cost=0.00..608.96 rows=8746 width=20) (actual time=0.17..54.26 rows=8746 loops=1)
                    ->  Index Scan using i_learningobjectview_productid on learningobjectview lov  (cost=0.00..207.57 rows=3278 width=57) (actual time=0.15..333.79 rows=67003 loops=1)
                    SubPlan
                      ->  Nested Loop  (cost=0.00..36.69 rows=1 width=8) (actual time=0.09..0.09 rows=0 loops=64942)
                            ->  Index Scan using i_learnerlaunchattend_lguid on learnerlaunchableattendance lla  (cost=0.00..20.19 rows=5 width=4) (actual time=0.01..0.03 rows=5 loops=64942)
                            ->  Index Scan using learnerattendance_pk on learnerattendance la  (cost=0.00..3.02 rows=1 width=4) (actual time=0.01..0.01 rows=0 loops=309102)
Total runtime: 8622.16 msec
 

Re: Unexplained SQL behavior

От
Tom Lane
Дата:
"JOE" <joe@piscitella.com> writes:
> In the examples below I have a view with 10 columns.  I run an explain plan=
>  and it uses the primary key of the driver table (enrollment table) as to b=
> e expected and executes in 1.86 msec.  When I added another columns (and it=
>  is a null value casted to a timestamp - LASTACCESSED) the execution plan n=
> ow uses a totally different plan and now runs for 8622.16 msec.  Anyone hav=
> e any ideas or explination that may help understand??  --- thanks

Seems odd to me too, but it's difficult to speculate without being able
to try to reproduce the example.  Could we see the full declarations of
the tables involved?  (pg_dump -s output would be good.)
        regards, tom lane


Re: Unexplained SQL behavior

От
Tom Lane
Дата:
"JOE" <joe@piscitella.com> writes:
> I am debugging a performance problem with a view.  I have narrowed down the=
>  problem to when I adeed columns to my view.

Okay, I see the problem: it is in fact a bug, and one that's been around
for awhile.  (Curious no one's noticed before.)  When you wrote

> select distinct ... ,
>     null::timestamp without time zone,
>     null::timestamp without time zone, ...

the parser felt it could get away with creating only one sort column for
these two entries.  This is logically a valid optimization, but it
confused later stages of the system into thinking you'd written a
DISTINCT ON clause rather than plain DISTINCT.  And that suppresses an
important optimization, namely pushing down the outer query's WHERE
clause into the subselect.  (You'd also find that psql's \d would
display the view definition rather oddly.)

This is a bug and I will fix it for 7.3, but in the meantime the answer
is "don't do that".  I imagine this particular view definition is just a
placeholder until you get around to filling in non-null values for those
columns?  The problem will go away as soon as these two view columns
aren't obviously equal.  If you really need a view that works just like
this, you can work around the bug by making the null columns trivially
different, perhaps

> select distinct ... ,
>     null::timestamp with time zone::timestamp without time zone,
>     null::timestamp without time zone, ...
        regards, tom lane