Re: Need help optimizing this query

Поиск
Список
Период
Сортировка
От Pat Maddox
Тема Re: Need help optimizing this query
Дата
Msg-id 810a540e0707181448y135015edt4a861560db370f58@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Need help optimizing this query  (Michael Glaesemann <grzm@seespotcode.net>)
Ответы Re: Need help optimizing this query  ("Pat Maddox" <pergesu@gmail.com>)
Re: Need help optimizing this query  (Michael Glaesemann <grzm@seespotcode.net>)
Список pgsql-general
On 7/18/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Jul 18, 2007, at 16:12 , Pat Maddox wrote:
>
> > ERROR:  invalid reference to FROM-clause entry for table "video_views"
> > LINE 20: JOIN assets ON (video_views.video_id=videos.id)
> >                         ^
> > HINT:  There is an entry for table "video_views", but it cannot be
> > referenced from this part of the query.
>
> It's because I mismatched the JOIN clauses during my copy-and-paste :(
>
> > On 7/18/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> >> FROM video_views
> >> JOIN assets ON (video_views.video_id=videos.id)
> >> JOIN videos ON (video_views.asset_id=assets.id)
>
> This should be
>
> FROM video_views
> JOIN assets ON (video_views.asset_id=assets.id)
> JOIN videos ON (video_views.video_id=videos.id)
>
> Do you have the EXPLAIN ANALYE output of the query?
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>

For some reason the functions you wrote are giving me trouble (there's
a BIGINT involved, I tried changing the functions around but kept
having issues).  So here's the full query, hopefully formatted better:

SELECT
    SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0
END) AS count_hits_console,
    SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0
END) AS count_hits_remote,
    SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0
END) AS count_played_console,
    SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE
0 END) AS count_played_remote,
    SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1
ELSE 0 END) AS count_downloaded_console,
    SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1
ELSE 0 END) AS count_downloaded_remote,
    SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND
logged_in IS TRUE) THEN assets.size ELSE 0 END) as
download_size_console,
    SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND
logged_in IS FALSE) THEN assets.size ELSE 0 END) AS
download_size_remote,
    videos.id,
    videos.title,
    videos.guid
FROM video_views
JOIN assets ON (video_views.asset_id=assets.id)
JOIN videos on (video_views.video_id=videos.id)
WHERE videos.company_id=1
GROUP BY  videos.id,
          videos.title,
          videos.guid
ORDER BY count_hits_remote DESC
LIMIT 100



and here's the EXPLAIN ANALYZE output:

 Limit  (cost=127072.90..127073.12 rows=87 width=64) (actual
time=2636.560..2636.567 rows=20 loops=1)
   ->  Sort  (cost=127072.90..127073.12 rows=87 width=64) (actual
time=2636.558..2636.562 rows=20 loops=1)
         Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND
(video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END)
         ->  HashAggregate  (cost=127067.49..127070.10 rows=87
width=64) (actual time=2636.481..2636.506 rows=20 loops=1)
               ->  Hash Join  (cost=880.96..125995.46 rows=38983
width=64) (actual time=24.904..2635.719 rows=122 loops=1)
                     Hash Cond: (video_views.asset_id = assets.id)
                     ->  Hash Join  (cost=195.96..124433.01 rows=39009
width=60) (actual time=8.327..2618.982 rows=122 loops=1)
                           Hash Cond: (video_views.video_id = videos.id)
                           ->  Seq Scan on video_views
(cost=0.00..101352.70 rows=5998470 width=12) (actual
time=0.031..1410.231 rows=5998341 loops=1)
                           ->  Hash  (cost=194.87..194.87 rows=87
width=52) (actual time=1.001..1.001 rows=90 loops=1)
                                 ->  Bitmap Heap Scan on videos
(cost=4.93..194.87 rows=87 width=52) (actual time=0.111..0.840 rows=90
loops=1)
                                       Recheck Cond: (company_id = 1)
                                       ->  Bitmap Index Scan on
index_videos_on_company_id  (cost=0.00..4.90 rows=87 width=0) (actual
time=0.079..0.079 rows=90 loops=1)
                                             Index Cond: (company_id = 1)
                     ->  Hash  (cost=487.78..487.78 rows=15778
width=12) (actual time=16.527..16.527 rows=15778 loops=1)
                           ->  Seq Scan on assets  (cost=0.00..487.78
rows=15778 width=12) (actual time=0.023..9.601 rows=15778 loops=1)
 Total runtime: 2637.043 ms
(17 rows)


That one runs reasonably fine, because there are only 20 videos being
returned and a handful of video views associated with them.  In the
real query there are about 1k videos and a couple million views.  That
took about 80 minutes to run, according to logs.

Pat

В списке pgsql-general по дате отправления:

Предыдущее
От: Alan Hodgson
Дата:
Сообщение: Re: DBI/DBD::Pg and transactions
Следующее
От: "Roderick A. Anderson"
Дата:
Сообщение: Re: DBI/DBD::Pg and transactions