Re: improving performance of UNION and ORDER BY

Поиск
Список
Период
Сортировка
От Chris Gamache
Тема Re: improving performance of UNION and ORDER BY
Дата
Msg-id 20020304142111.21267.qmail@web13807.mail.yahoo.com
обсуждение исходный текст
Ответ на Re: improving performance of UNION and ORDER BY  (Jean-Luc Lachance <jllachan@nsd.ca>)
Ответы Re: improving performance of UNION and ORDER BY  (Darren Ferguson <darren@crystalballinc.com>)
Re: improving performance of UNION and ORDER BY  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Список pgsql-general
New Query...

(select
   a.username as "User",
   a.trans_date as "Date",
   tl.longtype as "Type",
   a.trans_data as "Query Data",
   a.trans_charge as "Charged",
   a.user_reference_id as "Reference ID"
 from a_trans_log a, addtypelong tl
 where (
   (tl.shorttype=a.trans_type) and
   (a.trans_date >= '12/31/01'::TIMESTAMP)
) order by 4 desc, 2 limit 20)
union
(select
   b.username as "User",
   b.trans_date as "Date",
   tl.longtype as "Type",
   b.trans_data as "Query Data",
   b.trans_charge as "Charged",
   b.user_reference_id as "Reference ID"
 from b_trans_log b, addtypelong tl
 where (
   (tl.shorttype=b.trans_type) and
   (b.trans_date >= '12/31/01'::TIMESTAMP)
) order by 4 desc, 2 limit 20)
order by 4 desc, 2 limit 20;

Limit  (cost=12674.47..12674.47 rows=4 width=84)
  ->  Sort  (cost=12674.47..12674.47 rows=4 width=84)
        ->  Unique  (cost=12673.83..12674.43 rows=4 width=84)
              ->  Sort  (cost=12673.83..12673.83 rows=40 width=84)
                    ->  Append  (cost=12055.58..12672.77 rows=40 width=84)
                          ->  Subquery Scan *SELECT* 1
(cost=12055.58..12055.58 rows=20 width=84)
                                ->  Limit  (cost=12055.58..12055.58 rows=20
width=84)
                                      ->  Sort  (cost=12055.58..12055.58
rows=23724 width=84)
                                            ->  Hash Join  (cost=1.20..9674.89
rows=23724 width=84)
                                                  ->  Seq Scan on a_trans_log a
 (cost=0.00..8695.30 rows=24455 width=60)
                                                  ->  Hash  (cost=1.16..1.16
rows=16 width=24)
                                                        ->  Seq Scan on
addtypelong tl  (cost=0.00..1.16 rows=16 width=24)
                          ->  Subquery Scan *SELECT* 2  (cost=617.19..617.19
rows=20 width=84)
                                ->  Limit  (cost=617.19..617.19 rows=20
width=84)
                                      ->  Sort  (cost=617.19..617.19 rows=2462
width=84)
                                            ->  Hash Join  (cost=1.20..478.50
rows=2462 width=84)
                                                  ->  Seq Scan on b_trans_log b
 (cost=0.00..378.61 rows=2462 width=60)
                                                  ->  Hash  (cost=1.16..1.16
rows=16 width=24)
                                                        ->  Seq Scan on
addtypelong tl  (cost=0.00..1.16 rows=16 width=24)

That cut the query down to 6 seconds. About 5 seconds longer than I would like
it. I'll take any performance increase, tho. I tried SET enable_seqscan=off;
And it takes 8 seconds. Strange indeed.

The reason I don't put the two tables together is that the tables don't have
the exact same structure... I suppose I could merge the two, but it would be
counterintuitive to logically apply some fields in table a to data collected
for table b. Plus, I do so many more operations on the single tables than the
joined tables that it wouldn't make good use of programming time to rewrite all
the single table ops.

I know... excuses, excuses! Thanks for the input. Any more ideas?

CG

--- Jean-Luc Lachance <jllachan@nsd.ca> wrote:
> Chris,
>
> I believe you can cut down the processing time by including the order by
> and limit in each select:
>
> (SELECT ... ORDER BY 2 DESC, 4 LIMIT 20)
> union
> (SELECT ... ORDER BY 2 DESC, 4 LIMIT 20)
> ORDER BY 2 DESC, 4 LIMIT 20
>
> Give it a try and let me know
>
>
> JLL
>
> P.S.
>
> Question:  Why not have a single table with a field log_type in ( 'A',
> 'B') ???
>
>
>
> Chris Gamache wrote:
> >
> > Three tables...
> > [...]
> >
> > I imagine the combination of UNION and ORDER BY causes the problem, since
> > Postgres has to locate all the rows that match the search criteria, merge
> them,
> > order them, then return the top 20...
> >
> > Any suggestions? Did I forget to provide any data that would make things
> > clearer?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



__________________________________________________
Do You Yahoo!?
Yahoo! Sports - sign up for Fantasy Baseball
http://sports.yahoo.com

В списке pgsql-general по дате отправления:

Предыдущее
От: George Osvald
Дата:
Сообщение: Replication
Следующее
От: Charles
Дата:
Сообщение: backend closed