Re: Postgresql 8.1.4 - performance issues for select on

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Postgresql 8.1.4 - performance issues for select on
Дата
Msg-id 1161212749.31645.285.camel@dogma.v10.wvs
обсуждение исходный текст
Ответ на Re: Postgresql 8.1.4 - performance issues for select on  ("Jim C. Nasby" <jim@nasby.net>)
Список pgsql-performance
On Wed, 2006-10-18 at 17:35 -0500, Jim C. Nasby wrote:
> On Wed, Oct 18, 2006 at 03:32:15PM -0700, Jeff Davis wrote:
> > On Wed, 2006-10-18 at 17:10 -0500, Jim C. Nasby wrote:
> > > Sorry, don't have the earlier part of this thread, but what about...
> > >
> > > SELECT greatest(max(a), max(b)) ...
> > >
> > > ?
> >
> > To fill you in, we're trying to get the max of a union (a view across
> > two physical tables).
>
> UNION or UNION ALL? You definitely don't want to do a plain UNION if you
> can possibly avoid it.

Oops, of course he must be doing UNION ALL, but for some reason I ran my
test queries with plain UNION (thanks for reminding me). However, it
didn't make a difference, see below.

> > It can be done if you're creative with the query; I suggested a query
> > that selected the max of the max()es of the individual tables. Your
> > query could work too. However, the trick would be getting postgresql to
> > recognize that it can transform "SELECT max(x) FROM foo" into that,
> > where foo is a view of a union.
> >
> > If PostgreSQL could sort the result of a union by merging the results of
> > two index scans, I think the problem would be solved. Is there something
> > preventing this, or is it just something that needs to be added to the
> > planner?
>
> Hrm... it'd be worth trying the old ORDER BY ... LIMIT 1 trick just to
> see if that worked in this case, but I don't have much hope for that.

Yeah, that's the solution. Here's the problem:

=> set enable_seqscan = false;
SET
=> EXPLAIN SELECT i FROM (SELECT i FROM t10 UNION ALL SELECT i FROM t11)
t ORDER BY i DESC;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Sort  (cost=200026772.96..200027272.96 rows=200000 width=4)
   Sort Key: t.i
   ->  Append  (cost=100000000.00..200004882.00 rows=200000 width=4)
         ->  Seq Scan on t10  (cost=100000000.00..100001441.00
rows=100000 width=4)
         ->  Seq Scan on t11  (cost=100000000.00..100001441.00
rows=100000 width=4)
(5 rows)

=> EXPLAIN SELECT i FROM (SELECT i FROM t10) t ORDER BY i DESC;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan Backward using t10_idx on t10  (cost=0.00..1762.00
rows=100000 width=4)
(1 row)

=> EXPLAIN SELECT i FROM (SELECT i FROM t11) t ORDER BY i DESC;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan Backward using t11_idx on t11  (cost=0.00..1762.00
rows=100000 width=4)
(1 row)

=>

But if PostgreSQL could just merge the index scan results, it could
"ORDER BY i" the result of a UNION ALL without a problem. But it can't
do that, so the syntactical trick introduced for min/max won't work in
his case :(

He'll probably have to change his application to make that query perform
decently if the tables are split.

Ideas?

Regards,
    Jeff Davis


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Postgresql 8.1.4 - performance issues for select on
Следующее
От: Mark Kirkwood
Дата:
Сообщение: Re: measuring shared memory usage on Windows