Re: Should Oracle outperform PostgreSQL on a complex

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Should Oracle outperform PostgreSQL on a complex
Дата
Msg-id 8627.1134843222@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Should Oracle outperform PostgreSQL on a complex  (Simon Riggs <simon@2ndquadrant.com>)
Ответы Re: Should Oracle outperform PostgreSQL on a complex  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Should Oracle outperform PostgreSQL on a complex  (Mark Kirkwood <markir@paradise.net.nz>)
Re: Should Oracle outperform PostgreSQL on a complex  (Mark Kirkwood <markir@paradise.net.nz>)
Re: Should Oracle outperform PostgreSQL on a complex  (Greg Stark <gsstark@mit.edu>)
Re: Should Oracle outperform PostgreSQL on a complex  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-performance
Simon Riggs <simon@2ndquadrant.com> writes:
> On Fri, 2005-12-16 at 23:28 -0500, Bruce Momjian wrote:
>> How are star joins different from what we do now?

> Methods:
> 1. join all N small tables together in a cartesian product, then join to
> main Large table once (rather than N times)

Of course, the reason the current planner does not think of this is that
it does not consider clauseless joins unless there is no alternative.

However, I submit that it wouldn't pick such a plan anyway, and should
not, because the idea is utterly stupid.  The plan you currently get for
this sort of scenario is typically a nest of hash joins:

                               QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=2.25..4652.25 rows=102400 width=16)
   Hash Cond: ("outer".f1 = "inner".f1)
   ->  Hash Join  (cost=1.12..3115.12 rows=102400 width=12)
         Hash Cond: ("outer".f2 = "inner".f1)
         ->  Seq Scan on fact  (cost=0.00..1578.00 rows=102400 width=8)
         ->  Hash  (cost=1.10..1.10 rows=10 width=4)
               ->  Seq Scan on d2  (cost=0.00..1.10 rows=10 width=4)
   ->  Hash  (cost=1.10..1.10 rows=10 width=4)
         ->  Seq Scan on d1  (cost=0.00..1.10 rows=10 width=4)
(9 rows)

This involves only one scan of the fact table.  As each row is pulled up
through the nest of hash joins, we hash one dimension key and join to
one small table at each level.  This is at worst the same amount of work
as hashing all the keys at once and probing a single cartesian-product
hashtable, probably less work (fewer wasted key-comparisons).  And
definitely less memory.  You do have to keep your eye on the ball that
you don't waste a lot of overhead propagating the row up through
multiple join levels, but we got rid of most of the problem there in
8.1 via the introduction of "virtual tuple slots".  If this isn't fast
enough yet, it'd make more sense to invest effort in further cutting the
executor's join overhead (which of course benefits *all* plan types)
than in trying to make the planner choose a star join.

> 2. transform joins into subselects, then return subselect rows via an
> index bitmap. Joins are performed via a bitmap addition process.

This one might be interesting but it's not clear what you are talking
about.  "Bitmap addition"?

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Should Oracle outperform PostgreSQL on a complex
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Should Oracle outperform PostgreSQL on a complex