Re: small table left outer join big table

Поиск
Список
Период
Сортировка
От Gurjeet Singh
Тема Re: small table left outer join big table
Дата
Msg-id AANLkTi=_==09etYitq0Aiysa2BPScWZcq4WSq=mwm-3t@mail.gmail.com
обсуждение исходный текст
Ответ на small table left outer join big table  (Jie Li <jay23jack@gmail.com>)
Список pgsql-hackers
On Tue, Dec 28, 2010 at 5:13 AM, Jie Li <jay23jack@gmail.com> wrote:
Hi,

Please see the following plan:

postgres=# explain select * from small_table left outer join big_table using (id);
                                 QUERY PLAN                                
----------------------------------------------------------------------------
 Hash Left Join  (cost=126408.00..142436.98 rows=371 width=12)
   Hash Cond: (small_table.id = big_table.id)
   ->  Seq Scan on small_table  (cost=0.00..1.09 rows=9 width=8)
   ->  Hash  (cost=59142.00..59142.00 rows=4100000 width=8)
         ->  Seq Scan on big_table  (cost=0.00..59142.00 rows=4100000 width=8)
(5 rows)

Here I have a puzzle, why not choose the small table to build hash table? It can avoid multiple batches thus save significant I/O cost, isn't it?

We can perform this query in two phases:
1) inner join, using the small table to build hash table.
2) check whether each tuple in the hash table has matches before, which can be done with another flag bit

The only compromise is the output order, due to the two separate phases. Not sure whether the SQL standard requires it.


SQL standard does not require the result to be in any particular order unless an ORDER BY is used.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andy Colson
Дата:
Сообщение: page compression
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Libpq PGRES_COPY_BOTH - version compatibility