Merge Join with an Index Optimization

Поиск
Список
Период
Сортировка
От Michael Malis
Тема Merge Join with an Index Optimization
Дата
Msg-id CAFQtOwp4e1hU6gKi1R_nHgOLTzUJko8EcaC1FdAV0rQFo1ARzA@mail.gmail.com
обсуждение исходный текст
Ответы Re: Merge Join with an Index Optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Hi,

As I understand it, a merge join will currently read all tuples from both subqueries (besides early termination). I believe it should be possible to take advantages of the indexes on one or both of the tables being read from to skip a large number of tuples that would currently be read. As an example, let's say we are doing equi merge join between two tables with the following data:

  (3, 4, 5, 9, 10, 11)
  (0, 1, 2, 6, 7, 8)

Currently, even though no tuples would be returned from the merge join, all of the data will be read from both tables. If there are indexes on both tables, pg should be able to execute as follows. Get the tuple with value 3 from the first table and then look up the first value greater than 3 in the second table using the index. In this case, that value would be 6. Since 3 != 6, pg would then look up the lowest value greater than 6 in the first table. This process repeats, and tuples are output whenever the values are equal. This can be thought of as an alternating nested loop join, where the positions in the indexes are maintained. In the case where only one of the tables has an index, this can be thought of as a nested loop join where the inner relation is the table with the index, and the position in that index is maintained while iterating over the outer relation (the outer relation would need to be sorted for this to work).

I can't demonstrate in any benchmarks, but I imagine this would dramatically speed up cases of a merge join between two tables where the number of tuples that satisfy the join condition is small. I don't know the Postgres internals well enough to know if there is anything obvious that would prevent this optimization.

Thanks,
Michael

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jim Nasby
Дата:
Сообщение: Re: feature request: explain "with details" option
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: [PATCH] Alter or rename enum value