Re: 9.3 Pre-proposal: Range Merge Join

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: 9.3 Pre-proposal: Range Merge Join
Дата
Msg-id 1334598230.19915.16.camel@jdavis
обсуждение исходный текст
Ответ на Re: 9.3 Pre-proposal: Range Merge Join  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: 9.3 Pre-proposal: Range Merge Join
Список pgsql-hackers
On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> >  1. Order the ranges on both sides by the lower bound, then upper bound.
> > Empty ranges can be excluded entirely.
> >  2. Left := first range on left, Right := first range on right
> >  3. If Left or Right is empty, terminate.
> >  4. If lower(Left) > upper(Right), discard Right, goto 2
> >  5. If lower(Right) > upper(Left), discard Left, goto 2
> >  6. return (Left, Right) as joined tuple
> >  7. Right := next range on right
> >  8. goto 3
> 
> This is surely not correct in detail.  As written, it will be impossible
> for any tuple on the right side to be joined to more than one left-side
> tuple.  You will need something analogous to the mark-and-restore
> rewinding logic in standard mergejoin to make this work.

Every time you discard a tuple on the left, you go to step 2, which
rewinds the right side back to the first non-discarded tuple.

So, implemented using mark and restore:
 * start off with the marks at the first tuple on each side * "discard" means move the mark down a tuple * setting it
backto the first range means restoring to the mark * going to the next range (step 7) just means getting another
tuple,without changing the mark
 

Only one side really needs the mark and restore logic, but it was easier
to write the pseudocode in a symmetrical way (except step 7).

> I will note that mergejoin is not only one of the more complicated
> executor modules, but it accounts for a huge percentage of the planner's
> complexity; which doesn't exactly make me sympathize with the notion of
> let's-copy-and-paste-all-that.  It'd be a lot better if we could build
> it as a small tweak to mergejoin rather than an independent thing.
> 
> (Having said that, I have no idea how the planner support could work
> at all, because its treatment of mergejoins is intimately tied to
> mergejoinable equality and EquivalenceClasses and PathKeys, which don't
> seem readily extensible to the range situation.)

I think this is the more important problem area. It's clear that I'll
need to do some more investigation into the planning.

Regards,Jeff Davis



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Следующее
От: Simon Riggs
Дата:
Сообщение: Re: index-only scans vs. Hot Standby, round two