> On Thu, 19 Jul 2018 at 21:04, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
>
> > > * Just to clarify - the iterating through all the partitions, is it the best
> > > way of finding matching ranges? Correct me if I'm wrong, but from what I see
> > > in the comments for PartitionBoundInfoData, all the ranges are arranged in
> > > increasing order - maybe then it's possible to use binary search for finding
> > > a matching range?
> >
> > The loop in partition_range_bounds_merge() runs as many times as the
> > maximum of number of datums from the given partition bounds. So the
> > complexity is O(n) where n is the maximum of number of datums. With
> > binary search the complexity will increase to O(nlogn). I might be
> > missing something here.
>
> Now I'm confused even more. Correct me if I'm wrong, but here is what I see
> right now:
>
> * We're trying to solve the standard problem of finding overlapping intervals
> from two sets of intervals
>
> * The current implementation implicitly compares every range from one side of a
> join with every range from another side, which is O(n^2).
It's of course wrong, it's going to be O(max(m, n)) as you said, but
the point is still valid - if we have partitions A1, A2 from one side
and B1, ..., BN on another side, we can skip necessary the
partitions from B that are between e.g. A1 and A2 faster with
binary search.