Re: GIN improvements part2: fast scan

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: GIN improvements part2: fast scan
Дата
Msg-id 51C170A8.9060509@vmware.com
обсуждение исходный текст
Ответ на Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
Ответы Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
Список pgsql-hackers
On 19.06.2013 11:30, Alexander Korotkov wrote:
> On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas<
> hlinnakangas@vmware.com>  wrote:
>
>> On 18.06.2013 23:59, Alexander Korotkov wrote:
>>
>>> I would like to illustrate that on example. Imagine you have fulltext
>>> query
>>> "rare_term&   frequent_term". Frequent term has large posting tree while
>>>
>>> rare term has only small posting list containing iptr1, iptr2 and iptr3.
>>> At
>>> first we get iptr1 from posting list of rare term, then we would like to
>>> check whether we have to scan part of frequent term posting tree where
>>> iptr
>>> <   iptr1. So we call pre_consistent([false, true]), because we know that
>>> rare term is not present for iptr<   iptr2. pre_consistent returns false.
>>> So
>>> we can start scanning frequent term posting tree from iptr1. Similarly we
>>> can skip lags between iptr1 and iptr2, iptr2 and iptr3, from iptr3 to
>>> maximum possible pointer.
>>>
>>
>> Thanks, now I understand the rare-term&  frequent-term problem. Couldn't
>> you do that with the existing consistent function? I don't see why you need
>> the new pre-consistent function for this.
>
> In the case of two entries I can. But in the case of n entries things
> becomes more complicated. Imagine you have "term_1&  term_2&  ...&  term_n"
> query. When you get some item pointer from term_1 you can skip all the
> lesser item pointers from term_2, term_3 ... term_n. But if all you have
> for it is consistent function you have to call it with following check
> arguments:
> 1) [false, false, false, ... , false]
> 2) [false, true, false, ... , false]
> 3) [false, false, true, ... , false]
> 4) [false, true, true, ..., false]
> ......
> i.e. you have to call it 2^(n-1) times. But if you know the query specific
> (i.e. in opclass) it's typically easy to calculate exactly what we need in
> single pass. That's why I introduced pre_consistent.

Hmm. So how does that work with the pre-consistent function? Don't you 
need to call that 2^(n-1)-1 times as well?

- Heikki



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: GIN improvements part2: fast scan
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: GIN improvements part2: fast scan