Re: GIN improvements part2: fast scan

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: GIN improvements part2: fast scan
Дата
Msg-id CAPpHfdtt=OAhg0+KtNX=thwpH+JeF=RhkD60u9pdjcrnPYDFTA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
Список pgsql-hackers
On Wed, Jun 19, 2013 at 12:30 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 18.06.2013 23:59, Alexander Korotkov wrote:
I would like to illustrate that on example. Imagine you have fulltext query
"rare_term&  frequent_term". Frequent term has large posting tree while

rare term has only small posting list containing iptr1, iptr2 and iptr3. At
first we get iptr1 from posting list of rare term, then we would like to
check whether we have to scan part of frequent term posting tree where iptr
<  iptr1. So we call pre_consistent([false, true]), because we know that
rare term is not present for iptr<  iptr2. pre_consistent returns false. So
we can start scanning frequent term posting tree from iptr1. Similarly we
can skip lags between iptr1 and iptr2, iptr2 and iptr3, from iptr3 to
maximum possible pointer.

Thanks, now I understand the rare-term & frequent-term problem. Couldn't you do that with the existing consistent function? I don't see why you need the new pre-consistent function for this.

In the case of two entries I can. But in the case of n entries things becomes more complicated. Imagine you have "term_1 & term_2 & ... & term_n" query. When you get some item pointer from term_1 you can skip all the lesser item pointers from term_2, term_3 ... term_n. But if all you have for it is consistent function you have to call it with following check arguments: 
1) [false, false, false, ... , false]
2) [false, true, false, ... , false]
3) [false, false, true, ... , false]
4) [false, true, true, ..., false]
......
i.e. you have to call it 2^(n-1) times.

To be precise you don't need the first check argument I listed. So, it's 2^(n-1)-1 calls.

------
With best regards,
Alexander Korotkov.  

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: GIN improvements part2: fast scan
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: GIN improvements part2: fast scan