Re: GIN improvements part2: fast scan

From: Alexander Korotkov
Subject: Re: GIN improvements part2: fast scan
Date:
Msg-id: CAPpHfduJ-X9WKR9h-vyTrNEVNR37ZDx05qOLkA3ys3j-14uKXA@mail.gmail.com
In response to: Re: GIN improvements part2: fast scan (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses: Re: GIN improvements part2: fast scan (Alexander Korotkov <aekorotkov@gmail.com>)
Re: GIN improvements part2: fast scan (Tomas Vondra <tv@fuzzy.cz>)
List: pgsql-hackers
On Fri, Jun 21, 2013 at 11:43 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 19.06.2013 11:56, Alexander Korotkov wrote:
On Wed, Jun 19, 2013 at 12:49 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

On 19.06.2013 11:30, Alexander Korotkov wrote:

On Wed, Jun 19, 2013 at 11:48 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

  On 18.06.2013 23:59, Alexander Korotkov wrote:

I would like to illustrate that with an example. Imagine you have the fulltext query "rare_term & frequent_term". The frequent term has a large posting tree, while the rare term has only a small posting list containing iptr1, iptr2 and iptr3. At first we get iptr1 from the posting list of the rare term, then we would like to check whether we have to scan the part of the frequent term's posting tree where iptr < iptr1. So we call pre_consistent([false, true]), because we know that the rare term is not present for iptr < iptr1. pre_consistent returns false, so we can start scanning the frequent term's posting tree from iptr1. Similarly we can skip the gaps between iptr1 and iptr2, between iptr2 and iptr3, and from iptr3 to the maximum possible pointer.
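To make that a bit more concrete, here is a rough, self-contained sketch of the skipping idea. It is not the actual GIN code: the arrays, pre_consistent() and the linear "jump ahead" below are simplified stand-ins for the "rare_term & frequent_term" case (the real patch skips whole posting tree pages instead).

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * check[0] = rare term, check[1] = frequent term.  A "true" here only
 * means "may be present" (false positives are allowed).  For the AND
 * query "rare_term & frequent_term" both terms must be present.
 */
static bool
pre_consistent(const bool check[2])
{
    return check[0] && check[1];
}

int
main(void)
{
    uint64_t    rare[] = {100, 5000, 90000};    /* iptr1, iptr2, iptr3 */
    uint64_t    freq[] = {1, 2, 3, 100, 101, 4999, 5000, 90000};
    size_t      nfreq = sizeof(freq) / sizeof(freq[0]);
    size_t      f = 0;

    for (int r = 0; r < 3; r++)
    {
        bool        check[2] = {false, true};   /* rare term absent below rare[r] */

        if (!pre_consistent(check))
        {
            /* no match is possible in the gap: jump ahead in the frequent list */
            while (f < nfreq && freq[f] < rare[r])
                f++;
        }
        if (f < nfreq && freq[f] == rare[r])
            printf("match at %" PRIu64 "\n", freq[f]);
    }
    return 0;
}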


Thanks, now I understand the rare-term & frequent-term problem. Couldn't you do that with the existing consistent function? I don't see why you need the new pre-consistent function for this.


In the case of two entries I can. But in the case of n entries things become more complicated. Imagine you have a "term_1 & term_2 & ... & term_n" query. When you get some item pointer from term_1 you can skip all the lesser item pointers from term_2, term_3 ... term_n. But if all you have for it is the consistent function, you have to call it with the following check arguments:
1) [false, false, false, ... , false]
2) [false, true, false, ... , false]
3) [false, false, true, ... , false]
4) [false, true, true, ..., false]
......
i.e. you have to call it 2^(n-1) times. But if you know the query specifics (i.e. in the opclass) it's typically easy to calculate exactly what we need in a single pass. That's why I introduced pre_consistent.


Hmm. So how does that work with the pre-consistent function? Don't you need to call that 2^(n-1)-1 times as well?


I call pre-consistent once with [false, true, true, ..., true]. Pre-consistent knows that each true passed to it could be a false positive. So, if it returns false it guarantees that consistent will be false for all possible combinations.
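For illustration, a rough sketch of the difference (simplified, made-up names): generic code that does not know the query structure would have to enumerate the combinations of the unknown entries, while the opclass, knowing this is a pure AND query, can answer in one pass by treating every unknown entry as possibly present.

#include <stdbool.h>
#include <stdio.h>

#define N 8                     /* number of query terms */

/* boolean consistent() for "term_1 & term_2 & ... & term_N" */
static bool
consistent(const bool *check)
{
    for (int i = 0; i < N; i++)
        if (!check[i])
            return false;
    return true;
}

/*
 * pre_consistent(): a true in check[] only means "may be present".
 * The opclass knows this is a pure AND query, so one pass is enough:
 * any known-false entry already rules every combination out.
 */
static bool
pre_consistent(const bool *check)
{
    for (int i = 0; i < N; i++)
        if (!check[i])
            return false;
    return true;
}

int
main(void)
{
    bool        check[N] = {false, true, true, true, true, true, true, true};
    int         calls = 0;
    bool        possible = false;

    /* without pre_consistent: try every combination of the N-1 unknown entries */
    for (unsigned mask = 0; mask < (1u << (N - 1)) && !possible; mask++)
    {
        bool        trial[N] = {false};

        for (int i = 1; i < N; i++)
            trial[i] = (mask >> (i - 1)) & 1;
        calls++;
        possible = consistent(trial);
    }
    printf("boolean consistent(): %d calls, match possible = %d\n", calls, possible);
    printf("pre_consistent():     1 call,   match possible = %d\n", pre_consistent(check));
    return 0;
}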

Ok, I see.

I spent some time pondering this. I'd like to find a way to do something about this without requiring another user-defined function. A couple of observations:

1. The profile of that rare-term & frequent-term query, without any patch, looks like this:

28,55%  postgres           ginCompareItemPointers
19,36%  postgres           keyGetItem
15,20%  postgres           scanGetItem
 7,75%  postgres           checkcondition_gin
 6,25%  postgres           gin_tsquery_consistent
 4,34%  postgres           TS_execute
 3,85%  postgres           callConsistentFn
 3,64%  postgres           FunctionCall8Coll
 3,19%  postgres           check_stack_depth
 2,60%  postgres           entryGetNextItem
 1,35%  postgres           entryGetItem
 1,25%  postgres           MemoryContextReset
 1,12%  postgres           MemoryContextSwitchTo
 0,31%  libc-2.17.so       __memcpy_ssse3_back
 0,24%  postgres           collectMatchesForHeapRow

I was quite surprised by seeing ginCompareItemPointers at the top. It turns out that it's relatively expensive to do comparisons in the format we keep item pointers in: packed into 6 bytes, as three int16s. I hacked together a patch to convert ItemPointers into uint64s when dealing with them in memory. That helped quite a bit.
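Roughly, the hack packs the 6-byte TID into a single integer once, so that a comparison becomes one integer compare instead of three int16 comparisons. A simplified sketch follows; PackedTid and the helpers are stand-ins, not the real ItemPointerData definitions from storage/itemptr.h:

#include <stdint.h>

typedef struct
{
    uint16_t    bi_hi;          /* high 16 bits of the block number */
    uint16_t    bi_lo;          /* low 16 bits of the block number */
    uint16_t    ip_posid;       /* offset number within the block */
} PackedTid;

static inline uint64_t
tid_to_uint64(PackedTid t)
{
    uint64_t    block = ((uint64_t) t.bi_hi << 16) | t.bi_lo;

    /*
     * Block number in the high bits, offset in the low 16 bits, so that
     * integer order equals (block, offset) order.
     */
    return (block << 16) | t.ip_posid;
}

static inline int
tid_compare_uint64(PackedTid a, PackedTid b)
{
    uint64_t    ua = tid_to_uint64(a);
    uint64_t    ub = tid_to_uint64(b);

    return (ua > ub) - (ua < ub);   /* -1, 0 or +1 */
}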

Another important thing in the above profile is that calling the consistent function takes up a lot of resources. And in the example test case you gave, it's called with the same arguments every time. Caching the result of the consistent function would be a big win.
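Such a cache could look roughly like this (illustrative only, not the actual patch; the layout and limit are made up): with a modest number of keys the check[] array can be encoded as a bitmask, so repeated calls with the same arguments become a table lookup.

#include <stdbool.h>
#include <stdint.h>

#define MAXCACHEDKEYS 8

typedef struct
{
    bool        valid[1 << MAXCACHEDKEYS];      /* combination computed yet? */
    bool        result[1 << MAXCACHEDKEYS];     /* remembered consistent() result */
} ConsistentCache;

static bool
cached_consistent(ConsistentCache *cache, const bool *check, int nkeys,
                  bool (*consistent) (const bool *check, int nkeys))
{
    uint32_t    key = 0;

    /* fall back to a direct call when there are too many keys to encode */
    if (nkeys > MAXCACHEDKEYS)
        return consistent(check, nkeys);

    for (int i = 0; i < nkeys; i++)
        if (check[i])
            key |= 1u << i;

    if (!cache->valid[key])
    {
        cache->result[key] = consistent(check, nkeys);
        cache->valid[key] = true;
    }
    return cache->result[key];
}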

I wrote a quick patch to do that caching, and together with the item-pointer hack, I was able to halve the runtime of that test case. That's impressive and we probably should pursue that low-hanging fruit, but it's still slower than your "fast scan" patch by a factor of 100x. So clearly we do need an algorithmic improvement here, along the lines of your patch or something similar.

For sure, many advantages can be achieved without "fast scan". For example, sphinx is known to be fast, but it scans each posting list straightforwardly, just like GIN does now.
 
2. There's one trick we could do even without the pre-consistent function, that would help the particular test case you gave. Suppose that you have two search terms A and B. If you have just called consistent on a row that matched term A, but not term B, you can skip any subsequent rows in the scan that match A but not B. That means that you can skip over to the next row that matches B. This is essentially the same thing you do with the pre-consistent function, it's just a degenerate case of it. That helps as long as the search contains only one frequent term, but if it contains multiple, then you still have to stop at every row that matches more than one of the frequent terms.
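A rough sketch of that trick over two sorted streams of packed item pointers (illustrative only, not the real GIN scan code; tail handling after one stream runs out is omitted):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* advance a sorted stream of packed TIDs to the first item >= bound */
static size_t
advance_to(const uint64_t *items, size_t n, size_t pos, uint64_t bound)
{
    while (pos < n && items[pos] < bound)
        pos++;
    return pos;
}

/*
 * Merge-style scan over the streams of terms A (frequent) and B (rare).
 * Once consistent() has failed for an "A only" row, every later A-only
 * row must fail the same way, so A jumps straight to B's current item.
 */
static void
scan(const uint64_t *a, size_t na, const uint64_t *b, size_t nb,
     bool (*consistent_ab) (bool has_a, bool has_b),
     void (*emit) (uint64_t item))
{
    size_t      ai = 0,
                bi = 0;

    while (ai < na && bi < nb)
    {
        if (a[ai] == b[bi])
        {
            if (consistent_ab(true, true))
                emit(a[ai]);
            ai++;
            bi++;
        }
        else if (a[ai] < b[bi])
        {
            if (consistent_ab(true, false))
            {
                emit(a[ai]);
                ai++;
            }
            else
                ai = advance_to(a, na, ai, b[bi]);  /* skip the A-only rows */
        }
        else
        {
            if (consistent_ab(false, true))
            {
                emit(b[bi]);
                bi++;
            }
            else
                bi = advance_to(b, nb, bi, a[ai]);  /* skip the B-only rows */
        }
    }
}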

Yes, in the two-term case the approaches coincide and there is no direct need for preConsistent.
 
3. I'm still not totally convinced that we shouldn't just build the "truth table" by calling the regular consistent function with all the combinations of matching keys, as discussed above. I think that would work pretty well in practice, for queries with up to 5-10 frequent terms. Per the profiling, it probably would make sense to pre-compute such a table anyway, to avoid calling consistent repeatedly.
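For illustration, such a precomputed table could look roughly like this (names and limits are made up; the caller provides a table with 2^nfreq entries):

#include <stdbool.h>
#include <stdint.h>

#define MAXFREQKEYS 10          /* 2^10 = 1024 combinations at most */

/*
 * Call the regular boolean consistent() once per combination of the
 * frequent keys up front; the scan itself then only does lookups.
 */
static void
build_truth_table(bool *table, int nfreq,
                  bool (*consistent) (const bool *check, int nkeys))
{
    bool        check[MAXFREQKEYS];     /* assumes nfreq <= MAXFREQKEYS */

    for (uint32_t mask = 0; mask < (1u << nfreq); mask++)
    {
        for (int i = 0; i < nfreq; i++)
            check[i] = (mask >> i) & 1;
        table[mask] = consistent(check, nfreq);
    }
}

/* during the scan: encode which frequent keys match the current row into
 * a bitmask and look the answer up */
static inline bool
truth_table_lookup(const bool *table, uint32_t mask)
{
    return table[mask];
}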

Why do you mention 5-10 _frequent_ items? If we have 5-10 frequent items and 20 rare items, we would have to create a "truth table" of the frequent items for each new combination of rare items. For 20 rare items the truth combinations can be unique for each item pointer; in that case you would have to calculate a small "truth table" of the frequent items for each item pointer, and then it can turn out not to be so small after all. I mean, it seems to me that we should take both "frequent" and "rare" items into account when talking about the "truth table".
 
4. If we do go with a new function, I'd like to just call it "consistent" (or consistent2 or something, to keep it separate from the old consistent function), and pass it a tri-state input for each search term. It might not be any different for the full-text search implementation, or any of the other ones for that matter, but I think it would be a more understandable API.

An understandable API makes sense. But for now, I can't see even a potential use of the third state (exact false). Also, with the preConsistent interface as is, in some cases we can use the old consistent method as both consistent and preConsistent, namely when it implements a monotone boolean function. For example, that is the case for the consistent function of the array opclasses.
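For illustration, a shim along these lines could do it (simplified sketch; the tri-state values are made up for the sketch and nkeys is assumed small): when consistent() is monotone in its check[] inputs, i.e. turning an input from false to true can never turn the result from true to false, calling it with every unknown entry set to true is a valid preConsistent, because a false answer then rules out every combination.

#include <stdbool.h>

typedef enum
{
    ENTRY_FALSE,                /* known not to match */
    ENTRY_TRUE,                 /* known to match */
    ENTRY_MAYBE                 /* unknown; may be a false positive */
} EntryState;

#define MAXSKETCHKEYS 32        /* the sketch assumes nkeys <= MAXSKETCHKEYS */

static bool
pre_consistent_from_monotone(const EntryState *entries, int nkeys,
                             bool (*consistent) (const bool *check, int nkeys))
{
    bool        check[MAXSKETCHKEYS];

    for (int i = 0; i < nkeys; i++)
        check[i] = (entries[i] != ENTRY_FALSE);     /* treat MAYBE as true */

    return consistent(check, nkeys);
}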

A revised version of the patch is attached, with more comments and docs.

------
With best regards,
Alexander Korotkov.
Attachments
