Re: GSoC 2015 proposal. Bitmap Index-only Count

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: GSoC 2015 proposal. Bitmap Index-only Count
Дата
Msg-id CAPpHfds=nurYJewEcSvv=Baru3QPct3Hrc7zqSa03VE_UZbgNA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: GSoC 2015 proposal. Bitmap Index-only Count  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Wed, Mar 25, 2015 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Anastasia Lubennikova <lubennikovaav@gmail.com> writes:
> 2015-03-24 18:01 GMT+04:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> I wonder whether it'd be possible to teach GIN to support index_getnext
>> instead.  Initially it would probably work only for cases where the
>> index didn't have to return any columns ... but if we did it, maybe the
>> door would be open to cases where GIN could reconstruct actual values.

> Another idea is to write index_getnext() for GIN which would return some
> fake tuple. Since there is no difference for COUNT aggregate what the tuple
> contains. COUNT just wants to know whether we have tuple that satisfy the
> qual.

Well, yeah, that would be the idea (at least initially).  You don't have
to return any real data unless you claim you can do so via amcanreturn.
The planner is still capable of selecting an index-only scan as long as
the query retrieves no columns.

The trick would be to not return the same heap TID more than once per
scan.  A zero-order implementation would be to construct the same bitmap
we do now and then just provide a gingetnext function that scans through
that.  That would be pretty awful in terms of scan startup time, so doing
better would be nice; but perhaps it would be useful even in that form.

My ideal picture for FTS using GIN looks like this:

1) Have lexemes offsets in GIN stored with item pointers.
2) Calculate relevance using only GIN information without using heap.
3) Sort results by relevance in either GIN itself or executor node.
4) Get both TOP-N most relevant rows and total rows count (using index-only scan) from single GIN index scan.

Implementing index_getnext() for GIN looks step forward for me because it allows "index only count" and potentially could be used for ordered output. However, it's unclear for me if it's feasible to have #4? Could we return TOP-N results and total count from single GIN index scan?

------
With best regards,
Alexander Korotkov.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabrízio de Royes Mello
Дата:
Сообщение: Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
Следующее
От: Dima Ivanovskiy
Дата:
Сообщение: Re[2]: [HACKERS] GSoC 2015: SP-GIST for geometrical objects