Re: GiST subsplit question

From: Alexander Korotkov
Subject: Re: GiST subsplit question
Msg-id: CAPpHfdsfRpuCAJzJpyW95yvqycqp1zdQ5rktW2Vmj3ryeBu49g@mail.gmail.com
In response to: GiST subsplit question  (Jeff Davis <pgsql@j-davis.com>)
Responses: Re: GiST subsplit question  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Wed, May 30, 2012 at 11:21 PM, Jeff Davis <pgsql@j-davis.com> wrote:
I looked for the follow-up commit to support subsplit in the contrib
modules, figuring that would answer some questions, but I couldn't find
it.

The part that's confusing me is that the commit message says: "pickSplit
should set spl_(l|r)datum_exists to 'false'", but I don't see any
picksplit method that actually does that in contrib, nor in the sample
in the docs.
 
The only picksplit implementation I know of that supports secondary split is fallbackSplit in gistproc.c :). I didn't understand how secondary split works until I got deep into the GiST code.

The code in that area is a bit difficult to follow, so it's not obvious
to me exactly what is supposed to happen.

When GiST splits index tuples by the first column, it checks whether some tuples could be placed equally well into either group. If so, it calls picksplit for the second column with only those tuples. But we already know that some second-column keys must go to particular groups, because some tuples have already been placed unambiguously. GiST computes the union of those second-column keys for each group and passes the unions as ldatum and rdatum to the second column's picksplit function. To indicate such a secondary split, GiST sets the ldatum_exists and rdatum_exists flags.

If the picksplit function supports secondary split, it should merge the given keys into the ldatum and rdatum it produces and clear the ldatum_exists and rdatum_exists flags. If it doesn't support secondary split, it leaves the flags as they are, and GiST itself decides how to join the picksplit result with the existing groups, using the penalty function.

It is also possible that only one of the ldatum_exists and rdatum_exists flags is set. That means there are no non-null second-column keys that unambiguously belong to the group whose flag isn't set. In that case a picksplit function that supports secondary split can form that side's ldatum or rdatum without any constraints.
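
To make the contract above concrete, here is a minimal toy sketch in plain C. It is not PostgreSQL code: the Interval key type, the ToySplit struct (loosely modeled on the split vector, with shortened field names) and toy_picksplit are all hypothetical, and the greedy "least enlargement" assignment is just a stand-in for whatever a real picksplit does. The only point is the handling of the flags: pre-existing unions are used as the starting point for each side, and both flags are cleared on return.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_KEYS 16

/* Hypothetical toy key: a closed 1-D interval (not a PostgreSQL type). */
typedef struct { double lo, hi; } Interval;

/* Hypothetical, simplified stand-in for the split vector. */
typedef struct {
    Interval ldatum; bool ldatum_exists;   /* union already fixed on the left  */
    Interval rdatum; bool rdatum_exists;   /* union already fixed on the right */
    int left[TOY_MAX_KEYS];  int nleft;    /* indexes of keys sent left  */
    int right[TOY_MAX_KEYS]; int nright;   /* indexes of keys sent right */
} ToySplit;

static Interval interval_union(Interval a, Interval b)
{
    Interval u;
    u.lo = a.lo < b.lo ? a.lo : b.lo;
    u.hi = a.hi > b.hi ? a.hi : b.hi;
    return u;
}

/* How much the union u would have to grow to also cover key k. */
static double enlargement(Interval u, Interval k)
{
    Interval g = interval_union(u, k);
    return (g.hi - g.lo) - (u.hi - u.lo);
}

/* Toy picksplit that supports secondary split. */
void toy_picksplit(const Interval *keys, int n, ToySplit *v)
{
    Interval l = {0, 0}, r = {0, 0};
    bool have_l = v->ldatum_exists;    /* is the left side already seeded?  */
    bool have_r = v->rdatum_exists;    /* is the right side already seeded? */

    assert(n <= TOY_MAX_KEYS);
    if (have_l) l = v->ldatum;         /* start from the pre-existing unions */
    if (have_r) r = v->rdatum;
    v->nleft = v->nright = 0;

    for (int i = 0; i < n; i++) {
        /* A side with no union yet is unconstrained: joining it costs 0. */
        double cost_l = have_l ? enlargement(l, keys[i]) : 0.0;
        double cost_r = have_r ? enlargement(r, keys[i]) : 0.0;

        if (cost_l <= cost_r) {
            l = have_l ? interval_union(l, keys[i]) : keys[i];
            have_l = true;
            v->left[v->nleft++] = i;
        } else {
            r = have_r ? interval_union(r, keys[i]) : keys[i];
            have_r = true;
            v->right[v->nright++] = i;
        }
    }

    /* The returned unions already include the pre-existing keys ... */
    v->ldatum = l;
    v->rdatum = r;
    /* ... so clear the flags to tell the caller the merge has been done. */
    v->ldatum_exists = false;
    v->rdatum_exists = false;
}
```

A picksplit without secondary-split support would skip the two "start from the pre-existing unions" lines and leave the flags untouched, in which case the caller has to reconcile the result with the existing groups itself.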
 
So, do we demote that message to a DEBUG1? Or do we make it more clear
what the authors of a specific picksplit are supposed to do to avoid
that problem? Or am I misunderstanding something?

+1 for demoting the message to DEBUG1. I think it shouldn't be so noisy; it just indicates that something could be improved.
Also, I think we definitely need to document secondary split. Right now there is no chance of understanding it without reverse engineering it from the code.

------
With best regards,
Alexander Korotkov.
