Thread: Corner cases with GiST n-way splits


Corner cases with GiST n-way splits

From
Heikki Linnakangas
Date:
GiST page splitting has the peculiarity that it sometimes needs to split 
a single page into more than two pages. It happens rarely in practice, 
but it is possible (*). With a bad picksplit function, it happens more often.

While testing with a custom gist opclass with truly evil helper 
functions, I found two corner cases with the current implementation when 
a page is split into many halves:

1. If a page is split into more than 100 pages, you run into the same 
limit of 100 simultaneous lwlocks that Tom Lane reported with a 
pathological intarray index. This time it's not because we hold locks on 
many different levels, but because of a single split.

2. When the root page is split, there is no parent page to update, so we 
just create a new root page with the downlinks. However, when you split 
a page into a lot of siblings, it's possible that all the downlinks 
don't fit on a single page. The code is not prepared for that 
situation: you get an error when it tries to add more downlinks on a 
single page than fit there.

I'm not sure what to do about these. Neither issue is something you'd 
actually bump into in an index that's doing something useful; there 
have been no user complaints about these.


(*) What follows is an explanation of how a page can be split into more 
than two halves, to help you (and me!) understand this:

In a very pathological case, it's possible for a single insertion to 
cause a page to be split into hundreds of pages. Imagine that you have a 
page full of very small tuples (let's imagine that a page can hold 8 
letters, and ignore all tuple overhead for now):

A B C D E F G H

Now you insert one large tuple, DDDD, on the page. The picksplit 
algorithm can choose to split this as:

A - B C D E F G H DDDD

The right side is still too large to fit on a single page, so it is 
iteratively split again:

A - B - C D E F G H DDDD

And again:

A - B - C - D E F G H DDDD

And again:

A - B - C - D - E F G H DDDD

In this example, the page was split into 5 halves, but in reality a page 
can hold many more tuples, and the difference between a small and a 
large tuple can be much greater, so you can end up with many more 
siblings in one split.
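
To see the arithmetic in one place, here's a toy standalone C program 
(not PostgreSQL code) that replays the example above: the page holds 8 
units, A..H are 1 unit each, DDDD is 4 units, and a worst-case 
picksplit always peels off a single tuple:

#include <stdio.h>

#define PAGE_CAPACITY 8         /* a page holds 8 "letters" here */

int
main(void)
{
    /* the tuples from the example: A..H are 1 unit each, DDDD is 4 */
    const char *names[] = {"A", "B", "C", "D", "E", "F", "G", "H", "DDDD"};
    int         sizes[] = {1, 1, 1, 1, 1, 1, 1, 1, 4};
    int         ntuples = 9;
    int         first = 0;      /* first tuple not yet placed on a page */
    int         remaining = 0;
    int         npages = 0;
    int         i;

    for (i = 0; i < ntuples; i++)
        remaining += sizes[i];

    while (first < ntuples)
    {
        if (remaining > PAGE_CAPACITY)
        {
            /* evil picksplit: the left side gets one tuple, and the
             * right side, still too large, is split again */
            printf("page %d: %s\n", ++npages, names[first]);
            remaining -= sizes[first];
            first++;
        }
        else
        {
            /* the rest finally fits on a single page */
            printf("page %d:", ++npages);
            for (i = first; i < ntuples; i++)
                printf(" %s", names[i]);
            printf("\n");
            break;
        }
    }
    printf("%d sibling pages from one insertion\n", npages);
    return 0;
}

It prints the five pages from the example; scale up the page capacity 
and tuple counts and the fan-out grows accordingly.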

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com


Re: Corner cases with GiST n-way splits

From
Alexander Korotkov
Date:
On Thu, May 10, 2012 at 9:14 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
GiST page splitting has the peculiarity that it sometimes needs to split a single page into more than two pages. It happens rarely in practice, but it is possible (*). With a bad picksplit function, it happens more often.

While testing with a custom gist opclass with truly evil helper functions, I found two corner cases with the current implementation when a page is split into many halves:

1. If a page is split into more than 100 pages, you run into the same limit of 100 simultaneous lwlocks that Tom Lane reported with a pathological intarray index. This time it's not because we hold locks on many different levels, but because of a single split.

2. When the root page is split, there is no parent page to update, so we just create a new root page with the downlinks. However, when you split a page into a lot of siblings, it's possible that all the downlinks don't fit on a single page. The code is not prepared for that situation: you get an error when it tries to add more downlinks on a single page than fit there.

I'm not sure what to do about these. Neither issue is something you'd actually bump into in an index that's doing something useful; there have been no user complaints about these.

If such cases are very rare, we could fall back to genericPickSplit when we decide that the user picksplit function's result is bad. We're already doing this when the user picksplit puts all the tuples on one page.
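
For reference, the degenerate-case check we do today amounts to 
something like this (a simplified standalone sketch, not PostgreSQL 
source; the struct only loosely mirrors GIST_SPLITVEC):

#include <stdbool.h>

/* how many tuples the user picksplit assigned to each side */
typedef struct
{
    int         spl_nleft;
    int         spl_nright;
} SplitVec;

/* everything on one side: throw the result away and redo the split
 * with genericPickSplit */
static bool
split_is_degenerate(const SplitVec *v)
{
    return (v->spl_nleft == 0 || v->spl_nright == 0);
}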

GiST can split a page into many pages because of NULLs and multicolumn indexes, independently of the user picksplit function.
Imagine we have the following tuples:

tuple key1  key2  key3  key4  key5  ...
1     value value value value value ...
2     NULL  value value value value ...
3     NULL  NULL  NULL  value value ...
4     NULL  NULL  NULL  NULL  value ...
5     NULL  NULL  NULL  NULL  NULL  ...
......

In this case splitByKey will find only one non-NULL value in the first key and split it off onto a separate page. Then it will do the same thing for the second key, etc. However, this process is limited by INDEX_MAX_KEYS.
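
A standalone toy model of that cascade (not the real splitByKey code; 
the NULL pattern is simplified so that tuple t is non-NULL from column 
t onward) looks like this:

#include <stdio.h>

#define INDEX_MAX_KEYS 32       /* the cap on index columns */

/* simplified NULL pattern: tuple t is non-NULL from column t onward */
static int
is_null(int tuple, int col)
{
    return col < tuple;
}

int
main(void)
{
    int         ntuples = 6;
    int         first = 1;      /* first tuple not yet split off */
    int         npages = 0;
    int         col;
    int         t;

    for (col = 1; col <= INDEX_MAX_KEYS && first < ntuples; col++)
    {
        int         nonnull = 0;

        /* count remaining tuples that are non-NULL in this column */
        for (t = first; t <= ntuples; t++)
            if (!is_null(t, col))
                nonnull++;

        /* with this pattern only tuple 'first' is non-NULL here, so
         * the split peels it off onto a page of its own */
        if (nonnull > 0 && nonnull < ntuples - first + 1)
        {
            printf("column %d splits off tuple %d\n", col, first);
            npages++;
            first++;
        }
    }
    /* the remaining all-NULL tuples share the last page */
    printf("%d pages total for %d tuples\n", npages + 1, ntuples);
    return 0;
}

So the number of pages grows with the number of key columns, but never 
past INDEX_MAX_KEYS + 1.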

BTW, I was thinking that it's not quite right to let the user picksplit decide what split ratio (the ratio of tuples between the smaller and larger pages) is acceptable. We could pass a minimal acceptable ratio to the user picksplit function and call genericPickSplit if the ratio is not met. However, the question arises whether we should measure the ratio by tuple count or by tuple size. A ratio by size seems more desirable, while a ratio by count seems easier for the user picksplit to follow.
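
As a sketch of the idea (not existing code; the struct, the choice of 
requiring both measures at once, and any particular minimum ratio are 
all just illustrative assumptions):

#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    int         nleft;          /* tuples assigned to each side */
    int         nright;
    size_t      size_left;      /* total tuple bytes on each side */
    size_t      size_right;
} SplitProposal;

/* share of the smaller side, by count and by size; the caller would
 * redo the split with genericPickSplit when this returns false
 * (assumes a non-empty split) */
static bool
ratio_acceptable(const SplitProposal *s, double min_ratio)
{
    double      by_count;
    double      by_size;

    by_count = (double) (s->nleft < s->nright ? s->nleft : s->nright)
        / (double) (s->nleft + s->nright);
    by_size = (double) (s->size_left < s->size_right ?
                        s->size_left : s->size_right)
        / (double) (s->size_left + s->size_right);

    /* by-size tracks page fill better; by-count is easier for the
     * user picksplit to guarantee. Requiring both is just one option. */
    return by_count >= min_ratio && by_size >= min_ratio;
}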

------
With best regards,
Alexander Korotkov.

Re: Corner cases with GiST n-way splits

From
Heikki Linnakangas
Date:
On 10.05.2012 21:04, Alexander Korotkov wrote:
> On Thu, May 10, 2012 at 9:14 PM, Heikki Linnakangas<
> heikki.linnakangas@enterprisedb.com>  wrote:
>
>> I found two corner cases with the current implementation when a page is
>> split into many halves:
>>
>> 1. If a page is split into more than 100 pages, you run into the same
>> limit of 100 simultaneous lwlocks that Tom Lane reported with a
>> pathological intarray index. This time it's not because we hold locks on
>> many different levels, but because of a single split.
>>
>> 2. When the root page is split, there is no parent page to update, so we
>> just create a new root page with the downlinks. However, when you split a
>> page into a lot of siblings, it's possible that all the downlinks don't fit
>> on a single page. The code is not prepared for that situation: you get an
>> error when it tries to add more downlinks on a single page than fit there.
>>
>> I'm not sure what to do about these. Neither issue is something you'd
>> actually bump into in an index that's doing something useful; there have
>> been no user complaints about these.
>
> If such cases are very rare, we could fall back to genericPickSplit when
> we decide that the user picksplit function's result is bad. We're already
> doing this when the user picksplit puts all the tuples on one page.

Yeah. We just need to decide when we consider picksplit to be doing such 
a bad job that we fall back to genericPickSplit. Something like, if the 
split produces more than 5 pages, perhaps.
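
In code, the heuristic could be as simple as this sketch (not actual 
code; 5 is the arbitrary threshold from above):

#include <stdbool.h>

#define MAX_PICKSPLIT_PAGES 5   /* arbitrary cutoff */

/* after applying the user picksplit recursively for an n-way split,
 * decide whether to throw the result away and redo the whole split
 * with genericPickSplit */
static bool
picksplit_did_bad_job(int npages_produced)
{
    return npages_produced > MAX_PICKSPLIT_PAGES;
}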

> GiST can split a page into many pages because of NULLs and multicolumn
> indexes, independently of the user picksplit function.
> Imagine we have the following tuples:
>
> tuple key1  key2  key3  key4  key5  ...
> 1     value value value value value ...
> 2     NULL  value value value value ...
> 3     NULL  NULL  NULL  value value ...
> 4     NULL  NULL  NULL  NULL  value ...
> 5     NULL  NULL  NULL  NULL  NULL  ...
> ......
>
> In this case splitByKey will find only one non-NULL value in the first key
> and split it off onto a separate page. Then it will do the same thing for
> the second key, etc. However, this process is limited by INDEX_MAX_KEYS.

Interesting, I didn't realize we handle NULLs like that. INDEX_MAX_KEYS 
is 32, which is less than the 100 lwlock limit, so I think we're safe 
with that.

> BTW, I was thinking that it's not quite right to let the user picksplit
> decide what split ratio (the ratio of tuples between the smaller and
> larger pages) is acceptable.

It's not hard to imagine a case where it really does make sense to split 
a page so that one tuple goes on one page, and all the rest go to 
another. For example, imagine that a page contains 100 identical tuples, 
plus one that's different from all the rest. Now you insert one more 
tuple that's identical to the 100 tuples, and the insert causes a page 
split. It makes sense to split off the single outlier to a page of its 
own, and put all the rest on one page. One more insert will make the 
page split again, but the tree is better organized.

Whether that's too marginal to worry about, and we should enforce a 
split ratio anyway, I'm not sure.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com