Re: GIN improvements part 1: additional information

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: GIN improvements part 1: additional information
Дата
Msg-id 52792FA4.3010508@vmware.com
обсуждение исходный текст
Ответ на Re: GIN improvements part 1: additional information  (Alexander Korotkov <aekorotkov@gmail.com>)
Ответы Re: GIN improvements part 1: additional information  (Alexander Korotkov <aekorotkov@gmail.com>)
Список pgsql-hackers
On 04.11.2013 23:44, Alexander Korotkov wrote:
> On Mon, Oct 21, 2013 at 11:12 PM, Alexander Korotkov
> <aekorotkov@gmail.com>wrote:
>
>> Attached version of patch is debugged. I would like to note that number of
>> bugs was low and it wasn't very hard to debug. I've rerun tests on it. You
>> can see that numbers are improved as the result of your refactoring.
>>
>>           event         |     period
>> -----------------------+-----------------
>>   index_build           | 00:01:45.416822
>>   index_build_recovery  | 00:00:06
>>   index_update          | 00:05:17.263606
>>   index_update_recovery | 00:01:22
>>   search_new            | 00:24:07.706482
>>   search_updated        | 00:26:25.364494
>> (6 rows)
>>
>>       label      | blocks_mark
>> ----------------+-------------
>>   search_new     |   847587636
>>   search_updated |   881210486
>> (2 rows)
>>
>>       label     |   size
>> ---------------+-----------
>>   new           | 419299328
>>   after_updates | 715915264
>> (2 rows)
>>
>> Beside simple bugs, there was a subtle bug in incremental item indexes
>> update. I've added some more comments including ASCII picture about how
>> item indexes are shifted.
>>
>> Now, I'm trying to implement support of old page format. Then we can
>> decide which approach to use.
>>
>
> Attached version of patch has support of old page format. Patch still needs
> more documentation and probably refactoring, but I believe idea is clear
> and it can be reviewed. In the patch I have to revert change of null
> category placement for compatibility with old posting list format.

Thanks, just glanced at this quickly.

If I'm reading the patch correctly, old-style non-compressed posting 
tree leaf pages are not vacuumed at all; that's clearly not right.

I argued earlier that we don't need to handle the case that compressing 
a page makes it larger, so that the existing items don't fit on the page 
anymore. I'm having some second thoughts on that; I didn't take into 
account the fact that the "mini-index" in the new page format takes up 
some space. I think it's still highly unlikely that there isn't enough 
space on a 8k page, but it's not totally crazy with a non-standard small 
page size. So at least that situation needs to be checked for with an 
ereport(), rather than Assert.

To handle splitting a non-compressed page, it seems that you're relying 
on the fact that before splitting, we try to insert, which compresses 
the page. The problem with that is that it's not correctly WAL-logged. 
The split record that follows includes a full copy of both page halves, 
but if the split fails for some reason, e.g you run out of disk space, 
there is no WAL record at all of the the compression. I'd suggest doing 
the compression in the insert phase on a temporary copy of the page, 
leaving the original page untouched if there isn't enough space for the 
insertion to complete. (You could argue that this case can't happen 
because the compression must always create enough space to insert one 
more item. maybe so, but at least there should be an explicit check for 
that.)

- Heikki



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Claudio Freire
Дата:
Сообщение: Re: Fast insertion indexes: why no developments
Следующее
От: Leonardo Francalanci
Дата:
Сообщение: Re: Fast insertion indexes: why no developments