Re: GIN improvements part2: fast scan

From: Heikki Linnakangas
Subject: Re: GIN improvements part2: fast scan
Msg-id: 53208B4D.5000806@vmware.com
In response to: Re: GIN improvements part2: fast scan  (Tomas Vondra <tv@fuzzy.cz>)
Responses: Re: GIN improvements part2: fast scan  (Alexander Korotkov <aekorotkov@gmail.com>)
           Re: GIN improvements part2: fast scan  (Thom Brown <thom@linux.com>)
List: pgsql-hackers

On 03/12/2014 12:09 AM, Tomas Vondra wrote:
> Hi all,
>
> a quick question that just occurred to me - do you plan to tweak the cost
> estimation for GIN indexes, in this patch?
>
> IMHO it would be appropriate, given the improvements and gains, but it
> seems to me gincostestimate() was not touched by this patch.

Good point. We have made two major changes to GIN in this release cycle: 
changed the data page format and made it possible to skip items without 
fetching all the keys ("fast scan"). gincostestimate doesn't know about 
either change.

Adjusting gincostestimate for the more compact data page format seems 
easy. When I hacked on that, I assumed all along that gincostestimate 
doesn't need to be changed as the index will just be smaller, which will 
be taken into account automatically. But now that I look at 
gincostestimate, it assumes that the size of one item on a posting tree 
page is a constant 6 bytes (SizeOfIptrData), which is no longer true. 
I'll go fix that.
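
To illustrate why the constant matters, here is a rough sketch (not the actual gincostestimate code) of how a page-count estimate shifts once the per-item size is derived from the index rather than fixed at 6 bytes. The function names and the idea of passing in a measured data size are assumptions made for illustration, and page header overhead is ignored.

```python
import math

BLCKSZ = 8192            # default PostgreSQL block size
SIZE_OF_IPTR_DATA = 6    # bytes per uncompressed item pointer, as stated above


def posting_pages(n_items, bytes_per_item, usable_bytes=BLCKSZ):
    """Estimate how many pages n_items occupy at a given average item size.

    Simplification: the whole page is treated as usable space.
    """
    if n_items <= 0:
        return 0
    items_per_page = max(1, int(usable_bytes // bytes_per_item))
    return math.ceil(n_items / items_per_page)


def measured_bytes_per_item(data_bytes, n_items):
    """Average item size from (hypothetical) index statistics.

    Falls back to the old constant when there is nothing to measure.
    """
    if n_items <= 0:
        return float(SIZE_OF_IPTR_DATA)
    return data_bytes / n_items


# Old assumption: every item costs 6 bytes.
old_pages = posting_pages(1_000_000, SIZE_OF_IPTR_DATA)

# New format: suppose (illustrative figure only) the items average 2 bytes.
new_pages = posting_pages(1_000_000, measured_bytes_per_item(2_000_000, 1_000_000))
```

With a fixed 6-byte item the estimate overstates the number of posting tree pages whenever the compact format packs items more tightly, so the estimated scan cost comes out too high.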

Adjusting for the effects of skipping is harder. gincostestimate needs 
to do the same preparation steps as startScanKey: sort the query keys by 
frequency, and call the consistent function to split the keys into 
"required" and "additional" sets. And then model that the "additional" 
entries only need to be fetched when the other keys match. That's doable 
in principle, but requires a bunch of extra code.
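
As a rough sketch of what such modelling could look like (a toy model, not the startScanKey or gincostestimate code): sort the keys rarest first, probe a consistent function to find the smallest prefix of keys without which no item can match, and charge the remaining keys only for the candidates that the required keys produce. The two-valued FALSE/MAYBE logic, both consistent functions, and the cost formula are all simplifying assumptions.

```python
FALSE, MAYBE = 0, 1


def and_consistent(states):
    """Hypothetical consistent function: every key must match."""
    return FALSE if FALSE in states else MAYBE


def or_consistent(states):
    """Hypothetical consistent function: any one key may match."""
    return MAYBE if MAYBE in states else FALSE


def split_keys(freqs, consistent):
    """Split key indexes into (required, additional), rarest keys first.

    A prefix of the sorted keys is "required" once marking all of them
    FALSE (and the rest MAYBE) makes the consistent function return FALSE.
    """
    order = sorted(range(len(freqs)), key=lambda i: freqs[i])
    n = len(order)
    required_count = n  # if no prefix suffices, every key is required
    for i in range(n - 1):
        states = [MAYBE] * n
        for j in order[: i + 1]:
            states[j] = FALSE
        if consistent(states) == FALSE:
            required_count = i + 1
            break
    return order[:required_count], order[required_count:]


def estimate_items_fetched(freqs, consistent):
    """Assumed cost model: required keys are read in full; each additional
    key is probed at most once per candidate from the required keys."""
    required, additional = split_keys(freqs, consistent)
    candidates = sum(freqs[i] for i in required)
    return candidates + sum(min(freqs[i], candidates) for i in additional)


freqs = [1_000_000, 10, 5_000]  # illustrative per-key item counts
fast = estimate_items_fetched(freqs, and_consistent)
full = estimate_items_fetched(freqs, or_consistent)
```

In this toy model an all-keys-must-match query is driven by the rarest key alone, so the frequent keys contribute almost nothing to the estimate, while an any-key query still has to read every posting list in full.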

Alexander, any thoughts on that? It's getting awfully late to add new 
code for that, but it sure would be nice to somehow take fast scan into 
account.

- Heikki


