Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From Anastasia Lubennikova
Subject Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date
Msg-id cf36affc-94b4-9331-0886-21b7d1d08a3c@postgrespro.ru
In response to Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
24.09.2019 3:13, Peter Geoghegan wrote:
> On Wed, Sep 18, 2019 at 7:25 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> I attach version 16. This revision merges your recent work on WAL
>> logging with my recent work on simplifying _bt_dedup_one_page(). See
>> my e-mail from earlier today for details.
> I attach version 17. This version has changes that are focussed on
> further polishing certain things, including fixing some minor bugs. It
> seemed worth creating a new version for that. (I didn't get very far
> with the space utilization stuff I talked about, so no changes there.)
Attached is v18. In this version _bt_dedup_one_page() is refactored so that:
- no temp page is used; all updates are applied to the original page.
- each posting tuple is WAL-logged separately.
This also made it possible to simplify btree_xlog_dedup significantly.

> Another infrastructure thing that the patch needs to handle to be committable:
>
> We still haven't added an "off" switch to deduplication, which seems
> necessary. I suppose that this should look like GIN's "fastupdate"
> storage parameter. It's not obvious how to do this in a way that's
> easy to work with, though. Maybe we could do something like copy GIN's
> GinGetUseFastUpdate() macro, but the situation with nbtree is actually
> quite different. There are two questions for nbtree when it comes to
> deduplication within an index: 1) Does the user want to use
> deduplication, because that will help performance?, and 2) Is it
> safe/possible to use deduplication at all?
I'll send another version with a dedup option soon.

> I think that we should probably stash this information (deduplication
> is both possible and safe) in the metapage. Maybe we can copy it over
> to our insertion scankey, just like the "heapkeyspace" field -- that
> information also comes from the metapage (it's based on the nbtree
> version). The "heapkeyspace" field is a bit ugly, so maybe we
> shouldn't go further by adding something similar, but I don't see any
> great alternative right now.
>
Why is it necessary to store this information anywhere other than
rel->rd_options, when we can easily access that field from
_bt_findinsertloc() and _bt_load()?
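
To make the question concrete, here is a minimal standalone sketch of the
rd_options approach, modelled loosely on the GinGetUseFastUpdate() pattern
mentioned above. It is not code from the patch: MockRelation, MockBTOptions,
mock_bt_use_dedup() and the default value are all hypothetical names chosen
for illustration, and only the rd_options field is modelled. It covers only
the user's on/off preference; whether deduplication is safe for a given index
is a separate question that this sketch does not address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical default, used when the index has no storage parameters set. */
#define MOCK_BT_DEFAULT_USE_DEDUP true

/* Hypothetical stand-in for the parsed storage parameters of an nbtree index. */
typedef struct MockBTOptions
{
	bool		deduplication;	/* the user's on/off choice */
} MockBTOptions;

/* Hypothetical stand-in for a relation descriptor; only rd_options is modelled. */
typedef struct MockRelation
{
	void	   *rd_options;		/* NULL when no storage parameters were given */
} MockRelation;

/*
 * Return the user's deduplication preference, falling back to the default
 * when rd_options is NULL. A caller on the insertion or index-build path
 * could consult this before attempting deduplication.
 */
static bool
mock_bt_use_dedup(const MockRelation *rel)
{
	assert(rel != NULL);

	if (rel->rd_options == NULL)
		return MOCK_BT_DEFAULT_USE_DEDUP;

	return ((const MockBTOptions *) rel->rd_options)->deduplication;
}
```

The point of the sketch is only that the lookup is a cheap pointer check plus
a field read, so nothing needs to be copied into the insertion scankey for the
user-preference half of the decision.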

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

