Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Дата
Msg-id CAH2-WzmDYpRVySk08sTNp-4V6CX-bmAicnF5wEuuLu-b5XAtsQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Ответы Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Список pgsql-hackers
On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> Now the algorithm is the following:
>
> - If bt_findinsertloc() found out that tuple belongs to existing posting tuple's
> TID interval, it sets 'in_posting_offset' variable and passes it to
> _bt_insertonpg()
>
> - If 'in_posting_offset' is valid and origtup is valid,
> merge our itup into origtup.
>
> It can result in one tuple neworigtup, that must replace origtup; or two tuples:
> neworigtup and newrighttup, if the result exceeds BTMaxItemSize,

That sounds like the right way to do it.

> - If two new tuple(s) fit into the old page, we're lucky.
> call _bt_delete_and_insert(..., neworigtup, newrighttup, newitemoff) to
> atomically replace oldtup with new tuple(s) and generate xlog record.
>
> - In case page split is needed, pass both tuples to _bt_split().
>  _bt_findsplitloc() is now aware of upcoming replacement of origtup with
> neworigtup, so it uses correct item size where needed.

That makes sense, since _bt_split() is responsible for both splitting
the page, and inserting the new item on either the left or right page,
as part of the first phase of a page split. In other words, if you're
adding something new to _bt_insertonpg(), you probably also need to
add something new to _bt_split(). So that's what you did.

> It seems that now all replace operations are crash-safe. The new patch passes
> all regression tests, so I think it's ready for review again.

I'm looking at it now. I'm going to spend a significant amount of time
on this tomorrow.

I think that we should start to think about efficient WAL-logging now.

> In the meantime, I'll run more stress-tests.

As you probably realize, wal_consistency_checking is a good thing to
use with your tests here.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Следующее
От: Michael Paquier
Дата:
Сообщение: Plug-in common/logging.h with vacuumlo and oid2name