Re: [PoC] Improve dead tuple storage for lazy vacuum

From John Naylor
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Msg-id CAFBsxsF2e-e_m7CTouaGP6fBb2t726okhzq0kjC1+M3egujisw@mail.gmail.com
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers

On Mon, Jan 16, 2023 at 3:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 16, 2023 at 2:02 PM John Naylor
> <john.naylor@enterprisedb.com> wrote:

> > + * Add Tids on a block to TidStore. The caller must ensure the offset numbers
> > + * in 'offsets' are ordered in ascending order.
> >
> > Must? What happens otherwise?
>
> It ends up missing TIDs by overwriting the same key with different
> values. Is it better to have a bool argument, say need_sort, to sort
> the given array if the caller wants?

Now that I've studied it some more, I see what's happening: We need all bits set in the "value" before we insert it, since it would be too expensive to retrieve the current value, add one bit, and put it back. Also, as a consequence of the encoding, part of the TID is in the key, and part is in the value. It makes more sense now, but it needs more than zero comments.
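
To make the above concrete, here is a minimal sketch of that kind of encoding. It is not the code from the patch: the bit widths (OFFSET_BITS, VALUE_BITS), the kv_pair struct, and the function names are all hypothetical, chosen only to show how part of a TID lands in the key, part in a 64-bit bitmap value, and how each value is fully built up before it is emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, for illustration only; the patch may use different widths. */
#define OFFSET_BITS 9	/* low bits of the combined TID hold the offset */
#define VALUE_BITS	6	/* 2^6 = 64 bits fit in one uint64 value */

/* Hypothetical key/value pair as it would be handed to the store. */
typedef struct
{
	uint64_t	key;
	uint64_t	value;
} kv_pair;

/* Combine block number and offset number into one integer. */
static inline uint64_t
tid_to_combined(uint32_t block, uint16_t offset)
{
	assert(offset < (1u << OFFSET_BITS));
	return ((uint64_t) block << OFFSET_BITS) | offset;
}

/* The upper part of the combined TID is the key ... */
static inline uint64_t
combined_key(uint64_t combined)
{
	return combined >> VALUE_BITS;
}

/* ... and the low VALUE_BITS select one bit within the value. */
static inline uint64_t
combined_bit(uint64_t combined)
{
	return UINT64_C(1) << (combined & ((UINT64_C(1) << VALUE_BITS) - 1));
}

/*
 * Encode the offsets of one block into key/value pairs. A new pair is
 * started whenever the key differs from the previous one, so each value
 * is complete only if the offsets arrive in ascending order. 'out' must
 * have room for 'noffsets' pairs. Returns the number of pairs written.
 */
static size_t
encode_block(uint32_t block, const uint16_t *offsets, size_t noffsets,
			 kv_pair *out)
{
	size_t		nout = 0;

	for (size_t i = 0; i < noffsets; i++)
	{
		uint64_t	combined = tid_to_combined(block, offsets[i]);
		uint64_t	key = combined_key(combined);

		if (nout == 0 || out[nout - 1].key != key)
		{
			out[nout].key = key;
			out[nout].value = 0;
			nout++;
		}
		out[nout - 1].value |= combined_bit(combined);
	}
	return nout;
}
```

Under these assumed widths, offsets {1, 2, 70} on block 0 give two pairs (key 0 with bits 1 and 2 set, key 1 with bit 6 set). The unordered input {1, 70, 2} gives three pairs with key 0 appearing twice, so a store that overwrites on a duplicate key would keep only the bit for offset 2 and lose offset 1.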

As for the order, I don't think it's the responsibility of the caller to guess if it needs sorting -- if unordered offsets lead to data loss, this function needs to take care of it.
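
As a rough sketch of what "take care of it" could look like, the function could check the order itself and sort a private copy only when needed. Everything here is hypothetical (the helper names, the caller-supplied scratch buffer, and the use of plain qsort are assumptions, not something from the patch); the check is one linear pass, so input that is already ordered pays very little.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for offset numbers, ascending. */
static int
cmp_offset(const void *a, const void *b)
{
	uint16_t	x = *(const uint16_t *) a;
	uint16_t	y = *(const uint16_t *) b;

	return (x > y) - (x < y);
}

/* One linear pass: true if the offsets are in non-descending order. */
static bool
offsets_are_sorted(const uint16_t *offsets, size_t noffsets)
{
	for (size_t i = 1; i < noffsets; i++)
	{
		if (offsets[i - 1] > offsets[i])
			return false;
	}
	return true;
}

/*
 * Return a pointer to the offsets in ascending order. If the input is
 * already ordered it is returned as-is; otherwise it is copied into
 * 'scratch' (which must hold 'noffsets' entries) and sorted there, so
 * the caller's array is never modified.
 */
static const uint16_t *
ensure_sorted_offsets(const uint16_t *offsets, size_t noffsets,
					  uint16_t *scratch)
{
	if (offsets_are_sorted(offsets, noffsets))
		return offsets;

	assert(scratch != NULL);
	memcpy(scratch, offsets, noffsets * sizeof(uint16_t));
	qsort(scratch, noffsets, sizeof(uint16_t), cmp_offset);
	return scratch;
}
```

The insertion loop would then iterate over the returned pointer instead of the caller's array, and the "must be ordered" requirement in the header comment could go away.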

> > + uint64 last_key = PG_UINT64_MAX;
> >
> > I'm having some difficulty understanding this sentinel and how it's used.
>
> Will improve the logic.

Part of the problem is the English language: "last" can mean "previous" or "at the end", so maybe some name changes would help.

--
John Naylor
EDB: http://www.enterprisedb.com
