Thread: Modernize error message for malformed B-Tree tuple posting

Modernize error message for malformed B-Tree tuple posting

From
Kirill Reshke
Date:
Hi!

Today, while routinely fixing corruptions reported by our monitoring,
I observed a message:

2026-02-15 01:38:33.060
MSK,"<cut>","<cut>",1725745,"localhost:51330",6990f425.1a5531,1,"UPDATE",2026-02-15
01:16:05 MSK,59/1760774,183431134,ERROR,XX000,"posting list tuple with
3 items cannot be split at offset 20",,,,,,"UPDATE .... <there was
origin query,cut>,,,"","client backend",,3082007398573165030


So, two things bother me here. First, this is XX000 and not XX002,
and I cannot find a good reason for that. Second, the index name is
not present in the error message.
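
For context, class XX is PostgreSQL's "Internal Error" class: XX000 is
internal_error, XX001 is data_corrupted and XX002 is index_corrupted.
Below is a minimal sketch of why the distinction matters to monitoring,
assuming an alert rule keyed on SQLSTATE. The helper
sqlstate_is_corruption is hypothetical and not part of PostgreSQL:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Class XX SQLSTATE values as listed in the PostgreSQL error codes
 * appendix. The macro names and the helper below are illustrative only.
 */
#define SQLSTATE_INTERNAL_ERROR  "XX000"
#define SQLSTATE_DATA_CORRUPTED  "XX001"
#define SQLSTATE_INDEX_CORRUPTED "XX002"

/* Return true when a five-character SQLSTATE names a corruption condition. */
bool
sqlstate_is_corruption(const char *sqlstate)
{
    if (sqlstate == NULL || strlen(sqlstate) != 5)
        return false;

    return strcmp(sqlstate, SQLSTATE_DATA_CORRUPTED) == 0 ||
           strcmp(sqlstate, SQLSTATE_INDEX_CORRUPTED) == 0;
}
```

With such a rule, a message logged as XX000 would not be flagged as
corruption, while the same message logged as XX002 would.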

Other corruption-related error messages look different, for example:

ERROR,XX002,"table tid from new index tuple (58084,119) overlaps with
invalid duplicate tuple at offset 62 of block 181 in index ""<cut
name>""",,,,,,"UPDATE <cut query>

So, v1 changes the errcode to ERRCODE_INDEX_CORRUPTED. To include the
index name, we would need to pass the relation to the _bt_swap_posting
function, which I did not do in v1, because I'm not sure the change is
worth it.
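
To make the trade-off concrete, here is a standalone model of the
sanity check and of the message with and without an index name. It is
a sketch, not PostgreSQL code: both function names are hypothetical,
the validity rule (the offset must fall strictly inside the posting
list) is an assumption about what the check does, and the
"in index" suffix is only one possible wording.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed rule: a split point must fall strictly inside the posting list. */
bool
posting_split_offset_is_valid(int nhtids, int postingoff)
{
    return postingoff > 0 && postingoff < nhtids;
}

/*
 * Format the error text into buf. Pass index_name = NULL to model the v1
 * situation, where the function has no relation and cannot name the index.
 * Returns whatever snprintf() returns.
 *
 * In the server, the equivalent would presumably be an ereport(ERROR, ...)
 * carrying errcode(ERRCODE_INDEX_CORRUPTED), the message, and a hint.
 */
int
format_posting_split_error(char *buf, size_t buflen, const char *index_name,
                           int nhtids, int postingoff)
{
    if (index_name == NULL)
        return snprintf(buf, buflen,
                        "posting list tuple with %d items cannot be split at offset %d",
                        nhtids, postingoff);

    return snprintf(buf, buflen,
                    "posting list tuple with %d items cannot be split at offset %d in index \"%s\"",
                    nhtids, postingoff, index_name);
}
```

For the logged case (3 items, offset 20) the check fails, and only the
second form tells the reader which index to REINDEX.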

WDYT?

-- 
Best regards,
Kirill Reshke

Attachments

Re: Modernize error message for malformed B-Tree tuple posting

From
Chao Li
Date:

> On Feb 18, 2026, at 15:58, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> [...]
> <v1-0001-Modernize-error-message-for-malformed-B-Tree-tupl.patch>

Hi Kirill,

Thanks for the patch. It makes sense to me to replace elog with the more modern ereport() and to add a REINDEX hint.

One small thing: the hint message "Please REINDEX it" feels a bit ambiguous to me. I initially thought it would be
better to include the index name in the hint, but then I noticed that _bt_swap_posting() doesn’t have access to the
Relation, so it doesn’t know which index is affected.

If passing a Relation rel parameter down to this function isn’t acceptable, then maybe we could at least rephrase the
hint to something like: “You might need to REINDEX the index.”, which sounds a bit clearer and avoids the vague “it”.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Modernize error message for malformed B-Tree tuple posting

From
Andrey Borodin
Date:

> On 18 Feb 2026, at 12:58, Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> So, v1 changes the errcode to ERRCODE_INDEX_CORRUPTED.

This totally makes sense to me.
This seems to be the only such case in nbtdedup.c; the other elog() calls there
guard PageAddItem()'s can't-happen case. As for others:

elog(ERROR, "fell off the end of index \"%s\"",
elog(ERROR, "no live root page found in index \"%s\"",
elog(ERROR, "root page %u of index \"%s\" has level %u, expected %u",

are of the same kind, but I do not remember seeing them in production.

> For including
> index name, we need
> to pass relation to _bt_swap_posting function, which I did not do in
> v1, because I'm not sure changing is worth it.

When it's the only message in the log, the index name would be valuable.
Still, if passing it down is overly complex, the user (we) will probably be able
to deduce the corrupted index from the query...


Best regards, Andrey Borodin.


Re: Modernize error message for malformed B-Tree tuple posting

From
Andrey Borodin
Date:

> On 27 Feb 2026, at 17:31, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> 
> elog(ERROR, "fell off the end of index \"%s\"",
> elog(ERROR, "no live root page found in index \"%s\"",
> elog(ERROR, "root page %u of index \"%s\" has level %u, expected %u",
> 
> are of the same kind, but I do not remember seeing them on production.

I think commits 8ec97e78 and fd6ec93 established that we use error codes
when an error is potentially reachable. So far we have no evidence that
these cases are, so there is no need to change them.


Best regards, Andrey Borodin.