Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade
Дата
Msg-id 5cd74597-f555-d4e1-4dc2-d399180bb8cc@enterprisedb.com
обсуждение исходный текст
Ответ на Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-bugs
On 2/24/22 16:18, Tomas Vondra wrote:
> On 2/23/22 16:41, Tomas Vondra wrote:
>> On 2/23/22 16:36, Tomas Vondra wrote:
>>>>
>>>> I got approval to send a table with its data and index in the subject
>>>> datafile.
>>>> Taken from the v14 database.
>>>>
>>>
>>> Thanks for the data! I've been able to reproduce the issue (load into
>>> 12, pg_upgrade to 14 and run the query). After bisecting this for a
>>> while, this seems like a bug in PG13 commit
>>>
>>> commit 911e70207703799605f5a0e8aad9f06cff067c63 (HEAD)
>>> Author: Alexander Korotkov <akorotkov@postgresql.org>
>>> Date:   Mon Mar 30 19:17:11 2020 +0300
>>>
>>>     Implement operator class parameters
>>>
>>>     ...
>>>
>>> It works fine when upgrading to an earlier build, and crashes since this
>>> commit. I haven't investigated this further, but I guess there's some
>>> thinko in gist_ltree_ops, affecting how we interpret existing indexes.
>>>
>>> Alexander, any ideas?
>>>
>>
>> Sorry, I accidentally used Alexander's old address, so let me CC him
>> with the correct/current one.
>>
> 
> I tried investigating this a bit further, but my gist-foo is rather
> limited so I haven't made much progress so far. But as Victor already
> mentioned, REINDEX fixes this - so I tried comparing the broken index
> with a new one built on PG13.
> 
> And the it seems the indexes are mostly the same, with the exception of
> LSN in the page header. The only page that substantially differs is the
> first page, which is also storing less data (the broken version):
> 
> test=# SELECT * FROM
> page_header(get_raw_page('v3_region_copy_ltree_path_idx', 0));
>     lsn    | checksum | flags | lower | upper | special | pagesize | ...
> ----------+----------+-------+-------+-------+---------+----------+- ...
>  0/1CD80C8 |        0 |     0 |    40 |  7856 |    8176 |     8192 | ...
>   (1 row)
> 
> test=# SELECT * FROM
> page_header(get_raw_page('v3_region_copy_13_ltree_path_idx', 0));
>     lsn    | checksum | flags | lower | upper | special | pagesize | ...
> -----------+----------+-------+-------+-------+---------+----------+-...
>  0/7029760 |        0 |     0 |    40 |  7760 |    8176 |     8192 | ...
> (1 row)
> 

It took me a bit of digging and debugging, but I think I got it. The
problem is that to determine siglen value, some of the functions use

  LTREE_GET_ASIGLEN

macro, which inspects the opclass options and returns the value. But if
there are no explicit options for the particular index (as happens for
indexes created in older versions), it uses LTREE_ASIGLEN_DEFAULT, which
is 28 bytes. But that's wrong - this value for ltree[] gist opclass, not
for plain ltree - that uses just 8B. So we're off by 20B, which messes
all kinds of stuff, of course.

My guess this is a simple mistake, because the macro is in section
labeled as

  /* GiST support for ltree[] */

and the "A" in the name probably refers to "array" too.

Attached is a patch that fixes this, but I wonder what will happen to
new indexes created on 13+. Will check.

Victor, can you try with this patch applied?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: BUG #17406: Segmentation fault on GiST index after 14.2 upgrade